Research Paper | Computer Science and Engineering | Volume 15 Issue 1, January 2026 | Pages: 1321 - 1337 | India
HFFN for Java: A Hybrid Feature Fusion Network for Automated Code Summarization
Abstract:
Objectives: This study adapts the Hybrid Feature Fusion Network (HFFN) to Java source code summarization. The objective is a multi-view deep learning framework that explicitly integrates and dynamically weights lexical, syntactic, and semantic feature representations to generate accurate, contextually relevant natural language summaries, with the aim of establishing a new state of the art for Java.
Methods: A new JavaCodeSum corpus was curated, comprising 9,120 functionally validated Java methods sourced from open-source repositories and canonical textbooks. The proposed HFFN-Java architecture uses a dedicated encoder for each view: a multi-filter CNN for lexical token sequences, a Child-Sum Tree-LSTM for the syntactic structure of the Abstract Syntax Tree (AST), and a Graph Neural Network (GNN) operating on a combined Control Flow Graph (CFG) and Data Flow Graph (DFG) for semantic modeling. A hierarchical gated fusion mechanism combines these representations, which are then decoded by a transformer-based generator. The model was evaluated against nine established and pre-trained baselines, including CodeBERT and CodeT5, using ROUGE-L, BLEU-4, Exact Match (EM), BERTScore, and CodeBLEU.
Results: The HFFN-Java model was statistically significantly better (p < 0.001, two-tailed paired t-test) than all baseline models on every evaluation metric. It achieved a ROUGE-L score of 0.92, a BERTScore of 0.94, and a 48% relative improvement in Exact Match accuracy over the strongest pre-trained baseline. An ablation study quantified the complementary contribution of each feature modality; removing the syntactic features caused the largest performance degradation (-8.7% in ROUGE-L), which supports the hybrid design.
Conclusions: A deliberate, hierarchical fusion of complementary code representations (lexical, syntactic, and semantic) yields a substantial performance advantage for summarizing Java source code and outperforms generalized pre-training strategies. The HFFN-Java framework provides a robust, interpretable, and high-performing foundation for automated documentation tools.
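The abstract names a hierarchical gated fusion mechanism but does not give its equations. The following is only a minimal sketch of one common form of gated fusion, under stated assumptions: a sigmoid gate computed from the two concatenated inputs, a two-stage order (lexical with syntactic first, then the result with semantic), and random untrained weights. The names `GatedFusion` and `hierarchical_fuse` are hypothetical and are not taken from the paper.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class GatedFusion:
    """Element-wise gated fusion of two equal-length feature vectors.

    Assumed form (not from the paper):
        g = sigmoid(W [a; b] + bias),  fused = g * a + (1 - g) * b
    """

    def __init__(self, dim, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        # Untrained placeholder weights; a real model would learn these.
        self.weights = [
            [rng.uniform(-0.1, 0.1) for _ in range(2 * dim)]
            for _ in range(dim)
        ]
        self.bias = [0.0] * dim

    def gate(self, a, b):
        concat = list(a) + list(b)
        return [
            sigmoid(sum(w * x for w, x in zip(row, concat)) + bias)
            for row, bias in zip(self.weights, self.bias)
        ]

    def __call__(self, a, b):
        g = self.gate(a, b)
        # Each output component is a convex combination of the two inputs.
        return [gi * ai + (1.0 - gi) * bi for gi, ai, bi in zip(g, a, b)]


def hierarchical_fuse(lexical, syntactic, semantic, stage1, stage2):
    """Two-stage fusion; the stage order is an assumption for illustration."""
    structural = stage1(lexical, syntactic)
    return stage2(structural, semantic)


if __name__ == "__main__":
    # Toy 4-dimensional encoder outputs standing in for the CNN,
    # Tree-LSTM, and GNN representations.
    lexical = [1.0, 0.0, -1.0, 0.5]
    syntactic = [0.0, 2.0, 1.0, 0.5]
    semantic = [-1.0, 1.0, 0.0, 0.5]
    fused = hierarchical_fuse(
        lexical, syntactic, semantic,
        GatedFusion(4, seed=1), GatedFusion(4, seed=2),
    )
    print(fused)
```

Because every gate value lies strictly between 0 and 1, each fused component stays within the range spanned by the corresponding input components, which is what lets a trained gate weight one view more heavily than another per feature dimension.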
Keywords: code summarization, hybrid feature fusion, graph neural networks, transformers, software documentation
How to Cite?: Shruthi D, Chethan H K, Agughasi Victor Ikechukwu, "HFFN for Java: A Hybrid Feature Fusion Network for Automated Code Summarization", Volume 15 Issue 1, January 2026, International Journal of Science and Research (IJSR), Pages: 1321-1337, https://www.ijsr.net/getabstract.php?paperid=SR26120174145, DOI: https://dx.doi.org/10.21275/SR26120174145