Abstract
Bug reports play a critical role in the software development lifecycle by helping developers identify and resolve defects efficiently. However, the quality of bug report titles, particularly in open-source communities, varies significantly, which complicates bug triage and resolution. Existing approaches, such as iTAPE, treat title generation as a one-sentence summarization task using sequence-to-sequence models. While these methods show promise, they face two major limitations: (1) they do not consider the distinct components of bug reports, treating the entire report as a homogeneous input, and (2) they struggle to handle the variability between template-based and non-template-based reports, often resulting in suboptimal titles. To address these limitations, we propose TAB, a hybrid framework that combines a Document Component Analyzer (DCA) based on a pre-trained BERT model with a Title Generation Model based on CodeT5. TAB addresses the first limitation by segmenting bug reports into four components (Description, Reproduction, Expected Behavior, and Others) to ensure better alignment between input and output. For the second limitation, TAB handles the two report types differently: for template-based reports, titles are generated directly from the report, while for non-template-based reports, the DCA first extracts key components to improve title relevance and clarity. We evaluate TAB on both template-based and non-template-based bug reports and find that it significantly outperforms existing methods. Specifically, on template-based reports, TAB achieves average improvements of 170.4–389.5% in METEOR, 67.8–190.0% in ROUGE-L, and 65.7–124.5% in chrF(AF) over baseline approaches. On non-template-based reports, TAB shows average improvements of 64% in METEOR, 3.6% in ROUGE-L, and 14.8% in chrF(AF) over the state of the art. These results confirm the robustness of TAB in generating high-quality titles across diverse bug report formats.
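The two-branch handling described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the sentence-splitting rule, the choice to drop the Others component, and the toy keyword classifier (a stand-in for the BERT-based DCA) are all assumptions made for illustration. The returned string is what would then be passed to a title generation model such as CodeT5.

```python
import re
from typing import Callable, Dict, List, Sequence

# The four component labels named in the abstract.
COMPONENTS = ("Description", "Reproduction", "Expected Behavior", "Others")


def group_by_component(body: str, classify: Callable[[str], str]) -> Dict[str, List[str]]:
    """Group the sentences of a report by predicted component label."""
    grouped: Dict[str, List[str]] = {name: [] for name in COMPONENTS}
    # Assumption: naive splitting on sentence-ending punctuation and newlines.
    for sentence in re.split(r"(?<=[.!?])\s+|\n+", body):
        sentence = sentence.strip()
        if not sentence:
            continue
        label = classify(sentence)
        if label not in grouped:  # unknown labels fall back to Others
            label = "Others"
        grouped[label].append(sentence)
    return grouped


def prepare_generator_input(
    body: str,
    is_template: bool,
    classify: Callable[[str], str],
    keep: Sequence[str] = ("Description", "Reproduction", "Expected Behavior"),
) -> str:
    """Build the text handed to the title generator.

    Template-based reports are passed through unchanged; non-template
    reports are reduced to the components listed in `keep` (an assumption
    about which components count as "key").
    """
    if is_template:
        return body
    grouped = group_by_component(body, classify)
    parts = [f"{name}: {' '.join(grouped[name])}" for name in keep if grouped[name]]
    return "\n".join(parts)


def keyword_classifier(sentence: str) -> str:
    """Hypothetical toy classifier standing in for the BERT-based DCA."""
    s = sentence.lower()
    if "expected" in s or "should" in s:
        return "Expected Behavior"
    if "step" in s or "reproduce" in s or s.startswith("run "):
        return "Reproduction"
    if "crash" in s or "error" in s or "fail" in s:
        return "Description"
    return "Others"


if __name__ == "__main__":
    report = (
        "Thanks for the great library! The app crashes on startup. "
        "Run the app with --debug. It should start normally."
    )
    print(prepare_generator_input(report, False, keyword_classifier))
```

In this sketch the pleasantry at the start of the example report is labeled Others and left out of the generator input, which is the kind of input and output alignment the abstract attributes to component segmentation.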




Acknowledgements
This work was supported in part by the National Natural Science Foundation of China (No. 62372071), the Scientific and Technological Research Program of Chongqing Municipal Education Commission (No. KJQN202300547), the Chongqing Municipal Construction Science and Technology Plan Project (Chengke Zi 2024 No. 8-7), the State Key Laboratory of Intelligent Vehicle Safety Technology (No. IVSTSKL-202412) and the Natural Science Foundation of Chongqing (No. CSTB2023NSCQ-MSX0914).
About this article
Cite this article
Liu, X., Xu, Y., Sun, W. et al. Tab: template-aware bug report title generation via two-phase fine-tuned models. Autom Softw Eng 32, 32 (2025). https://doi.org/10.1007/s10515-025-00505-9
DOI: https://doi.org/10.1007/s10515-025-00505-9