Tab: template-aware bug report title generation via two-phase fine-tuned models

Liu, Xiao; Xu, Yinkang; Sun, Weifeng; Huang, Naiqi; Sun, Song; Li, Qiang; Yang, Dan; Yan, Meng

doi:10.1007/s10515-025-00505-9

Published: 22 March 2025

Automated Software Engineering, Volume 32, article number 32 (2025)

Abstract

Bug reports play a critical role in the software development lifecycle by helping developers identify and resolve defects efficiently. However, the quality of bug report titles, particularly in open-source communities, varies significantly, which complicates bug triage and resolution. Existing approaches, such as iTAPE, treat title generation as a one-sentence summarization task using sequence-to-sequence models. While these methods show promise, they face two major limitations: (1) they do not consider the distinct components of bug reports, treating the entire report as a homogeneous input, and (2) they struggle to handle the variability between template-based and non-template-based reports, often resulting in suboptimal titles. To address these limitations, we propose TAB, a hybrid framework that combines a Document Component Analyzer (DCA) based on a pre-trained BERT model and a Title Generation Model based on CodeT5. TAB addresses the first limitation by segmenting bug reports into four components (Description, Reproduction, Expected Behavior, and Others) to ensure better alignment between input and output. For the second limitation, TAB takes a different path for each report type: for template-based reports, titles are generated directly, while for non-template-based reports, the DCA extracts key components to improve title relevance and clarity. We evaluate TAB on both template-based and non-template-based bug reports, demonstrating that it significantly outperforms existing methods. Specifically, TAB achieves average improvements of 170.4–389.5% in METEOR, 67.8–190.0% in ROUGE-L, and 65.7–124.5% in chrF(AF) compared to baseline approaches on template-based reports. Additionally, on non-template-based reports, TAB shows an average improvement of 64% in METEOR, 3.6% in ROUGE-L, and 14.8% in chrF(AF) over the state of the art. These results confirm the robustness of TAB in generating high-quality titles across diverse bug report formats.
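The two-path routing described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `BugReport`, `keyword_classifier`, `segment`, and `build_generator_input` are hypothetical names, the keyword rules merely stand in for the BERT-based DCA, and dropping the Others component for non-template reports is an assumption about what "key components" means. The sketch only prepares the text that a title generator (CodeT5 in the paper) would receive.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# The four components named in the abstract.
COMPONENTS = ("Description", "Reproduction", "Expected Behavior", "Others")


@dataclass
class BugReport:
    body: str
    uses_template: bool  # assumed to be known in advance for each report


def keyword_classifier(sentence: str) -> str:
    """Hypothetical stand-in for the BERT-based Document Component Analyzer."""
    lowered = sentence.lower()
    if "expected" in lowered or "should" in lowered:
        return "Expected Behavior"
    if "step" in lowered or "reproduce" in lowered:
        return "Reproduction"
    if "crash" in lowered or "error" in lowered or "fail" in lowered:
        return "Description"
    return "Others"


def segment(body: str, classify: Callable[[str], str]) -> Dict[str, List[str]]:
    """Assign each non-empty line of the report to one of the four components."""
    parts: Dict[str, List[str]] = {name: [] for name in COMPONENTS}
    for line in body.splitlines():
        line = line.strip()
        if line:
            parts[classify(line)].append(line)
    return parts


def build_generator_input(
    report: BugReport,
    classify: Callable[[str], str] = keyword_classifier,
) -> str:
    """Return the text that would be passed to the title generation model."""
    if report.uses_template:
        # Template-based path: the report is used directly.
        return report.body
    # Non-template path: keep labeled components, dropping "Others" (assumption).
    parts = segment(report.body, classify)
    kept = [
        f"{name}: {' '.join(parts[name])}"
        for name in COMPONENTS
        if name != "Others" and parts[name]
    ]
    return "\n".join(kept)
```

In practice the classifier argument would be replaced by a fine-tuned sentence classifier, and the returned string would be tokenized and fed to the sequence-to-sequence model.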

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (No. 62372071), the Scientific and Technological Research Program of Chongqing Municipal Education Commission (No. KJQN202300547), the Chongqing Municipal Construction Science and Technology Plan Project (Chengke Zi 2024 No. 8-7), the State Key Laboratory of Intelligent Vehicle Safety Technology (No. IVSTSKL-202412) and the Natural Science Foundation of Chongqing (No. CSTB2023NSCQ-MSX0914).

Author information

Xiao Liu and Yinkang Xu have contributed equally to this work.

Authors and Affiliations

School of Big Data and Software Engineering, Chongqing University, Chongqing, China
Xiao Liu, Yinkang Xu, Weifeng Sun, Naiqi Huang, Qiang Li & Meng Yan
Chongqing Normal University, Chongqing, China
Song Sun
Southwest Jiaotong University, Chengdu, China
Dan Yang

Corresponding authors

Correspondence to Weifeng Sun or Meng Yan.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liu, X., Xu, Y., Sun, W. et al. Tab: template-aware bug report title generation via two-phase fine-tuned models. Autom Softw Eng 32, 32 (2025). https://doi.org/10.1007/s10515-025-00505-9

Received: 10 December 2024
Accepted: 02 March 2025
Published: 22 March 2025
DOI: https://doi.org/10.1007/s10515-025-00505-9

Part of a collection:

Call for papers, Foundation Models for Software Engineering
