Empirically revisiting and enhancing automatic classification of bug and non-bug issues

Li, Zhong; Pan, Minxue; Pei, Yu; Zhang, Tian; Wang, Linzhang; Li, Xuandong

doi:10.1007/s11704-023-2771-z

Research Article
Published: 23 December 2023

Volume 18, article number 185207, (2024)

Frontiers of Computer Science

Zhong Li^1,2,
Minxue Pan^1,3,
Yu Pei^4,
Tian Zhang^1,2,
Linzhang Wang^1,2 &
Xuandong Li^1,2

Abstract

A large body of research has been dedicated to automated issue classification for Issue Tracking Systems (ITSs). Although existing approaches have shown promising performance, their design choices, including the textual fields, feature representation methods, and machine learning (ML) algorithms they adopt, have not been comprehensively compared and analyzed. To fill this gap, we perform the first extensive study of automated issue classification, covering 9 state-of-the-art issue classification approaches. Our experimental results on a widely studied dataset yield several practical guidelines for automated issue classification: (1) training separate models for issue titles and descriptions and then combining the two models tends to achieve better performance; (2) word embedding with Long Short-Term Memory (LSTM) extracts features from the textual fields of issues more effectively and hence leads to better issue classification models; (3) certain terms in the textual fields help build classifiers that discriminate better between bug and non-bug issues; (4) the performance of an issue classification model is not sensitive to the choice of ML algorithm. Based on these findings, we further propose an advanced issue classification approach, DeepLabel, which achieves better performance than existing issue classification approaches.
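
Guideline (1) can be illustrated with a minimal sketch. The code below is not DeepLabel and does not reproduce any of the studied approaches: it assumes a toy naive Bayes classifier (`TinyNaiveBayes`, a hypothetical stand-in for any text classifier), four invented issues, and plain averaging of the two predicted probabilities as the combination rule. None of these choices come from the article.

```python
import math
from collections import Counter


class TinyNaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (illustrative stand-in)."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def predict_proba(self, text):
        """Return a {class: probability} mapping for a single text."""
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for c in self.classes:
            total_words = sum(self.word_counts[c].values())
            score = math.log(self.doc_counts[c] / total_docs)
            for word in text.lower().split():
                if word in self.vocab:  # words never seen in training are skipped
                    smoothed = (self.word_counts[c][word] + 1) / (total_words + len(self.vocab))
                    score += math.log(smoothed)
            log_scores[c] = score
        # Normalise the log scores into probabilities.
        top = max(log_scores.values())
        exp_scores = {c: math.exp(s - top) for c, s in log_scores.items()}
        norm = sum(exp_scores.values())
        return {c: v / norm for c, v in exp_scores.items()}


def combined_bug_probability(title_model, desc_model, title, description):
    """Average the 'bug' probabilities of the two field-specific models.

    Averaging is an assumed combination rule chosen for simplicity.
    """
    p_title = title_model.predict_proba(title)["bug"]
    p_desc = desc_model.predict_proba(description)["bug"]
    return (p_title + p_desc) / 2


# Hypothetical toy issues: (title, description, label).
issues = [
    ("Crash on startup",
     "Application throws NullPointerException when config file is missing", "bug"),
    ("Wrong result in parser",
     "The parser returns an incorrect value and fails with an error", "bug"),
    ("Add dark mode",
     "It would be nice to support a dark theme in the settings", "non-bug"),
    ("Improve documentation",
     "Please add more examples to the user guide", "non-bug"),
]
titles = [t for t, _, _ in issues]
descriptions = [d for _, d, _ in issues]
labels = [y for _, _, y in issues]

# One model per textual field, trained independently.
title_model = TinyNaiveBayes().fit(titles, labels)
desc_model = TinyNaiveBayes().fit(descriptions, labels)

p_bug = combined_bug_probability(
    title_model, desc_model,
    "Crash in parser", "fails with an error when file is missing",
)
print(f"combined bug probability: {p_bug:.3f}")
```

Keeping the two models separate lets each one learn the vocabulary of its own field, since titles are short and keyword-heavy while descriptions are longer and noisier. Other combination rules, such as weighted averaging or a second-stage classifier, would fit the same structure.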

Acknowledgements

We thank the anonymous reviewers for their valuable feedback. This research was supported by the National Natural Science Foundation of China (Grant No. 61972193), and the Program B for Outstanding PhD Candidate of Nanjing University.

Author information

Authors and Affiliations

State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, 210023, China
Zhong Li, Minxue Pan, Tian Zhang, Linzhang Wang & Xuandong Li
Department of Computer Science and Technology, Nanjing University, Nanjing, 210023, China
Zhong Li, Tian Zhang, Linzhang Wang & Xuandong Li
Software Institute, Nanjing University, Nanjing, 210093, China
Minxue Pan
Department of Computing, The Hong Kong Polytechnic University, Hong Kong, China
Yu Pei

Corresponding author

Correspondence to Minxue Pan.

Ethics declarations

Competing interests: The authors declare that they have no competing interests or financial conflicts to disclose.

Additional information

Zhong Li holds a BS in computer science and technology from Nanjing University of Posts and Telecommunications, China. He is currently a PhD student in the Department of Computer Science and Technology, Nanjing University, China. His main research interests lie in intelligent software engineering.

Minxue Pan received the BS and PhD degrees in computer science and technology from Nanjing University, China. He is an associate professor with the State Key Laboratory for Novel Software Technology and the Software Institute of Nanjing University, China. His research interests include software modeling and verification, software analysis and testing, cyber-physical systems, mobile computing, and intelligent software engineering.

Yu Pei is an assistant professor with the Department of Computing, The Hong Kong Polytechnic University, China. His main research interests include automated program repair, software fault localization, and automated software testing.

Tian Zhang received the PhD degree from Nanjing University, China. He is a professor with Nanjing University, China. His research interests include model driven aspects of software engineering, with the aim of facilitating the rapid and reliable development and maintenance of both large and small software systems.

Linzhang Wang received the PhD degree from Nanjing University, China in 2005. He is currently a full professor with the State Key Laboratory for Novel Software Technology at Nanjing University, China. His research interests include software engineering, software testing, and software security.

Xuandong Li received the BS, MS and PhD degrees from Nanjing University, China in 1985, 1991 and 1994, respectively. He is a full professor in the Department of Computer Science and Technology, Nanjing University, China. His research interests include formal support for the design and analysis of reactive, distributed, real-time, hybrid, and cyber-physical systems, and software testing and verification.

Electronic supplementary material