
Code-line-level Bugginess Identification: How Far have We Come, and How Far have We Yet to Go?

Published: 27 May 2023

Abstract

Background. Code-line-level bugginess identification (CLBI) is a vital technique that helps developers identify buggy lines without expending a large amount of human effort. Most existing studies mine characteristics of source code to train supervised prediction models, which have been reported to be able to distinguish buggy code lines from other lines in a target program.

Problem. However, several simple and intuitive code characteristics, such as the complexity of code lines, have been disregarded in the current literature. Such characteristics can be acquired easily and applied in an unsupervised way to conduct more accurate CLBI, which can also reduce the application cost of existing CLBI approaches by a large margin.

Objective. We aim to investigate the status quo in the field of CLBI from two perspectives: (1) how far we have really come in the literature, and (2) how far we have yet to go in industry, by analyzing the performance of state-of-the-art (SOTA) CLBI approaches and tools, respectively.

Method. We propose a simple heuristic baseline solution, GLANCE (aiminG at controL- ANd ComplEx-statements), with three implementations (i.e., GLANCE-MD, GLANCE-EA, and GLANCE-LR). GLANCE is a two-stage CLBI framework: first, it uses a simple model to predict potentially defective files; second, it leverages simple code characteristics to identify buggy code lines within the predicted defective files. We use GLANCE as the baseline to investigate the effectiveness of SOTA CLBI approaches, including natural language processing (NLP) based and model interpretation technique (MIT) based approaches, as well as popular static analysis tools (SATs).
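To make the second stage concrete, the sketch below is a minimal illustration under our own assumptions, not the authors' released implementation: the control-statement keyword list, the token-count proxy for line complexity, and the weighting inside the hypothetical helpers score_line and rank_lines are all illustrative choices. It simply ranks the lines of a file that the first stage has already flagged as potentially defective, putting control statements first and longer (more complex) statements before shorter ones.

```python
import re

# Illustrative list of control-statement keywords (an assumption, not the paper's exact set).
CONTROL_KEYWORDS = re.compile(r"\b(if|else|for|while|do|switch|case|catch|try|return)\b")


def score_line(line: str) -> int:
    """Heuristic suspiciousness score: control statements first, then longer statements."""
    tokens = re.findall(r"[A-Za-z_]\w*|\S", line)   # crude tokenization as a complexity proxy
    complexity = len(tokens)
    is_control = 1 if CONTROL_KEYWORDS.search(line) else 0
    return is_control * 1000 + complexity           # weighting is an arbitrary illustrative choice


def rank_lines(file_lines):
    """Return (line_number, line) pairs from most to least suspicious."""
    indexed = list(enumerate(file_lines, start=1))
    return sorted(indexed, key=lambda pair: score_line(pair[1]), reverse=True)


if __name__ == "__main__":
    example = [
        "int total = 0;",
        "if (items != null && items.size() > 0 && total < limit) {",
        "    total += items.get(0).value;",
        "}",
    ]
    for lineno, text in rank_lines(example):
        print(lineno, text)
```

In this sketch the inspection budget would then be spent on the top-ranked lines of each predicted-defective file, mirroring how an unsupervised, characteristic-based heuristic can be applied without any training data.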

Result. Based on 19 open-source projects with 142 different releases, the experimental results show that the GLANCE framework achieves prediction performance comparable or even superior to the existing SOTA CLBI approaches and tools in terms of eight different performance indicators.

Conclusion. The results caution us that, if identification performance is the goal, real progress in CLBI has not been achieved to the extent envisaged in the literature, and there is still a long way to go to truly improve the effectiveness of static analysis tools in industry. In addition, we suggest using GLANCE as a baseline in future studies to demonstrate the usefulness of any newly proposed CLBI approach.



• Published in

  ACM Transactions on Software Engineering and Methodology, Volume 32, Issue 4 (July 2023), 938 pages
  ISSN: 1049-331X
  EISSN: 1557-7392
  DOI: 10.1145/3599692
  Editor: Mauro Pezzè


      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 27 May 2023
      • Online AM: 1 February 2023
      • Accepted: 14 December 2022
      • Revised: 8 November 2022
      • Received: 6 January 2022
Published in TOSEM Volume 32, Issue 4
