research-article

Retrieval from software libraries for bug localization: A comparative study of generic and composite text models - A Retrospective

Published: 21 July 2021

Abstract

This retrospective on our 2011 MSR publication starts with the research milieu that led to the work reported in our paper. We briefly review the competing ideas of a decade ago that could be applied to solving the problem of identifying the files in a software library related to a query. We were especially interested in finding out if the more complex text retrieval methods of that time would be effective in the software context. A surprising conclusion of our paper was that the reality was exactly the opposite: the more traditional, simpler methods outperformed the complex methods. In addition to this surprising result, our paper was also the first to report what was considered at that time a large-scale quantitative evaluation of the IR-based approaches to automatic bug localization. Over the years, such quantitative evaluations have become the norm. We believe that these contributions were largely responsible for the popularity of this paper in the research literature.



  • Published in

    ACM SIGSOFT Software Engineering Notes, Volume 46, Issue 3
    July 2021
    40 pages
    ISSN: 0163-5948
    DOI: 10.1145/3468744

    Copyright © 2021. Copyright is held by the owner/author(s).

    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery

    New York, NY, United States
