How Useful is Code Change Information for Fault Localization in Continuous Integration?

Authors:

Tse-Hsun (Peter) Chen,

Junjie Chen

ASE '22: Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering

Article No.: 52, Pages 1 - 12

https://doi.org/10.1145/3551349.3556931

Published: 05 January 2023

Abstract

Continuous integration (CI) is the process in which code changes are automatically integrated, built, and tested in a shared repository. In CI, developers frequently merge and test code under development, which helps isolate faults with finer-grained change information. To identify faulty code, prior research has widely studied and evaluated the performance of spectrum-based fault localization (SBFL) techniques. Although the continuous nature of CI requires code changes to be atomic and provides fine-grained information on which part of the system is being changed, traditional SBFL techniques do not benefit from it. To overcome this limitation, we propose to integrate code and coverage change information into fault localization under CI settings. First, code changes show how faults are introduced into the system and give developers a better understanding of the root cause. Second, coverage changes show how code coverage is affected when faults are introduced. This change information can help limit the search space of code coverage, which offers more opportunities for improving fault localization techniques. Based on these observations, we propose three new change-based fault localization techniques and compare them with Ochiai, a commonly used SBFL technique. We evaluate these techniques on 192 real faults from seven software systems. Our results show that all three change-based techniques outperform Ochiai on the Defects4J dataset. In particular, the improvement ranges from 7% to 23% for average MAP and from 17% to 24% for average MRR. Moreover, we find that our change-based fault localization techniques can be integrated with Ochiai and boost its performance by up to 53% and 52% for average MAP and MRR, respectively.
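
For readers unfamiliar with the baseline and metrics named in the abstract, the following is a minimal sketch of the standard Ochiai suspiciousness score together with reciprocal rank and average precision (their means over all faults give MRR and MAP). The toy coverage data, the element names, and the `keep` filter that restricts the ranking to changed elements are assumptions made purely for illustration; the filter only hints at how change information can narrow the search space and is not one of the three techniques proposed in the paper.

```python
import math

def ochiai(coverage, outcomes):
    """Ochiai score per element: ef / sqrt(total_failed * (ef + ep)).

    coverage: test name -> set of covered elements
    outcomes: test name -> True if the test passed, False if it failed
    """
    total_failed = sum(1 for passed in outcomes.values() if not passed)
    elements = set().union(*coverage.values())
    scores = {}
    for e in elements:
        ef = sum(1 for t, cov in coverage.items() if e in cov and not outcomes[t])
        ep = sum(1 for t, cov in coverage.items() if e in cov and outcomes[t])
        denom = math.sqrt(total_failed * (ef + ep))
        scores[e] = ef / denom if denom else 0.0
    return scores

def rank(scores, keep=None):
    """Sort elements by descending score; `keep` is a hypothetical change filter."""
    candidates = [e for e in scores if keep is None or e in keep]
    return sorted(candidates, key=lambda e: (-scores[e], e))

def reciprocal_rank(ranking, faulty):
    """1 / rank of the first faulty element, or 0 if none is ranked."""
    for i, e in enumerate(ranking, start=1):
        if e in faulty:
            return 1.0 / i
    return 0.0

def average_precision(ranking, faulty):
    """Mean of precision@k taken at the rank of each faulty element."""
    hits, total = 0, 0.0
    for i, e in enumerate(ranking, start=1):
        if e in faulty:
            hits += 1
            total += hits / i
    return total / len(faulty) if faulty else 0.0

# Toy data (assumed): two passing tests, one failing test, four elements.
coverage = {"t1": {"a", "b", "c"}, "t2": {"a", "b"}, "t3": {"b", "c", "d"}}
outcomes = {"t1": True, "t2": True, "t3": False}
scores = ochiai(coverage, outcomes)

faulty = {"c"}         # assumed ground truth
changed = {"b", "c"}   # assumed elements touched by the latest commit

full_ranking = rank(scores)
changed_ranking = rank(scores, keep=changed)
print(full_ranking, reciprocal_rank(full_ranking, faulty))
print(changed_ranking, reciprocal_rank(changed_ranking, faulty))
```

In this toy case the faulty element moves up once unchanged elements are excluded, which is the intuition behind limiting the search space; a real evaluation would average these metrics over many faults.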

Index Terms

How Useful is Code Change Information for Fault Localization in Continuous Integration?
1. Social and professional topics
  1. Professional topics
    1. Management of computing and information systems
      1. Software management
        Software maintenance
2. Software and its engineering
  1. Software creation and management
    1. Software post-development issues
    2. Software verification and validation
      1. Software defect analysis
        Software testing and debugging
  2. Software notations and tools
    1. Software configuration management and version control systems

Index terms have been assigned to the content through auto-classification.

Published In

ASE '22: Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering

October 2022

2006 pages

ISBN:9781450394758

DOI:10.1145/3551349

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

National Natural Science Foundation of China

Conference

ASE '22: 37th IEEE/ACM International Conference on Automated Software Engineering

October 10 - 14, 2022

Rochester, MI, USA

Acceptance Rates

Overall Acceptance Rate 82 of 337 submissions, 24%

Article Metrics

Total Citations: 4
Total Downloads: 254
Downloads (Last 12 months): 86
Downloads (Last 6 weeks): 9

Reflects downloads up to 05 Mar 2025

Cited By

Zhang X, Song Y, Xie X, Xin Q, Xing C, Filkov V, Ray B, Zhou M. 2024. Do not neglect what's on your hands: localizing software faults with exception trigger stream. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, 982–994. Online publication date: 27-Oct-2024.
https://dl.acm.org/doi/10.1145/3691620.3695479

Rafi M, d'Amorim M. 2024. Enhancing Code Representation for Improved Graph Neural Network-Based Fault Localization. Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering, 686–688. Online publication date: 10-Jul-2024.
https://dl.acm.org/doi/10.1145/3663529.3664459

Rafi M, Kim D, Chen A, Chen T, Wang S. 2024. Towards Better Graph Neural Network-Based Fault Localization through Enhanced Code Representation. Proceedings of the ACM on Software Engineering 1:FSE, 1937–1959. Online publication date: 12-Jul-2024.
https://dl.acm.org/doi/10.1145/3660793

Wang B, Wei J, Chen M, Chen C, Lin Y, Zhang J. 2024. A Systematic Exploration of Mutation‐Based Fault Localization Formulae. Software Testing, Verification and Reliability. Online publication date: 11-Nov-2024.
https://doi.org/10.1002/stvr.1905
