Fault localization using disparities of dynamic invariants
Introduction
Dynamic invariants are relations among variables that are observed to hold at certain locations in some runs of a program (Nguyen et al., 2012). They can be inserted as assertion statements to detect abnormal program behavior, collected to generate likely documentation and formal specifications, and used to support program understanding (Zeller, 2009). In particular, violations of dynamic invariants offer useful clues for fault localization and help explain the localization result. However, current applications of dynamic invariants in automated program debugging are inefficient because detecting dynamic invariants is computationally expensive. Daikon (Ernst et al., 2001) maintains an invariant pattern library, which limits its ability to detect invariants with rich expressive power. GenInv (Nguyen et al., 2012) combines mathematical techniques to find more expressive dynamic invariants, at the cost of more complex detection. Diduce (Hangal and Lam, 2002) and ClearView (Perkins et al., 2009) can only detect and patch specific types of errors because of the monitoring mechanisms they require. More recently, Sahoo et al. (2013) combine dynamic program invariants with more sophisticated filtering techniques to identify a set of dynamic invariants that hold in selected passed runs but not in failing tests, and return the locations of those invariants as the localization result.
In this paper, we present a novel automated fault localization approach, named FDDI, that locates the root causes of faulty programs using the disparities between two sets of dynamic invariants generated from passing and failing test cases respectively. The intuition is that a variable is likely to be related to the root cause if it is involved in relations that do not hold simultaneously in a certain number of failed runs and passed runs. In FDDI, spectrum-based fault localization (SBFL) techniques are first applied at the function level to find suspect functions. Dynamic invariants are then generated for these functions one by one. Once the dynamic invariants have been detected, FDDI uses them to locate the faulty source lines. We illustrate this with the snippet shown in Fig. 1, which contains a bug at line 5, where “if(d < 6)” should actually be “if(d < 5)”. Consider the function f() in Fig. 1 and suppose we have only one invariant schema: i < j, where i and j are metavariables. At the entry of f(), instantiating this schema produces 12 concrete potential invariants.
The upper right portion of Fig. 1 shows the set of invariants that have not been falsified by any of the preceding passed tests: only the potential invariant “b < d” survived all three passed executions of function f(). Similarly, as shown in the bottom right portion of Fig. 1, the potential invariants “b < d” and “c < d” survived all three failed executions of f(). Running a passed test suite and a failed test suite thus yields two likely invariant sets, {b < d} and {b < d, c < d}, whose disparity is {c < d}. FDDI extracts the variables “c” and “d” from the disparity and uses them to locate the suspect statements at line 2 and line 5. As a result, line 5, which is exactly the root cause, is captured by our method.
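The falsification and disparity steps of this example can be sketched in Python. This is a minimal illustration rather than FDDI's implementation: the variable names `a`, `b`, `c`, `d`, the helper names `instantiate` and `surviving`, and all run values are assumptions, chosen only so that the surviving sets match the ones described above. Fig. 1's actual values are not reproduced here.

```python
from itertools import permutations

def instantiate(variables):
    """Instantiate the schema i < j over all ordered pairs of distinct variables."""
    return set(permutations(variables, 2))

def surviving(candidates, runs):
    """Keep each candidate (x, y) for which x < y held in every observed run."""
    return {(x, y) for (x, y) in candidates
            if all(run[x] < run[y] for run in runs)}

# Assumed variable names: four variables give 4 * 3 = 12 candidates.
VARIABLES = ["a", "b", "c", "d"]

# Hypothetical variable values at the entry of f(), one dict per execution.
PASSED_RUNS = [
    {"a": 3, "b": 1, "c": 7, "d": 6},
    {"a": 0, "b": 2, "c": 9, "d": 8},
    {"a": 9, "b": 4, "c": 2, "d": 7},
]
FAILED_RUNS = [
    {"a": 6, "b": 1, "c": 4, "d": 5},
    {"a": 0, "b": 3, "c": 2, "d": 5},
    {"a": 5, "b": 4, "c": 1, "d": 5},
]

candidates = instantiate(VARIABLES)
passed_invariants = surviving(candidates, PASSED_RUNS)
failed_invariants = surviving(candidates, FAILED_RUNS)

# The disparity is the set of invariants that hold in only one of the two suites.
disparity = passed_invariants ^ failed_invariants
suspect_variables = {v for pair in disparity for v in pair}

print(sorted(passed_invariants))  # [('b', 'd')]
print(sorted(failed_invariants))  # [('b', 'd'), ('c', 'd')]
print(sorted(suspect_variables))  # ['c', 'd']
```

The symmetric difference is used so that an invariant counts as a disparity whether it survives only the passed runs or only the failed runs.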
FDDI is neither restricted to certain error types nor dependent on any monitoring environment. It works in two stages. In the first stage, a block-hit-spectrum-based technique is applied to rank functions by their suspiciousness, and the n most suspicious functions are selected for further analysis. This lets our method concentrate on a small portion of the program at a time, and it avoids the problem that a large number of program variables overwhelms existing invariant detection techniques. In the second stage, for each suspicious function, two sets of dynamic invariants are first generated by running a passed and a failed test suite respectively. The statement-based reduction strategy (Yu et al., 2008) is applied to generate the passed and the failed test suites. Variables in the difference of the two sets are then used to find data-related statements in the function by static analysis, which reduces the computational cost of filtering redundant and spurious invariants. These statements are returned to developers ordered first by the suspiciousness of the functions in which they appear and then by their order of appearance within each function.
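The ordering of the final report can be sketched as follows. This is a simplified illustration under stated assumptions: whole-identifier matching stands in for the static analysis that finds data-related statements, and the function names, suspiciousness scores, bodies, and disparity variables are all hypothetical inputs that the two stages would normally produce.

```python
import re

def related_lines(body, variables):
    """Line numbers (1-based) whose text mentions any variable as a whole identifier.

    Simplification: textual matching replaces real static data-dependence analysis.
    """
    if not variables:
        return []
    names = "|".join(re.escape(v) for v in sorted(variables))
    pattern = re.compile(r"\b(?:%s)\b" % names)
    return [no for no, text in enumerate(body, start=1) if pattern.search(text)]

def report(functions, top_n):
    """Return (function, line) pairs: most suspicious function first, then line order."""
    ranked = sorted(functions, key=lambda f: f["score"], reverse=True)[:top_n]
    return [(f["name"], no)
            for f in ranked
            for no in related_lines(f["body"], f["disparity_vars"])]

# Hypothetical stage-1 scores and stage-2 disparity variables.
FUNCTIONS = [
    {"name": "h", "score": 0.5, "disparity_vars": {"q"},
     "body": ["int h(int p) {", "  int q = p * 2;", "  return q;", "}"]},
    {"name": "g", "score": 0.8, "disparity_vars": {"z"},
     "body": ["int g(int x, int y) {", "  int z = x + y;", "  if (z < 6)",
              "    return x;", "  return y;", "}"]},
    {"name": "k", "score": 0.1, "disparity_vars": {"w"},
     "body": ["int k(int w) {", "  return w;", "}"]},
]

result = report(FUNCTIONS, top_n=2)
print(result)  # [('g', 2), ('g', 3), ('h', 2), ('h', 3)]
```

With `top_n=2`, the least suspicious function `k` is never analyzed, which mirrors how restricting invariant detection to a few functions keeps its cost down.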
The main contributions of this paper are as follows: (1) We propose to use variables in the disparity of two dynamic invariant sets, generated from a failed and a passed test suite respectively, to locate bugs in a faulty program. (2) To reduce the high computational cost of current invariant detection methods, FDDI employs an existing dynamic invariant detection tool, such as Daikon, to generate dynamic invariants within the scope of one highly suspect function at a time, and block-hit-spectrum-based techniques are applied to rank the functions of a faulty program. (3) The experimental results show that FDDI locates 75% of 360 common faults in 6 real-life utility programs when examining up to 10% of the executed code, while Naish2, Ochiai and Jaccard all locate around 53%.
The remainder of this paper is organized as follows: Section 2 presents the proposed approach in detail. Section 3 evaluates the proposed approach and presents the experimental results. Section 4 presents the related work. Section 5 concludes the paper and highlights some future work.
Section snippets
FDDI
In this section, we first give a top-level view of FDDI and present its primary parts. We then describe and analyze the FDDI algorithm.
Experiments and results
In this section, we evaluate the performance of the FDDI approach on real projects. First, we present the experimental setup. Then, we discuss the results of our empirical study and demonstrate the explanatory capabilities of FDDI through two examples. Finally, we close the section with a discussion of threats to validity.
Related work
Fault localization has been an active area of research for the past decades. Since many techniques have emerged for automated program debugging, we only present the works that are most relevant to ours.
Spectrum-based fault localization (SBFL) is a testing based program debugging approach. It utilizes program spectra and testing results collected during software testing, and applies risk evaluation formulas on each block to calculate a real value that indicates the risk of being faulty
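As an illustration of such risk evaluation formulas, the sketch below computes the standard Ochiai, Jaccard and Naish2 scores from the four spectrum counts of a block: `ef` and `ep` are the numbers of failed and passed tests that execute the block, `nf` and `np_` the numbers that do not. The coverage data and the helper names `spectrum` and `rank` are hypothetical, and the sketch is not taken from the paper's implementation.

```python
from math import sqrt

def ochiai(ef, ep, nf, np_):
    denom = sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0

def jaccard(ef, ep, nf, np_):
    denom = ef + nf + ep
    return ef / denom if denom else 0.0

def naish2(ef, ep, nf, np_):
    return ef - ep / (ep + np_ + 1)

def spectrum(block, tests):
    """Count (ef, ep, nf, np_) for one block from (covered_blocks, passed) records."""
    ef = sum(1 for cov, passed in tests if block in cov and not passed)
    ep = sum(1 for cov, passed in tests if block in cov and passed)
    nf = sum(1 for cov, passed in tests if block not in cov and not passed)
    np_ = sum(1 for cov, passed in tests if block not in cov and passed)
    return ef, ep, nf, np_

def rank(blocks, tests, formula):
    """Order blocks from most to least suspicious under the given formula."""
    return sorted(blocks, key=lambda b: formula(*spectrum(b, tests)), reverse=True)

# Hypothetical coverage: each record is (blocks executed, whether the test passed).
TESTS = [
    ({"B1", "B2"}, True),
    ({"B1", "B3"}, False),
    ({"B1", "B3"}, False),
    ({"B1", "B2", "B3"}, True),
]
BLOCKS = ["B1", "B2", "B3"]

for formula in (ochiai, jaccard, naish2):
    print(formula.__name__, rank(BLOCKS, TESTS, formula))
```

On this toy data all three formulas rank `B3` first, because it is executed by every failed test and by only one passed test.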
Conclusions and future work
In this paper, we proposed a novel automated fault localization method that uses variables in the disparities of two dynamic invariant sets, generated by running a suite of failed and a suite of passed test cases respectively. The underlying idea is that variables may be related to the fault if they are involved in dynamic invariants that do not hold simultaneously in a set of failed and passed tests. FDDI combines the spectrum-based fault localization method with the dynamic invariant detecting
Acknowledgements
The authors wish to thank Kailun Luo for his valuable suggestions for improving this paper. This work is supported by State Key Laboratory of Software Development Environment Open Fund (SKLSDE-2012KF-08).
Xiaoyan Wang is a Lecturer in the School of Management Science and Engineering at Nanjing Audit University, China. She received her PhD from the School of Information Science Technology of Sun Yat-sen University in 2015. Her research interests include software engineering issues in program analysis, testing, debugging and web service composition.
References (40)
- et al. A dynamic code coverage approach to maximize fault localization efficiency. J. Syst. Softw. (2014)
- et al. Automatic software fault localization using generic program invariants. Proceedings of the 2008 ACM Symposium on Applied computing (2008)
- et al. On the Performance of Fault Screeners in Software Development and Deployment. Technical Report (2008)
- et al. A practical evaluation of spectrum-based fault localization. J. Syst. Softw. (2009)
- et al. An evaluation of similarity coefficients for software fault localization. Proceedings of the 12th Pacific Rim International Symposium on Dependable Computing (PRDC’06) (2006)
- et al. On the accuracy of spectrum-based fault localization. Testing: Academic and Industrial Conference Practice and Research Techniques-MUTATION (2007)
- et al. Causal inference for statistical fault localization. Proceedings of the 2010 International Symposium on Software Testing and Analysis (ISSTA’10) (2010)
- et al. Pinpoint: problem determination in large, dynamic internet services. Proceedings of International Conference on Dependable Systems and Networks (DSN’02) (2002)
- et al. Holmes: Effective statistical debugging via efficient path profiling. Proceedings of 31st IEEE International Conference on Software Engineering (ICSE’09) (2009)
- et al. Locating causes of program failures. Proceedings of the 27th international conference on Software engineering (ICSE’05) (2005)
- Dynamically discovering likely program invariants
- Dynamically discovering likely program invariants to support program evolution. IEEE Trans. Softw. Eng.
- Using html5 visualizations in software fault localization. Software Visualization (VISSOFT), 2013 First IEEE Working Conference on
- Tracking down software bugs using automatic anomaly detection. Proceedings of the 24th international conference on Software engineering (ICSE’02)
- On the adoption of mc/dc and control-flow adequacy for a tight integration of program testing and statistical fault localization. Inf. Softw. Technol.
- How well do test case prioritization techniques support statistical fault localization. Proceedings of the 33rd Annual International Computer Software and Applications Conference (COMPSAC’09)
- Empirical evaluation of the tarantula automatic fault-localization technique. Proceedings of the 20th IEEE/ACM international Conference on Automated software engineering (ASE’05)
- Visualization of test information to assist fault localization. Proceedings of the 24th International Conference on Software Engineering (ICSE’02)
- Study of the relationship of bug consistency with respect to performance of spectra metrics. Proceedings of the 2nd IEEE International Conference on Computer Science and Information Technology (ICCSIT’09)
- A model for spectra-based software diagnosis. ACM Trans. Softw. Eng. Methodol
Yongmei Liu is a Professor of Computer Science at Sun Yat-sen University, China. She received her PhD in Computer Science from University of Toronto in 2006. Her research interests lie in Artificial Intelligence, knowledge representation and reasoning, cognitive robotics, program verification and debugging.