Fault localization using disparities of dynamic invariants

doi:10.1016/j.jss.2016.09.014

Journal of Systems and Software

Volume 122, December 2016, Pages 144-154

Highlights

• We use disparities of dynamic invariants to locate faults in suspect functions.
• Our aim is to apply existing invariant detection tools efficiently to locate faults.
• Our method does not require filtering false positives or redundant invariants.
• 75% of 360 common faults are located when examining up to 10% of the executed code.

Abstract

Violations of dynamic invariants may offer useful clues for identifying faults in programs. Techniques that use such violations to detect anomalies have been developed, but some are constrained by the high computational cost of invariant detection, false-positive filtering, and redundancy removal, while others can discover only a few specific types of faults and require a complete monitoring environment. This paper presents FDDI, a novel fault localization approach that uses disparities of dynamic invariants. To use invariant detection tools more efficiently, FDDI first selects highly suspect functions via spectrum-based fault localization techniques and then applies invariant detection tools to these functions one by one. For each suspect function, FDDI takes the variables involved in dynamic invariants that do not hold simultaneously in a set of passed tests and a set of failed tests and uses them for further analysis, which reduces the time spent filtering false positives and redundant invariants. Finally, FDDI locates the statements that are data-related to these variables. The experimental results show that FDDI locates 75% of 360 common faults in utility programs when examining up to 10% of the executed code, while Naish2, Ochiai and Jaccard each locate around 53%.

Introduction

Dynamic invariants are relations among variables that are observed to hold at certain locations in some runs of a program (Nguyen et al., 2012). They can be inserted as assertion statements to detect abnormal program behavior, collected to generate likely documentation and formal specifications, and used in program understanding (Zeller, 2009). In particular, violations of dynamic invariants can offer useful clues for fault localization and better explain the localization result. However, current applications of dynamic invariants in automated program debugging are inefficient because detecting dynamic invariants is computationally expensive. Daikon (Ernst et al., 2001) maintains an invariant pattern library, which limits its ability to detect invariants with rich expressive power. GenInv (Nguyen et al., 2012) combines mathematical techniques to find more expressive dynamic invariants, at the cost of more complex detection. Diduce (Hangal and Lam, 2002) and ClearView (Perkins et al., 2009) can detect and patch only specific types of errors because of the monitoring mechanisms they require. More recently, Sahoo et al. (2013) combine dynamic program invariants with more sophisticated filtering techniques to identify a set of dynamic invariants that hold in selected passed runs but not in failing tests, and return the locations of these invariants as the localization results.

In this paper, we present FDDI, a novel automated fault localization approach that locates the root causes of faulty programs using the disparities between two sets of dynamic invariants generated from passing and failing test cases, respectively. The intuition is that a variable is likely to be related to the root cause if it is involved in relations that do not hold simultaneously in a certain number of failed runs and passed runs. In FDDI, spectrum-based fault localization (SBFL) techniques are first applied at the function level to find suspect functions. Dynamic invariants are then generated for these functions one by one. After detecting dynamic invariants, how does FDDI go on to locate the faulty source lines? Below, we demonstrate how disparities of dynamic invariants locate faults, using a snippet that contains a bug at line 5, where “if(d < 6)” should be “if(d < 5)”, as shown in Fig. 1. Consider the function f() in Fig. 1 and suppose we have only one invariant schema: i < j, where i and j are metavariables. At the entry of f(), instantiating this schema produces 12 concrete potential invariants.
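Twelve instances is what one obtains from the ordered pairs of four distinct variables (4 × 3 = 12). The following Python sketch enumerates them under the assumption that the variables visible at the entry of f() are named a, b, c and d. Only b, c and d are named in the text, so a is a hypothetical placeholder, and `instantiate_schema` is an illustrative helper rather than part of any invariant detection tool.

```python
from itertools import permutations


def instantiate_schema(variables):
    """Return every instance of the schema i < j over distinct variables."""
    return [f"{i} < {j}" for i, j in permutations(variables, 2)]


# Assumed variable names; "a" is a placeholder not named in the text.
candidates = instantiate_schema(["a", "b", "c", "d"])
print(len(candidates))  # 12
print(candidates)
```

Each candidate is then checked against the values observed at the program point, and a candidate is dropped as soon as one execution falsifies it.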

The upper right portion of Fig. 1 shows the set of invariants that have not been falsified by any of the preceding passed tests. Only the potential invariant “b < d” survives all three passed executions of f(). Similarly, as shown in the bottom right portion of Fig. 1, the potential invariants “b < d” and “c < d” survive all three failed executions of f(). Thus, running a passed test suite and a failed test suite yields the two likely invariant sets {b < d} and {b < d, c < d}, respectively, and the disparity between them is {c < d}. FDDI extracts the variables “c” and “d” from the disparity and uses them to locate the suspect statements at line 2 and line 5. As a result, line 5, which is exactly the root cause, is captured by our method.
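The falsification and disparity steps can be sketched in a few lines of Python. The entry values below are hypothetical and are not taken from Fig. 1; they were chosen only so that the surviving sets match the ones described in the text. The variable name a and the helper `surviving_invariants` are likewise assumptions made for illustration, and the disparity is modeled here as the symmetric difference of the two sets.

```python
from itertools import permutations


def surviving_invariants(runs):
    """Keep the instances of i < j that hold in every observed run."""
    names = sorted(runs[0])
    return {
        (i, j)
        for i, j in permutations(names, 2)
        if all(run[i] < run[j] for run in runs)
    }


# Hypothetical values observed at the entry of f(); not taken from Fig. 1.
passed_runs = [
    {"a": 3, "b": 1, "c": 5, "d": 4},
    {"a": 0, "b": 2, "c": 1, "d": 3},
    {"a": 9, "b": 1, "c": 2, "d": 5},
]
failed_runs = [
    {"a": 0, "b": 1, "c": 2, "d": 7},
    {"a": 9, "b": 3, "c": 2, "d": 8},
    {"a": 5, "b": 4, "c": 4, "d": 6},
]

passed_inv = surviving_invariants(passed_runs)  # {('b', 'd')}
failed_inv = surviving_invariants(failed_runs)  # {('b', 'd'), ('c', 'd')}

# Invariants that hold in one suite but not in the other.
disparity = passed_inv ^ failed_inv
suspect_vars = sorted({v for pair in disparity for v in pair})
print(disparity)     # {('c', 'd')}
print(suspect_vars)  # ['c', 'd']
```

No filtering of spurious or redundant invariants is needed in this sketch: an invariant such as “b < d” that holds in both suites simply cancels out of the disparity.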

FDDI is neither restricted to certain error types nor dependent on a monitoring environment. It works in two stages. In the first stage, a block hit spectrum based technique ranks functions by their suspiciousness, and the n most suspicious functions are selected for further analysis. This lets our method concentrate on a small portion of the program at a time, and it avoids the problem that a large number of program variables overwhelms existing invariant detection techniques. In the second stage, for each suspicious function, two sets of dynamic invariants are generated by running a passed test suite and a failed test suite, respectively. The statement-based reduction strategy (Yu et al., 2008) is applied to generate the passed and failed test suites. The variables in the difference of the two sets are then used to find data-related statements in the function by static analysis, which reduces the computational cost of filtering redundant and spurious invariants. These statements are returned to developers ordered first by the suspiciousness of the functions in which they appear and then by their order of appearance within each function.
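A minimal Python sketch of this two-stage flow and of the final ordering is given below. It is an illustration under several assumptions rather than the authors' implementation: the function names, suspiciousness scores and source lines are hypothetical, and a whole-word textual match stands in for the static data-dependence analysis described above.

```python
import re


def select_suspect_functions(scores, n):
    """Stage 1: keep the n functions with the highest suspiciousness."""
    return sorted(scores, key=scores.get, reverse=True)[:n]


def data_related_lines(body, variables):
    """Stage 2, simplified: lines that mention a disparity variable.

    A whole-word match is a crude stand-in for real static analysis.
    """
    if not variables:
        return []
    names = "|".join(re.escape(v) for v in sorted(variables))
    pattern = re.compile(r"\b(" + names + r")\b")
    return [no for no, text in sorted(body.items()) if pattern.search(text)]


def report(scores, bodies, disparity_vars, n):
    """Order statements by function suspiciousness, then by line order."""
    ranked = []
    for func in select_suspect_functions(scores, n):
        variables = disparity_vars.get(func, set())
        for line_no in data_related_lines(bodies[func], variables):
            ranked.append((func, line_no))
    return ranked


# Hypothetical inputs: scores, source lines and disparity variables.
scores = {"f": 0.9, "g": 0.4, "h": 0.7}
bodies = {
    "f": {1: "int x = a + b;", 2: "d = c * 2;", 3: "x = x + 1;", 5: "if (d < 6)"},
    "g": {20: "return 0;"},
    "h": {10: "k = m + 1;", 11: "return k;"},
}
disparity_vars = {"f": {"c", "d"}, "h": {"m"}}

result = report(scores, bodies, disparity_vars, n=2)
print(result)  # [('f', 2), ('f', 5), ('h', 10)]
```

Restricting invariant detection to the selected functions is what keeps the number of variables, and therefore the number of candidate invariants, small in each run.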

The main contributions of this paper are as follows: (1) We propose using the variables in the disparity of two dynamic invariant sets, generated from a failed and a passed test suite respectively, to locate bugs in a faulty program. (2) To reduce the high computational cost of current invariant detection methods, FDDI employs an existing dynamic invariant detection tool, such as Daikon, to generate dynamic invariants within the scope of one highly suspect function at a time, and block hit spectrum based techniques are applied to rank the functions of a faulty program. (3) The experimental results show that FDDI locates 75% of 360 common faults in 6 real-life utility programs when examining up to 10% of the executed code, while Naish2, Ochiai and Jaccard each locate around 53%.

The remainder of this paper is organized as follows: Section 2 describes the proposed approach in detail. Section 3 evaluates the proposed approach and presents the experimental results. Section 4 presents the related work. Section 5 concludes this paper and highlights some future work.

Section snippets

FDDI

In this section, we first give a top-level view of FDDI and present its primary parts. We then describe and analyze the FDDI algorithm.

Experiments and results

In this section, we evaluate the performance of the FDDI approach on real projects. First, we present the experimental setup. Then, we discuss the results of our empirical study and demonstrate the explanatory capabilities of FDDI through two examples. Finally, we close this section with a discussion of threats to validity.

Related work

Fault localization has been an active area of research for the past decades. Since many techniques have emerged for automated program debugging, we present only the works that are highly relevant to ours.

Spectrum-based fault localization (SBFL) is a testing based program debugging approach. It utilizes program spectra and testing results collected during software testing, and applies risk evaluation formulas on each block to calculate a real value that indicates the risk of being faulty
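For illustration, the three risk evaluation formulas that FDDI is compared against (Naish2, Ochiai and Jaccard) can be written in Python as follows, using their standard definitions from the SBFL literature. The parameter names are notation assumed for this sketch: ef and ep count the failed and passed tests that execute a block, and nf and np_ count the failed and passed tests that do not.

```python
import math


def ochiai(ef, ep, nf, np_):
    """Ochiai: ef / sqrt((ef + nf) * (ef + ep))."""
    denom = math.sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0


def jaccard(ef, ep, nf, np_):
    """Jaccard: ef / (ef + nf + ep)."""
    denom = ef + nf + ep
    return ef / denom if denom else 0.0


def naish2(ef, ep, nf, np_):
    """Naish2: ef - ep / (ep + np + 1)."""
    return ef - ep / (ep + np_ + 1)


# A block executed by 3 of 4 failed tests and 1 of 6 passed tests.
print(ochiai(3, 1, 1, 5))   # 0.75
print(jaccard(3, 1, 1, 5))  # 0.6
print(naish2(3, 1, 1, 5))
```

Blocks are then sorted by descending score, so that the blocks most strongly associated with failing executions are examined first.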

Conclusions and future work

In this paper, we proposed a novel automated fault localization method that uses the variables in the disparities of two dynamic invariant sets, generated by running a suite of failed test cases and a suite of passed test cases, respectively. The underlying idea is that variables may be related to the fault if they are involved in dynamic invariants that do not hold simultaneously in a set of failed tests and a set of passed tests. FDDI combines the spectrum-based fault localization method with the dynamic invariant detecting

Acknowledgements

The authors wish to thank Kailun Luo for his valuable suggestions for improving this paper. This work is supported by the State Key Laboratory of Software Development Environment Open Fund (SKLSDE-2012KF-08).

Xiaoyan Wang is a Lecturer in the School of Management Science and Engineering at Nanjing Audit University, China. She received her PhD from the School of Information Science Technology of Sun Yat-sen University in 2015. Her research interests include software engineering issues in program analysis, testing, debugging and web service composition.

References (40)

A. Perez et al. A dynamic code coverage approach to maximize fault localization efficiency. J. Syst. Softw. (2014)
R. Abreu et al. Automatic software fault localization using generic program invariants. Proceedings of the 2008 ACM Symposium on Applied computing (2008)
R. Abreu et al. On the Performance of Fault Screeners in Software Development and Deployment. Technical Report (2008)
R. Abreu et al. A practical evaluation of spectrum-based fault localization. J. Syst. Softw. (2009)
R. Abreu et al. An evaluation of similarity coefficients for software fault localization. Proceedings of the 12th Pacific Rim International Symposium on Dependable Computing (PRDC’06) (2006)
R. Abreu et al. On the accuracy of spectrum-based fault localization. Testing: Academic and Industrial Conference Practice and Research Techniques-MUTATION (2007)
G.K. Baah et al. Causal inference for statistical fault localization. Proceedings of the 2010 International Symposium on Software Testing and Analysis (ISSTA’10) (2010)
M.Y. Chen et al. Pinpoint: problem determination in large, dynamic internet services. Proceedings of International Conference on Dependable Systems and Networks (DSN’02) (2002)
T.M. Chilimbi et al. Holmes: Effective statistical debugging via efficient path profiling. Proceedings of 31st IEEE International Conference on Software Engineering (ICSE’09) (2009)
H. Cleve et al. Locating causes of program failures. Proceedings of the 27th international conference on Software engineering (ICSE’05) (2005)
M.D. Ernst. Dynamically discovering likely program invariants (2000)
M.D. Ernst et al. Dynamically discovering likely program invariants to support program evolution. IEEE Trans. Softw. Eng. (2001)
C. Gouveia et al. Using html5 visualizations in software fault localization. Software Visualization (VISSOFT), 2013 First IEEE Working Conference on (2013)
S. Hangal et al. Tracking down software bugs using automatic anomaly detection. Proceedings of the 24th international conference on Software engineering (ICSE’02) (2002)
B. Jiang et al. On the adoption of mc/dc and control-flow adequacy for a tight integration of program testing and statistical fault localization. Inf. Softw. Technol. (2012)
B. Jiang et al. How well do test case prioritization techniques support statistical fault localization. Proceedings of the 33rd Annual International Computer Software and Applications Conference (COMPSAC’09) (2009)
J.A. Jones et al. Empirical evaluation of the tarantula automatic fault-localization technique. Proceedings of the 20th IEEE/ACM international Conference on Automated software engineering (ASE’05) (2005)
J.A. Jones et al. Visualization of test information to assist fault localization. Proceedings of the 24th International Conference on Software Engineering (ICSE’02) (2002)
H.J. Lee et al. Study of the relationship of bug consistency with respect to performance of spectra metrics. Proceedings of the 2nd IEEE International Conference on Computer Science and Information Technology (ICCSIT’09) (2009)
N. Lee et al. A model for spectra-based software diagnosis. ACM Trans. Softw. Eng. Methodol. (2011)

Yongmei Liu is a Professor of Computer Science at Sun Yat-sen University, China. She received her PhD in Computer Science from the University of Toronto in 2006. Her research interests lie in artificial intelligence, knowledge representation and reasoning, cognitive robotics, and program verification and debugging.
