DARWIN: An approach to debugging evolving programs

Authors:

Abhik Roychoudhury,

Kapil Vaswani

ACM Transactions on Software Engineering and Methodology (TOSEM), Volume 21, Issue 3

Article No.: 19, Pages 1 - 29

https://doi.org/10.1145/2211616.2211622

Published: 03 July 2012

Abstract

Bugs in programs are often introduced when programs evolve from a stable version to a new version. In this article, we propose a new approach called DARWIN for automatically finding potential root causes of such bugs. Given two programs—a reference program and a modified program—and an input that fails on the modified program, our approach uses symbolic execution to automatically synthesize a new input that (a) is very similar to the failing input and (b) does not fail. We find the potential cause(s) of failure by comparing control-flow behavior of the passing and failing inputs and identifying code fragments where the control flows diverge.
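As a rough illustration of the idea in this paragraph, the following toy sketch finds a passing input close to a failing one and reports the first branch where the two executions diverge in the modified program. It is not DARWIN's implementation: the two small programs, the branch labels, and all function names are hypothetical, and a brute-force neighborhood search stands in for the symbolic execution and constraint solving that the article relies on.

```python
from typing import Callable, List, Optional, Tuple

# A trace is the sequence of (branch label, outcome) pairs seen in one run.
Trace = List[Tuple[str, bool]]
Program = Callable[[int, Trace], int]


def reference(x: int, trace: Trace) -> int:
    """Hypothetical stable version: clamp x into [0, 100]."""
    trace.append(("x < 0", x < 0))
    if x < 0:
        return 0
    trace.append(("x > 100", x > 100))
    if x > 100:
        return 100
    return x


def modified(x: int, trace: Trace) -> int:
    """Hypothetical evolved version with a regression: bound typed as 10."""
    trace.append(("x < 0", x < 0))
    if x < 0:
        return 0
    trace.append(("x > 10", x > 10))
    if x > 10:
        return 100
    return x


def run(program: Program, x: int) -> Tuple[int, Trace]:
    """Execute a program and return its output with the recorded trace."""
    trace: Trace = []
    return program(x, trace), trace


def fails(x: int) -> bool:
    """An input fails if the modified output differs from the reference."""
    return run(modified, x)[0] != run(reference, x)[0]


def nearest_passing_input(failing: int, radius: int = 1000) -> Optional[int]:
    """Brute-force stand-in for input synthesis: closest input that passes."""
    for distance in range(1, radius + 1):
        for candidate in (failing - distance, failing + distance):
            if not fails(candidate):
                return candidate
    return None


def first_divergence(a: Trace, b: Trace) -> Optional[str]:
    """Label of the first branch where the two traces disagree, if any."""
    for (label_a, taken_a), (label_b, taken_b) in zip(a, b):
        if label_a != label_b or taken_a != taken_b:
            return label_a
    return None


if __name__ == "__main__":
    failing_input = 50
    passing_input = nearest_passing_input(failing_input)
    if passing_input is not None:
        _, failing_trace = run(modified, failing_input)
        _, passing_trace = run(modified, passing_input)
        print(passing_input, first_divergence(failing_trace, passing_trace))
```

In this toy setting the reported branch is the one carrying the mistyped bound. The exhaustive search only works because the input is a single small integer; for realistic programs a solver-based search such as the one described in the article would be needed.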

A notable feature of our approach is that it handles hard-to-explain bugs, such as missing-code errors, by pointing to code in the reference program. We have implemented this approach and conducted experiments using several real-world applications, such as the Apache Web server, libPNG (a library for manipulating PNG images), and TCPflow (a program for displaying data sent through TCP connections). In each of these applications, DARWIN was able to localize bugs with high accuracy. Even though these applications contain several thousand lines of code, DARWIN could usually narrow down the potential root cause(s) to fewer than ten lines. In addition, we find that the inputs synthesized by DARWIN provide additional value by revealing other undiscovered errors.

Published In

ACM Transactions on Software Engineering and Methodology Volume 21, Issue 3

June 2012

239 pages

ISSN:1049-331X

EISSN:1557-7392

DOI:10.1145/2211616

Copyright © 2012 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 03 July 2012

Accepted: 01 February 2011

Revised: 01 May 2010

Received: 01 October 2009

Published in TOSEM Volume 21, Issue 3

Qualifiers

Research-article
Research
Refereed

Funding Sources

Ministry of Education - Singapore
