research-article

DARWIN: An approach to debugging evolving programs

Published: 03 July 2012
Abstract

Bugs in programs are often introduced when programs evolve from a stable version to a new version. In this article, we propose a new approach called DARWIN for automatically finding potential root causes of such bugs. Given two programs—a reference program and a modified program—and an input that fails on the modified program, our approach uses symbolic execution to automatically synthesize a new input that (a) is very similar to the failing input and (b) does not fail. We find the potential cause(s) of failure by comparing control-flow behavior of the passing and failing inputs and identifying code fragments where the control flows diverge.
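The two steps described above (finding a nearby passing input, then comparing branch behavior) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: `modified_program`, `run`, `nearby_passing_input`, and `first_divergence` are hypothetical names, the toy program is not from the article, and a brute-force search over neighboring integers stands in for the symbolic execution that DARWIN actually uses. The sketch also inspects only the modified program, whereas the approach in the article works with the reference program as well.

```python
from typing import Callable, List, Optional, Tuple

Trace = List[Tuple[str, bool]]  # (branch label, outcome taken)


def modified_program(n: int, trace: Trace) -> int:
    """Hypothetical modified version; the guard should have been n <= 0."""
    guard = n < 0
    trace.append(("guard: n < 0", guard))
    if guard:
        return 0
    return 100 // n  # raises ZeroDivisionError when n == 0


def run(program: Callable[[int, Trace], int], n: int) -> Tuple[bool, Trace]:
    """Run the program and return (passed, recorded branch trace)."""
    trace: Trace = []
    try:
        program(n, trace)
        return True, trace
    except ZeroDivisionError:
        return False, trace


def nearby_passing_input(
    program: Callable[[int, Trace], int], failing: int, radius: int = 10
) -> Optional[int]:
    """Brute-force stand-in for symbolic execution (an assumption of this
    sketch): return the closest passing input whose branch trace differs
    from the trace of the failing run."""
    _, failing_trace = run(program, failing)
    for distance in range(1, radius + 1):
        for candidate in (failing - distance, failing + distance):
            passed, trace = run(program, candidate)
            if passed and trace != failing_trace:
                return candidate
    return None


def first_divergence(a: Trace, b: Trace) -> Optional[str]:
    """Label of the first branch, within the common prefix of both traces,
    where the label or the outcome differs; None if there is none."""
    for (label_a, out_a), (label_b, out_b) in zip(a, b):
        if label_a != label_b or out_a != out_b:
            return label_a
    return None


failing_input = 0
passing_input = nearby_passing_input(modified_program, failing_input)
assert passing_input is not None  # holds for this toy program

_, fail_trace = run(modified_program, failing_input)
_, pass_trace = run(modified_program, passing_input)
print(passing_input, first_divergence(fail_trace, pass_trace))
# prints: -1 guard: n < 0
```

In this toy case the divergence lands on the faulty guard, which is the kind of code fragment the comparison is meant to surface. A real implementation would derive the alternate input from path conditions with a constraint solver rather than by enumeration.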

A notable feature of our approach is that it handles hard-to-explain bugs, such as missing-code errors, by pointing to code in the reference program. We have implemented this approach and conducted experiments on several real-world applications, such as the Apache Web server, libPNG (a library for manipulating PNG images), and TCPflow (a program for displaying data sent through TCP connections). In each of these applications, DARWIN was able to localize bugs with high accuracy. Even though these applications contain several thousand lines of code, DARWIN could usually narrow down the potential root cause(s) to fewer than ten lines. In addition, we find that the inputs synthesized by DARWIN provide additional value by revealing other undiscovered errors.

          • Published in

            ACM Transactions on Software Engineering and Methodology  Volume 21, Issue 3
            June 2012
            239 pages
            ISSN:1049-331X
            EISSN:1557-7392
            DOI:10.1145/2211616

            Copyright © 2012 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Publication History

            • Published: 3 July 2012
            • Accepted: 1 February 2011
            • Revised: 1 May 2010
            • Received: 1 October 2009
            Qualifiers

            • research-article
            • Research
            • Refereed
