
Is This a Bug or an Obsolete Test?

  • Conference paper
ECOOP 2013 – Object-Oriented Programming (ECOOP 2013)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 7920)


Abstract

When developers run a test suite during software evolution, they typically need to determine whether each test failure is caused by a bug in the source code under test or by an obsolete test. Only after identifying the cause of a failure can developers decide whether to fix the bug or repair the obsolete test. Researchers have proposed several techniques to automate test repair. However, test-repair techniques typically assume that test failures are always due to obsolete tests, so such techniques may not be applicable in real-world software evolution, where developers do not know in advance whether a failure stems from a bug or an obsolete test. To determine whether the cause of a test failure lies in the source code under test or in the test code, we view this problem as a classification problem and propose an automatic approach based on machine learning. Specifically, we target Java software using the JUnit testing framework and collect a set of features that may be related to test failures. Using this set of features, we adopt the Best-first Decision Tree Learning algorithm to train a classifier, taking existing regression test failures as training instances, and then use the classifier to classify future failed tests. We evaluated our approach on two Java programs in three scenarios (within the same version of a program, across different versions of a program, and across different programs), and found that our approach can effectively classify the causes of failed tests.
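The approach frames each failed test as a feature vector labeled either "bug" or "obsolete test" and trains a best-first decision tree over such vectors. As a toy illustration only (not the paper's implementation: the feature names, the training data, and the labels below are invented, and a single best split stands in for the full best-first tree, which repeatedly expands the node with the largest impurity reduction), the split-selection step could be sketched like this:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature, gain) for the boolean feature whose split
    yields the largest reduction in entropy (information gain)."""
    base = entropy(labels)
    best = None
    for f in rows[0]:
        yes = [l for r, l in zip(rows, labels) if r[f]]
        no = [l for r, l in zip(rows, labels) if not r[f]]
        if not yes or not no:
            continue  # split puts everything on one side; useless
        gain = (base
                - (len(yes) / len(labels)) * entropy(yes)
                - (len(no) / len(labels)) * entropy(no))
        if best is None or gain > best[1]:
            best = (f, gain)
    return best

def classify(row, feature, rows, labels):
    """Predict by majority label among training rows on the
    same side of the chosen split as `row`."""
    side = [l for r, l in zip(rows, labels) if r[feature] == row[feature]]
    return Counter(side).most_common(1)[0][0]

# Hypothetical training instances: one feature vector per past failed test,
# labeled with the (manually determined) cause of that failure.
train = [
    ({"test_code_edited": True,  "covers_changed_source": False}, "OBSOLETE_TEST"),
    ({"test_code_edited": True,  "covers_changed_source": True},  "OBSOLETE_TEST"),
    ({"test_code_edited": False, "covers_changed_source": True},  "BUG"),
    ({"test_code_edited": False, "covers_changed_source": True},  "BUG"),
]
rows = [r for r, _ in train]
labels = [l for _, l in train]

feature, gain = best_split(rows, labels)
pred = classify({"test_code_edited": False, "covers_changed_source": True},
                feature, rows, labels)
```

In the full algorithm, this greedy split selection is applied node by node, always growing the leaf whose split promises the largest gain, which is what distinguishes best-first expansion from the usual depth-first tree construction.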





Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hao, D., Lan, T., Zhang, H., Guo, C., Zhang, L. (2013). Is This a Bug or an Obsolete Test? In: Castagna, G. (ed.) ECOOP 2013 – Object-Oriented Programming. ECOOP 2013. Lecture Notes in Computer Science, vol 7920. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39038-8_25


  • DOI: https://doi.org/10.1007/978-3-642-39038-8_25

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-39037-1

  • Online ISBN: 978-3-642-39038-8

