Abstract
In software evolution, when a test in a regression test suite fails, developers must first determine whether the failure is caused by a bug in the source code under test or by obsoleteness of the test code; only after finding the cause of a failure can they decide whether to fix the bug or repair the obsolete test. Researchers have proposed several techniques to automate test repair, but such techniques typically assume that every test failure is due to an obsolete test. Consequently, they may not be applicable in real-world software evolution, where developers do not know in advance whether a failure stems from a bug or an obsolete test. To determine whether the cause of a test failure lies in the source code under test or in the test code, we view this problem as a classification problem and propose an automatic approach based on machine learning. Specifically, we target Java software using the JUnit testing framework and collect a set of features that may be related to test failures. Using these features, we apply the best-first decision tree learning algorithm to train a classifier on existing regression test failures, and then use the classifier to classify future failed tests. We evaluated our approach on two Java programs in three scenarios (within the same version of a program, across different versions of a program, and across different programs), and found that our approach can effectively classify the causes of failed tests.
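The pipeline the abstract describes — represent each failed test as a feature vector, train a decision tree on labeled past failures, then classify new failures as "bug" or "obsolete test" — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the feature names are hypothetical (the paper's actual feature set is not given here), and a one-level information-gain tree (a decision stump) stands in for the best-first decision tree learner the paper uses.

```python
import math
from collections import Counter

# Hypothetical training data: one feature vector per past failed test, plus
# the known cause. Features (all illustrative, not the paper's feature set):
#   test_edited      - 1 if the test code itself changed since it last passed
#   src_methods_chg  - number of changed source methods the test covers
#   assert_failure   - 1 if the failure is an assertion failure (vs. exception)
TRAIN = [
    # (test_edited, src_methods_chg, assert_failure), label
    ((1, 0, 1), "obsolete_test"),
    ((1, 1, 1), "obsolete_test"),
    ((0, 3, 1), "bug"),
    ((0, 2, 0), "bug"),
    ((0, 1, 0), "bug"),
    ((1, 0, 0), "obsolete_test"),
]

def entropy(labels):
    """Shannon entropy of a label multiset."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def best_split(data):
    """Pick the (feature, threshold) pair with the highest information gain."""
    base = entropy([lbl for _, lbl in data])
    best = None
    for f in range(len(data[0][0])):
        for thr in sorted({x[f] for x, _ in data}):
            left = [lbl for x, lbl in data if x[f] <= thr]
            right = [lbl for x, lbl in data if x[f] > thr]
            if not left or not right:
                continue
            gain = base - (len(left) / len(data)) * entropy(left) \
                        - (len(right) / len(data)) * entropy(right)
            if best is None or gain > best[0]:
                best = (gain, f, thr)
    return best

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

# "Train": pick the single most informative split, label each side by majority.
_, feat, thr = best_split(TRAIN)
left_label = majority([lbl for x, lbl in TRAIN if x[feat] <= thr])
right_label = majority([lbl for x, lbl in TRAIN if x[feat] > thr])

def classify(x):
    """Classify a new failed test's feature vector."""
    return left_label if x[feat] <= thr else right_label

# A new failure whose test code was edited: classified as an obsolete test.
print(classify((1, 0, 1)))  # -> obsolete_test
```

A full best-first learner would repeat this split selection, always expanding the frontier leaf with the highest gain until a node-count budget is reached; the stump above shows only the split-selection step at the root.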
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
Cite this paper
Hao, D., Lan, T., Zhang, H., Guo, C., Zhang, L. (2013). Is This a Bug or an Obsolete Test?. In: Castagna, G. (eds) ECOOP 2013 – Object-Oriented Programming. ECOOP 2013. Lecture Notes in Computer Science, vol 7920. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39038-8_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39037-1
Online ISBN: 978-3-642-39038-8