Abstract
In software evolution, when a test in a regression test suite fails, developers must first determine whether the failure is caused by a bug in the source code under test or by obsoleteness of the test code; only after finding the cause of a failure can they decide whether to fix the bug or repair the obsolete test. Researchers have proposed several techniques to automate test repair, but such techniques typically assume that every test failure is due to an obsolete test. Consequently, they may not be applicable in real-world software evolution, where developers do not know in advance whether a failure stems from a bug or an obsolete test. To determine whether the cause of a test failure lies in the source code under test or in the test code, we view this problem as a classification problem and propose an automatic approach based on machine learning. Specifically, we target Java software using the JUnit testing framework and collect a set of features that may be related to test failures. Using these features, we apply the best-first decision tree learning algorithm to train a classifier on existing regression test failures, and then use the classifier to classify future failed tests. We evaluated our approach on two Java programs in three scenarios (within the same version of a program, across different versions of a program, and across different programs), and found that our approach can effectively classify the causes of failed tests.
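The pipeline the abstract describes — represent each failed test as a feature vector, train a decision tree on labeled past failures, then classify new failures as "bug" or "obsolete test" — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the feature names are hypothetical (the paper's actual feature set is not given here), and a one-level information-gain tree (a decision stump) stands in for the best-first decision tree learner the paper uses.

```python
import math
from collections import Counter

# Hypothetical training data: one feature vector per past failed test, plus
# the known cause. Features (all illustrative, not the paper's feature set):
#   test_edited      - 1 if the test code itself changed since it last passed
#   src_methods_chg  - number of changed source methods the test covers
#   assert_failure   - 1 if the failure is an assertion failure (vs. exception)
TRAIN = [
    # (test_edited, src_methods_chg, assert_failure), label
    ((1, 0, 1), "obsolete_test"),
    ((1, 1, 1), "obsolete_test"),
    ((0, 3, 1), "bug"),
    ((0, 2, 0), "bug"),
    ((0, 1, 0), "bug"),
    ((1, 0, 0), "obsolete_test"),
]

def entropy(labels):
    """Shannon entropy of a label multiset."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def best_split(data):
    """Pick the (feature, threshold) pair with the highest information gain."""
    base = entropy([lbl for _, lbl in data])
    best = None
    for f in range(len(data[0][0])):
        for thr in sorted({x[f] for x, _ in data}):
            left = [lbl for x, lbl in data if x[f] <= thr]
            right = [lbl for x, lbl in data if x[f] > thr]
            if not left or not right:
                continue
            gain = base - (len(left) / len(data)) * entropy(left) \
                        - (len(right) / len(data)) * entropy(right)
            if best is None or gain > best[0]:
                best = (gain, f, thr)
    return best

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

# "Train": pick the single most informative split, label each side by majority.
_, feat, thr = best_split(TRAIN)
left_label = majority([lbl for x, lbl in TRAIN if x[feat] <= thr])
right_label = majority([lbl for x, lbl in TRAIN if x[feat] > thr])

def classify(x):
    """Classify a new failed test's feature vector."""
    return left_label if x[feat] <= thr else right_label

# A new failure whose test code was edited: classified as an obsolete test.
print(classify((1, 0, 1)))  # -> obsolete_test
```

A full best-first learner would repeat this split selection, always expanding the frontier leaf with the highest gain until a node-count budget is reached; the stump above shows only the split-selection step at the root.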
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
Cite this paper
Hao, D., Lan, T., Zhang, H., Guo, C., Zhang, L. (2013). Is This a Bug or an Obsolete Test?. In: Castagna, G. (eds) ECOOP 2013 – Object-Oriented Programming. ECOOP 2013. Lecture Notes in Computer Science, vol 7920. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39038-8_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39037-1
Online ISBN: 978-3-642-39038-8