DOI: 10.1145/3293882.3330561
research-article

Failure clustering without coverage

Published: 10 July 2019

Abstract

Developing and integrating software in the automotive industry is a complex task that requires extensive testing. An important cost factor in testing and debugging is the time required to analyze failing tests. In regression testing, large numbers of tests usually fail because of only a few underlying faults. Clustering failing tests by their underlying faults can therefore reduce the required analysis time. In this paper, we propose a clustering technique that groups failing hardware-in-the-loop tests based on non-code-based features retrieved from three different sources. To reduce the analysis effort effectively, the clustering tool selects a representative test for each cluster. Instead of analyzing all failing tests, testers inspect only the representative tests to find the underlying faults. We evaluated the effectiveness and efficiency of our solution at a major automotive company using 86 regression test runs, 8743 failing tests, and 1531 faults. The results show that, with our clustering tool, testers can reduce the analysis time by more than 60% and find more than 80% of the faults by inspecting only the representative tests.
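
The workflow described in the abstract (group failing tests, then inspect one representative per group) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not name the features, the distance measure, or the clustering algorithm, so the sketch uses made-up feature tags, Jaccard distance, threshold-based single-linkage grouping, and a medoid as the representative. It is not the paper's implementation.

```python
from itertools import combinations


def jaccard_distance(a: frozenset, b: frozenset) -> float:
    """1 - |a & b| / |a | b|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cluster_failures(features: dict, threshold: float = 0.5) -> list:
    """Single-linkage grouping: tests closer than `threshold` share a cluster."""
    parent = {name: name for name in features}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in combinations(features, 2):
        if jaccard_distance(features[a], features[b]) < threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in features:
        groups.setdefault(find(name), []).append(name)
    return [sorted(g) for g in groups.values()]


def representative(cluster: list, features: dict) -> str:
    """Medoid: smallest total distance to the other cluster members (ties by name)."""
    return min(
        cluster,
        key=lambda t: (sum(jaccard_distance(features[t], features[o]) for o in cluster), t),
    )


# Hypothetical failing tests described by made-up, non-code-based feature tags.
failing = {
    "test_door_lock": frozenset({"ecu:body", "signal:lock", "error:timeout"}),
    "test_door_unlock": frozenset({"ecu:body", "signal:lock", "error:timeout", "step:3"}),
    "test_window_up": frozenset({"ecu:body", "signal:window", "error:range"}),
    "test_window_down": frozenset({"ecu:body", "signal:window", "error:range"}),
    "test_wiper": frozenset({"ecu:front", "signal:wiper", "error:no_response"}),
}

clusters = cluster_failures(failing)
to_inspect = sorted(representative(c, failing) for c in clusters)
print(to_inspect)  # ['test_door_lock', 'test_window_down', 'test_wiper']
```

In this toy input, five failing tests collapse into three clusters, so a tester would inspect three representatives instead of five tests. Whether each representative actually exposes its cluster's fault depends on how well the chosen features separate faults, which is what the paper's evaluation measures.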

Authors: Mojdeh Golagha, Constantin Lehnhoff, Alexander Pretschner, and Hermann Ilmberger
Venue: ISSTA ’19, July 15–19, 2019, Beijing, China

Published In

ISSTA 2019: Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis
July 2019
451 pages
ISBN:9781450362245
DOI:10.1145/3293882
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Debugging
  2. Failure Clustering

Qualifiers

  • Research-article

Conference

ISSTA '19

Acceptance Rates

Overall Acceptance Rate 58 of 213 submissions, 27%

Article Metrics

  • Downloads (Last 12 months): 28
  • Downloads (Last 6 weeks): 2
Reflects downloads up to 16 Feb 2025

Cited By

  • (2024) Enhancing Clustering Performance of Failed Test Cases during HIL Simulation: A Study on Deep Auto-Encoder Structures and Hyperparameter Tuning. Applied Sciences 14:19 (9064). DOI: 10.3390/app14199064. Online publication date: 8-Oct-2024
  • (2024) ReClues: Representing and indexing failures in parallel debugging with program variables. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering (1-13). DOI: 10.1145/3597503.3639098. Online publication date: 20-May-2024
  • (2024) Unsupervised Machine Learning Approaches for Test Suite Reduction. Applied Artificial Intelligence 38:1. DOI: 10.1080/08839514.2024.2322336. Online publication date: 4-Mar-2024
  • (2023) A Survey on Bug Deduplication and Triage Methods from Multiple Points of View. Applied Sciences 13:15 (8788). DOI: 10.3390/app13158788. Online publication date: 29-Jul-2023
  • (2023) Making Sense of Failure Logs in an Industrial DevOps Environment. ITNG 2023 20th International Conference on Information Technology-New Generations (217-226). DOI: 10.1007/978-3-031-28332-1_25. Online publication date: 21-Feb-2023
  • (2022) Evolving Ranking-Based Failure Proximities for Better Clustering in Fault Isolation. Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering (1-13). DOI: 10.1145/3551349.3556922. Online publication date: 10-Oct-2022
  • (2022) Automatically identifying shared root causes of test breakages in SAP HANA. Proceedings of the 44th International Conference on Software Engineering: Software Engineering in Practice (65-74). DOI: 10.1145/3510457.3513051. Online publication date: 21-May-2022
  • (2022) BuildSheriff. Proceedings of the 44th International Conference on Software Engineering (312-324). DOI: 10.1145/3510003.3510132. Online publication date: 21-May-2022
  • (2022) Automatically Identifying Shared Root Causes of Test Breakages in SAP HANA. 2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP) (65-74). DOI: 10.1109/ICSE-SEIP55303.2022.9793878. Online publication date: May-2022
  • (2022) A comprehensive empirical investigation on failure clustering in parallel debugging. Journal of Systems and Software 193 (111452). DOI: 10.1016/j.jss.2022.111452. Online publication date: Nov-2022
