ABSTRACT
Modern software development processes recommend that changes be integrated into the main development line of a project multiple times a day. Before a new revision may be integrated, developers practice regression testing to ensure that the latest changes do not break any previously established functionality. The cost of regression testing is high because both the number of revisions introduced per day and the number of tests developers write per revision keep increasing. Regression test selection (RTS) optimizes regression testing by skipping tests that are not affected by recent project changes. Existing dynamic RTS techniques support only projects written in a single programming language, which is a limitation given that the average open-source project is written in several programming languages.
We present the first dynamic RTS technique that does not stop at predefined language boundaries. Our technique dynamically detects, at the operating system level, all file artifacts a test depends on. It is therefore oblivious to the specific means a test uses to access files: spawning a new process, invoking a system call, invoking a library written in a different language, invoking a library that spawns a process which makes a system call, etc. We also provide a set of extension points that allow for smooth integration with testing frameworks and build systems. We implemented our technique as a loadable Linux kernel module in a tool called RTSLinux and evaluated it on 21 Java projects that escape the JVM by spawning new processes or invoking native code, totaling 2,050,791 lines of code. Our results show that RTSLinux, on average, skips 74.17% of tests and saves 52.83% of test execution time compared to executing all tests.
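To make the selection step concrete, the following is a minimal sketch of file-level test selection, not the RTSLinux implementation. It assumes that the set of files each test accessed has already been collected by some tracer, and that a change is detected by comparing SHA-256 checksums; the abstract does not state how RTSLinux compares file versions. The names `checksum`, `snapshot`, and `select_tests` are hypothetical.

```python
import hashlib
from pathlib import Path


def checksum(path):
    """Return the SHA-256 hex digest of a file, or None if the file is missing."""
    p = Path(path)
    if not p.is_file():
        return None
    return hashlib.sha256(p.read_bytes()).hexdigest()


def snapshot(paths):
    """Record the current checksum of every file a test was observed to access.

    Assumption: `paths` comes from an external tracer; this sketch does not
    perform any OS-level tracing itself.
    """
    return {str(p): checksum(p) for p in paths}


def select_tests(recorded):
    """Return the tests that must be rerun.

    `recorded` maps a test name to the {path: checksum} snapshot taken after
    its last run; a value of None marks a test that has never been run.
    A test is selected if it is new or if any file it depended on was
    modified, deleted, or created since the snapshot.
    """
    selected = []
    for test, deps in recorded.items():
        if deps is None or any(checksum(p) != old for p, old in deps.items()):
            selected.append(test)
    return selected
```

Because dependencies are plain file paths, the same comparison applies whether a file was read by Java code, by native code, or by a spawned process. Tests that do not appear in the result of `select_tests` can be skipped for the current revision.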
ESEC/FSE’17, September 4–8, 2017, Paderborn, Germany. Ahmet Celik, Marko Vasic, Aleksandar Milicevic, and Milos Gligoric.
Index Terms
- Regression test selection across JVM boundaries