ABSTRACT
Test code maintainability is a common concern in software testing. To be maintainable, test methods should be clearly structured, well named, and small, and, above all, they should avoid duplicated code. Several strategies exist to avoid test code duplication, such as implicit setup and delegated setup. Before applying these strategies, however, the duplicate code must first be identified, which can be a time-consuming task. To address this problem, we automate the identification of duplicate test code by applying code similarity metrics. We propose a novel similarity metric, called Longest Common Contiguous Start Sub-Sequence (LCCSS), that measures the similarity between pairs of tests in order to identify refactoring candidates. The most similar pairs are reported as strong candidates for refactoring through the implicit setup strategy. We also develop a framework, called Róża, that can use different similarity metrics to identify test code duplication. An experiment shows that both LCCSS and Simian, a clone detection tool, identified pairs of tests to be refactored through the implicit setup strategy with maximum precision at all eleven standard recall levels. Unlike Simian, however, LCCSS does not need to be calibrated for each project.
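The abstract names the metric but does not spell out how it is computed. Reading the name literally, one plausible interpretation is the number of leading statements that two tests share before they first differ. The sketch below illustrates that reading only; it is not the authors' implementation and not part of Róża. The function names `lccss` and `rank_pairs`, the representation of a test as a list of normalized statement strings, and the sample tests are all assumptions made for illustration.

```python
from itertools import combinations
from typing import Dict, List, Sequence, Tuple


def lccss(a: Sequence[str], b: Sequence[str]) -> int:
    """Count the statements two tests share from the start until they diverge.

    Assumption: each test is already reduced to a sequence of normalized
    statements; the abstract does not say how statements are compared.
    """
    length = 0
    for x, y in zip(a, b):
        if x != y:
            break
        length += 1
    return length


def rank_pairs(tests: Dict[str, List[str]]) -> List[Tuple[str, str, int]]:
    """Score every pair of tests and list the most similar pairs first."""
    scored = [
        (n1, n2, lccss(tests[n1], tests[n2]))
        for n1, n2 in combinations(sorted(tests), 2)
    ]
    return sorted(scored, key=lambda item: item[2], reverse=True)


# Hypothetical tests, each written as a list of statements.
tests = {
    "test_deposit": [
        "acct = Account()",
        "acct.open()",
        "acct.deposit(10)",
        "assert acct.balance == 10",
    ],
    "test_withdraw": [
        "acct = Account()",
        "acct.open()",
        "acct.withdraw(5)",
        "assert acct.balance == -5",
    ],
    "test_empty": [
        "acct = Account()",
        "assert acct.balance == 0",
    ],
}

ranking = rank_pairs(tests)
for first, second, score in ranking:
    print(first, second, score)
```

Under this reading, the top-ranked pair shares its two opening statements, which makes those statements the natural candidates to move into a common setup method when applying the implicit setup strategy. Because the score depends only on the tests themselves, the sketch has no project-specific threshold to tune.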