DOI: 10.1145/3425269.3425283

LCCSS: A Similarity Metric for Identifying Similar Test Code

Published: 30 October 2020

ABSTRACT

Test code maintainability is a common concern in software testing. To achieve good maintainability, test methods should be clearly structured, well named, and small, and, above all, test code duplication should be avoided. Several strategies exist to remove test code duplication, such as implicit setup and delegated setup. Before these strategies can be applied, however, the duplicated code must first be identified, which can be a time-consuming task. To address this problem, we automate the identification of duplicate test code through the application of code similarity metrics. We propose a novel similarity metric, called Longest Common Contiguous Start Sub-Sequence (LCCSS), that measures the similarity between pairs of tests; the most similar pairs are reported as strong candidates for refactoring through the implicit setup strategy. We also develop a framework, called Róża, that can use different similarity metrics to identify test code duplication. In an experiment, both LCCSS and Simian, a clone detection tool, identified pairs of tests to be refactored through the implicit setup strategy with maximum precision at all eleven standard recall levels. Unlike Simian, however, LCCSS does not need to be calibrated for each project.
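
The abstract does not spell out how LCCSS is computed, but the name suggests the length of the contiguous run of statements that two tests share starting from their very first statement, i.e., their common statement prefix. The sketch below is a minimal illustration under that assumption only; the function name, the statement-list representation, and the toy tests are hypothetical and are not taken from the paper.

```python
# Hedged sketch of an LCCSS-style comparison (illustrative, not the paper's
# implementation). Assumes each test method is represented as an ordered list
# of normalized statement strings.

def lccss(test_a, test_b):
    """Length of the longest common contiguous sub-sequence that starts at
    the first statement of both tests (their shared statement prefix)."""
    length = 0
    for stmt_a, stmt_b in zip(test_a, test_b):
        if stmt_a != stmt_b:
            break
        length += 1
    return length

# Toy example: two tests that share the same three setup statements.
test_1 = [
    'Account account = new Account();',
    'account.deposit(100);',
    'account.setOwner("alice");',
    'assertEquals(100, account.balance());',
]
test_2 = [
    'Account account = new Account();',
    'account.deposit(100);',
    'account.setOwner("alice");',
    'assertTrue(account.isActive());',
]

print(lccss(test_1, test_2))  # -> 3
```

Under this reading, a pair with a high LCCSS value shares a long setup prefix, which is precisely the code the implicit setup strategy would extract into a shared fixture method such as JUnit's setUp.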


Published in

SBCARS '20: Proceedings of the 14th Brazilian Symposium on Software Components, Architectures, and Reuse
October 2020, 172 pages
ISBN: 9781450387545
DOI: 10.1145/3425269

Copyright © 2020 ACM. Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery, New York, NY, United States



Qualifiers

research-article, refereed limited

Acceptance Rates

Overall acceptance rate: 23 of 79 submissions, 29%
