research-article

Clone detection in software source code using operational similarity of statements

Published: 04 June 2014

Abstract

This paper presents a technique for detecting clones in source code by comparing the operations performed by the statements that make up a function. The key idea is that two functions are considered clones if their statements perform the same operations to a certain extent. To determine this, the available statement types are first categorized by the operation they perform (for instance, addition, multiplication, or function invocation). A category is then assigned to every statement in every function in the source code, and functions are compared by comparing the categories of their statements. If one function contains exactly the same statement categories as another (the same operations are performed in both functions), or contains a subset of the other's statement categories (the operations performed in one function are a subset of those performed in the other), the two functions are judged to be clones.
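The comparison described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' tool: it analyzes Python source with the standard `ast` module, and the category scheme (the operator name for a binary operation, `Call` for a function invocation, otherwise the statement type) as well as the names `categorize`, `function_categories`, `is_clone`, and `find_clones` are hypothetical choices made for this example.

```python
import ast

def categorize(stmt: ast.stmt) -> str:
    """Assign one category to a statement (assumed scheme, for illustration only)."""
    value = getattr(stmt, "value", None)      # expression carried by Assign/Return/Expr
    if isinstance(value, ast.BinOp):
        return type(value.op).__name__        # e.g. "Add", "Mult"
    if isinstance(value, ast.Call):
        return "Call"                         # function invocation
    return type(stmt).__name__                # fall back to the statement type

def function_categories(source: str) -> dict[str, set[str]]:
    """Map each top-level function name to the set of its statement categories."""
    tree = ast.parse(source)
    return {
        node.name: {categorize(stmt) for stmt in node.body}
        for node in tree.body
        if isinstance(node, ast.FunctionDef)
    }

def is_clone(a: set[str], b: set[str]) -> bool:
    """Two functions are judged clones if either category set is a subset of the other."""
    return bool(a) and bool(b) and (a <= b or b <= a)

def find_clones(source: str) -> list[tuple[str, str]]:
    """Compare every pair of functions and return the pairs judged to be clones."""
    cats = function_categories(source)
    names = sorted(cats)
    return [
        (x, y)
        for i, x in enumerate(names)
        for y in names[i + 1:]
        if is_clone(cats[x], cats[y])
    ]

SAMPLE = '''
def area(w, h):
    return w * h

def padded_area(w, h, pad):
    inner = w * h
    return inner + pad

def greet(name):
    print(name)
'''

if __name__ == "__main__":
    print(find_clones(SAMPLE))
```

In this sample, `area` has the category set {Mult} and `padded_area` has {Mult, Add}, so the first is a subset of the second and the pair is reported. `greet` has only {Call} and matches neither. Set equality is covered by the same subset test.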


Published in

ACM SIGSOFT Software Engineering Notes, Volume 39, Issue 3
May 2014
73 pages
ISSN: 0163-5948
DOI: 10.1145/2597716

Copyright © 2014 Authors

Publisher

Association for Computing Machinery
New York, NY, United States
