Introduction to Code Clone Analysis


Abstract

A code clone is a code snippet that has an identical or similar counterpart in the same or a different software system. The existence of code clones is a concern for software maintenance and a clue to understanding the structure and evolution of software systems. A large body of research on code clones has been carried out, and many tools for code clone analysis have been developed. In this chapter, we explain terms that are important for understanding code clones, such as their definition, types, analysis granularity, and analysis domain. We also outline the approaches to and applications of code clone analysis.
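To make the definition concrete, the following minimal sketch checks whether two snippets are "similar" in one narrow sense: they become identical once identifiers and numeric literals are replaced by placeholders (pairs of this kind are often called Type-2 clones in the literature). The function names, the keyword set, and the sample snippets are assumptions made for illustration; this is not the method of any tool discussed in the chapter, and practical detectors use far more robust tokenization and matching.

```python
import re

# Hypothetical, deliberately tiny keyword set for the illustration.
KEYWORDS = {"def", "return", "for", "in", "if", "else", "while"}


def normalize(snippet):
    """Split a snippet into tokens and abstract away names and numbers."""
    # Identifiers/keywords, integer literals, or any single symbol character.
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|[^\sA-Za-z_\d]", snippet)
    normalized = []
    for tok in tokens:
        if tok in KEYWORDS:
            normalized.append(tok)      # keep keywords as they are
        elif re.fullmatch(r"[A-Za-z_]\w*", tok):
            normalized.append("ID")     # every identifier -> one placeholder
        elif tok.isdigit():
            normalized.append("NUM")    # every integer literal -> one placeholder
        else:
            normalized.append(tok)      # operators and punctuation unchanged
    return normalized


def is_clone_pair(a, b):
    """Two snippets count as clones here if their normalized tokens match."""
    return normalize(a) == normalize(b)


snippet_a = "total = 0\nfor x in items:\n    total = total + x"
snippet_b = "sum = 0\nfor v in values:\n    sum = sum + v"
snippet_c = "for x in items:\n    print(x)"

print(is_clone_pair(snippet_a, snippet_b))  # textually different, same structure
print(is_clone_pair(snippet_a, snippet_c))  # different structure
```

Because whitespace is discarded and all identifiers map to the same placeholder, the sketch ignores layout and renaming but cannot recognize snippets with added, removed, or reordered statements.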

Acknowledgements

We are grateful for the useful comments by Norihiro Yoshida and Eunjong Choi. This work is partially supported by JSPS KAKENHI Grant Number 18H04094.

Corresponding author

Correspondence to Katsuro Inoue.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Inoue, K. (2021). Introduction to Code Clone Analysis. In: Inoue, K., Roy, C.K. (eds) Code Clone Analysis. Springer, Singapore. https://doi.org/10.1007/978-981-16-1927-4_1

  • DOI: https://doi.org/10.1007/978-981-16-1927-4_1

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-16-1926-7

  • Online ISBN: 978-981-16-1927-4

  • eBook Packages: Computer Science (R0)
