
IDE-based real-time focused search for near-miss clones

Published: 26 March 2012 Publication History

Abstract

Code clones are a well-known code smell that needs to be detected and managed during the software development process. However, existing clone detectors have one or more of three shortcomings: (a) they are limited in detecting Type-3 clones, (b) they come as stand-alone tools separate from the IDE and thus cannot support clone-aware development, and (c) they overwhelm the developer with all clones from the entire code base instead of offering a focused search for clones of a selected code segment of interest to the developer.
This paper presents our IDE-integrated clone search tool, which addresses all of the above issues. For clone detection, we adapt a suffix-tree-based hybrid algorithm. Through an asymptotic analysis, we show that our approach to clone detection is efficient in both time and memory. Moreover, using three separate empirical studies, we demonstrate that our tool can be used flexibly to search for exact (Type-1) and near-miss (Type-2 and Type-3) clones with high precision and recall.
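To make the idea of a focused search concrete, the following minimal Python sketch looks for clones of one selected code segment rather than reporting every clone in a code base. It is not the paper's suffix-tree-based hybrid algorithm: as a stand-in it normalizes identifiers and literals (so renamed, Type-2 copies match) and then slides a window over the corpus, scoring each window against the selection with `difflib.SequenceMatcher` (so copies with an added or changed line, Type-3, can still pass a threshold). The names `normalize`, `focused_search`, `SELECTION`, `CORPUS`, and the 0.7 threshold are assumptions chosen for illustration.

```python
import difflib
import keyword
import re

# Identifiers/keywords, integer runs, or any single non-space character.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def normalize(line):
    """Map identifiers to ID and numbers to NUM; keep keywords and punctuation."""
    out = []
    for tok in TOKEN.findall(line):
        if tok[0].isalpha() or tok[0] == "_":
            out.append(tok if keyword.iskeyword(tok) else "ID")
        elif tok[0].isdigit():
            out.append("NUM")
        else:
            out.append(tok)
    return " ".join(out)


def focused_search(selection, corpus, threshold=0.7):
    """Return non-overlapping (start_index, similarity) hits for the selection."""
    query = [normalize(line) for line in selection if line.strip()]
    lines = [normalize(line) for line in corpus]
    n = len(query)
    if n == 0:
        return []
    hits = []
    for start in range(len(lines) - n + 1):
        window = lines[start:start + n]
        matcher = difflib.SequenceMatcher(None, query, window, autojunk=False)
        ratio = matcher.ratio()  # fraction of normalized lines matched in order
        if ratio >= threshold:
            hits.append((start, ratio))
    # Keep the best-scoring window among overlapping candidates.
    kept = []
    for start, ratio in sorted(hits, key=lambda h: (-h[1], h[0])):
        if all(abs(start - other) >= n for other, _ in kept):
            kept.append((start, ratio))
    return sorted(kept)


# Hypothetical data: the developer's selection and a tiny "code base".
SELECTION = [
    "total = 0",
    "for x in items:",
    "    total += x",
    "return total",
]
CORPUS = [
    "def a(items):",
    "    total = 0",
    "    for x in items:",
    "        total += x",
    "    return total",
    "def b(values):",
    "    acc = 0",
    "    for v in values:",
    "        print(v)",
    "        acc += v",
    "    return acc",
]

if __name__ == "__main__":
    for start, ratio in focused_search(SELECTION, CORPUS):
        print(f"candidate clone at corpus line index {start}, similarity {ratio:.2f}")
```

This sliding-window comparison costs roughly the selection length times the corpus length per query, so it illustrates only the search behavior, not the efficiency that the abstract attributes to the suffix-tree-based approach.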


Published In

SAC '12: Proceedings of the 27th Annual ACM Symposium on Applied Computing
March 2012
2179 pages
ISBN: 9781450308571
DOI: 10.1145/2245276
Conference Chairs: Sascha Ossowski, Paola Lecca
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. clone detection
  2. clone search
  3. maintenance
  4. reengineering

Qualifiers

  • Research-article

Conference

SAC 2012: ACM Symposium on Applied Computing
March 26 - 30, 2012
Trento, Italy

Acceptance Rates

SAC '12 Paper Acceptance Rate 270 of 1,056 submissions, 26%;
Overall Acceptance Rate 1,650 of 6,669 submissions, 25%
