Abstract
The literature presents conflicting claims regarding the effects of clones on software maintainability. For a community to progress, it is important to identify and address those areas of disagreement. Many claims, such as those related to developer behavior, either lack human-based empirical validation or are contradicted by other studies. This paper describes the results of two surveys to evaluate the level of agreement among clone researchers regarding claims that have not yet been validated through human-based empirical study. The surveys covered three key clone-related research topics: general information, developer behavior, and evolution. Survey 1 focused on high-level information about all three topics, whereas Survey 2 focused specifically on developer behavior. Approximately 20 clone researchers responded to each survey. The survey responses showed a lack of agreement on some major clone-related topics. First, the respondents disagreed about the definitions of clone types, with some indicating the need for a taxonomy based upon developer intent. Second, the respondents were uncertain whether the ratio of cloned to non-cloned code affected system quality. Finally, the respondents disagreed about the usefulness of various detection, analysis, evolution, and visualization tools for clone management tasks such as tracking and refactoring of clones. The overall results indicate the need for more focused, human-based empirical research regarding the effects of clones during maintenance. The paper proposes a strategy for future research regarding developer behavior and code clones in order to bridge the gap between clone research and the application of that research in clone maintenance.
Notes
Clone-aware tools provide information about clones residing in a software system, for example, detection results, evolution information, or tracking information.
Acknowledgments
We thank the survey respondents. We acknowledge support from NSF grant CCF-0915559.
Additional information
Communicated by: Massimiliano Di Penta
Cite this article
Chatterji, D., Carver, J.C. & Kraft, N.A. Code clones and developer behavior: results of two surveys of the clone research community. Empir Software Eng 21, 1476–1508 (2016). https://doi.org/10.1007/s10664-015-9394-4
DOI: https://doi.org/10.1007/s10664-015-9394-4