On the possibilities of (pseudo-) software cloning from external interactions

Reformat, Marek; Chai, Xinwei; Miller, James

doi:10.1007/s00500-007-0215-6

Focus
Published: 30 June 2007

Volume 12, pages 29–49, (2008)

Soft Computing

Marek Reformat¹, Xinwei Chai¹ & James Miller¹


Abstract

This paper discusses some initial investigations into the application of genetic programming technology as a vehicle for re-examining some existing approaches within the software life-cycle. Specifically, it outlines a new direction in production techniques: software cloning from executable specifications or source code. It explores the possibility and advantages of producing a system from its external interactions. To allow this production to be automatic, the approach assumes that it can view (and potentially manipulate) the external interactions of the original system; hence it assumes the existence of either an executable specification or the source code, an object that assists in generating the external interactions. In other words, the original system is treated as a black box. Although the generation and application of software clones is relatively unexplored, it is believed that this is a fundamental technology that can have many different applications within a software engineering environment. For example, software clones could be used in complexity measurement, software testing and software fault tolerance. Clearly, for these clones to be usable, their production needs to be automated. An interesting approach to this automatic production or generation problem is the application of evolutionary-based Genetic Programming (GP). Using the paradigms of best fit, selection, crossover and mutation, a number of clones satisfying specific requirements can be automatically generated. In general, GP is a flexible and powerful algorithm suitable for solving a variety of different problems. This paper presents the results of studies conducted to answer questions related to the feasibility of using GP for clone generation: What features of GP are important? What works and what does not? How can the GP be “tuned” for the problem? The results have been used to draw a set of suggestions and conclusions that indicate the possible usability of a GP-based approach to the automatic generation of clones.
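
The abstract does not give the authors' GP configuration, so the following is only a minimal sketch of the general idea: a black-box system is observed through its input/output interactions, and a population of expression trees is evolved with tournament selection, subtree crossover and mutation until one reproduces those interactions. Everything in it is an assumption made for illustration: the toy `original_system`, the function and terminal sets, the error-sum fitness, the size cap `MAX_NODES`, and all parameter values.

```python
import operator
import random

# Hypothetical "original system": treated as a black box, only its
# input/output behaviour is used below.
def original_system(x):
    return x * x + x

FUNCTIONS = [(operator.add, "+"), (operator.sub, "-"), (operator.mul, "*")]
TERMINALS = ["x", 1, 2]
MAX_NODES = 31  # crude size cap to limit code growth (bloat)

def random_tree(rng, depth):
    """Grow a random expression tree: a terminal or (function, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    return (rng.choice(FUNCTIONS), random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def evaluate(tree, x):
    if tree == "x":
        return x
    if isinstance(tree, int):
        return tree
    (fn, _), left, right = tree
    return fn(evaluate(left, x), evaluate(right, x))

def fitness(tree, cases):
    """Total absolute error over the observed interactions (0 = behavioural clone)."""
    return sum(abs(evaluate(tree, x) - y) for x, y in cases)

def paths(tree, prefix=()):
    """Yield the index path of every node in the tree."""
    yield prefix
    if isinstance(tree, tuple):
        yield from paths(tree[1], prefix + (1,))
        yield from paths(tree[2], prefix + (2,))

def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return new
    node = list(tree)
    node[path[0]] = replace(tree[path[0]], path[1:], new)
    return tuple(node)

def capped(child, parent):
    return child if len(list(paths(child))) <= MAX_NODES else parent

def crossover(rng, a, b):
    """Subtree crossover: graft a random subtree of b into a random point of a."""
    pa, pb = rng.choice(list(paths(a))), rng.choice(list(paths(b)))
    return capped(replace(a, pa, get(b, pb)), a)

def mutate(rng, tree):
    """Replace a random subtree with a freshly generated one."""
    return capped(replace(tree, rng.choice(list(paths(tree))), random_tree(rng, 2)), tree)

def tournament(rng, scored, k=3):
    return min(rng.sample(scored, k), key=lambda s: s[0])[1]

def evolve(cases, seed=0, pop_size=200, generations=40):
    """Return (error, tree) for the best individual seen."""
    rng = random.Random(seed)
    population = [random_tree(rng, 3) for _ in range(pop_size)]
    best = None
    for _ in range(generations):
        scored = [(fitness(t, cases), t) for t in population]
        gen_best = min(scored, key=lambda s: s[0])
        if best is None or gen_best[0] < best[0]:
            best = gen_best
        if best[0] == 0:
            break
        population = [best[1]]  # elitism: carry the best individual over
        while len(population) < pop_size:
            child = crossover(rng, tournament(rng, scored), tournament(rng, scored))
            if rng.random() < 0.2:
                child = mutate(rng, child)
            population.append(child)
    return best

def show(tree):
    if not isinstance(tree, tuple):
        return str(tree)
    (_, symbol), left, right = tree
    return f"({show(left)} {symbol} {show(right)})"

if __name__ == "__main__":
    # The external interactions: sampled inputs and the outputs they produce.
    cases = [(x, original_system(x)) for x in range(-5, 6)]
    error, clone = evolve(cases)
    print(error, show(clone))
```

A zero error here only means the evolved tree agrees with the original on the sampled interactions; it says nothing about inputs that were never observed, which is one reason such a result is better described as a pseudo-clone.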

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, Software Technology, Engineering and Measurement Research Center (STEAM), University of Alberta, St. Albert, AB, Canada, T6G 2G7
Marek Reformat, Xinwei Chai & James Miller

Corresponding author

Correspondence to Marek Reformat.

About this article

Cite this article

Reformat, M., Chai, X. & Miller, J. On the possibilities of (pseudo-) software cloning from external interactions. Soft Comput 12, 29–49 (2008). https://doi.org/10.1007/s00500-007-0215-6

Published: 30 June 2007
Issue Date: January 2008
DOI: https://doi.org/10.1007/s00500-007-0215-6
