On the possibilities of (pseudo-) software cloning from external interactions

  • Focus
  • Soft Computing

Abstract

This paper discusses initial investigations into the application of genetic programming as a vehicle for re-examining existing approaches within the software life-cycle. Specifically, it outlines a new direction in production techniques: software cloning from executable specifications or source code. It explores the possibility and advantages of producing a system from its external interactions. For this production to be automatic, the approach assumes that it can view (and potentially manipulate) the external interactions of the original system. It therefore assumes the existence of either an executable specification or the source code, that is, an object that assists in generating the external interactions; the original system itself is treated as a black box. Although the generation and application of software clones are relatively unexplored, this is believed to be a fundamental technology with many potential applications within a software engineering environment. For example, software clones could be used in complexity measurement, software testing and software fault tolerance. Clearly, for these clones to be usable, their production needs to be automated. An interesting approach to this automatic generation problem is the application of evolutionary-based Genetic Programming (GP). Using the paradigms of best fit, selection, crossover and mutation, a number of clones satisfying specific requirements can be generated automatically. In general, GP is a flexible and powerful algorithm suitable for solving a variety of problems. This paper presents the results of studies conducted to answer questions about the feasibility of using GP for clone generation: Which features of GP are important? What works and what does not? How can the GP be “tuned” for the problem? The results are used to draw a set of suggestions and conclusions that indicate the possible usability of a GP-based approach to the automatic generation of clones.
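
The idea of evolving a clone purely from observed input/output behaviour can be illustrated with a minimal sketch. It is not the authors' experimental setup: the black box `original_system`, the function set, the terminal set and every parameter value (population size, tournament size, mutation rate, node limit) are assumptions chosen only to keep the example small. The sketch uses tournament selection, subtree crossover, subtree mutation and elitism, with fitness measured as the total output error over the recorded interactions.

```python
import operator
import random

# Function and terminal sets for the evolved expression trees (illustrative choices).
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}
TERMINALS = ["x", 0, 1, 2]

def original_system(x):
    """Hypothetical black box; only its input/output behaviour is observed."""
    return x * x + x

def random_tree(rng, depth):
    """Grow a random tree: a terminal, or a tuple (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    return (rng.choice(list(OPS)), random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def evaluate(tree, x):
    """Compute the output of a candidate clone for input x."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, x), evaluate(right, x))
    return x if tree == "x" else tree  # variable or numeric constant

def fitness(tree, cases):
    """Total absolute error over the observed interactions (0 = behavioural match)."""
    return sum(abs(evaluate(tree, x) - y) for x, y in cases)

def paths(tree, path=()):
    """Yield the path (sequence of child indices) to every node of the tree."""
    yield path
    if isinstance(tree, tuple):
        yield from paths(tree[1], path + (1,))
        yield from paths(tree[2], path + (2,))

def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    parts = list(tree)
    parts[path[0]] = replace(tree[path[0]], path[1:], new)
    return tuple(parts)

def crossover(rng, a, b):
    """Replace a random subtree of a with a random subtree of b."""
    return replace(a, rng.choice(list(paths(a))), get(b, rng.choice(list(paths(b)))))

def mutate(rng, tree):
    """Replace a random subtree with a freshly grown one."""
    return replace(tree, rng.choice(list(paths(tree))), random_tree(rng, 2))

def tournament(rng, pop, cases, k=3):
    """Select the fittest of k randomly drawn individuals."""
    return min(rng.sample(pop, k), key=lambda t: fitness(t, cases))

def evolve(cases, seed=0, pop_size=60, generations=30, max_nodes=40):
    rng = random.Random(seed)
    pop = [random_tree(rng, 3) for _ in range(pop_size)]
    best = min(pop, key=lambda t: fitness(t, cases))
    for _ in range(generations):
        if fitness(best, cases) == 0:
            break  # behaviour matches on every recorded interaction
        nxt = [best]  # elitism: the best individual always survives
        while len(nxt) < pop_size:
            parent = tournament(rng, pop, cases)
            child = crossover(rng, parent, tournament(rng, pop, cases))
            if rng.random() < 0.2:
                child = mutate(rng, child)
            # Crude bloat control: oversized children are replaced by their parent.
            small = sum(1 for _ in paths(child)) <= max_nodes
            nxt.append(child if small else parent)
        pop = nxt
        best = min(pop, key=lambda t: fitness(t, cases))
    return best

if __name__ == "__main__":
    # Record the external interactions of the original system, then evolve a clone.
    cases = [(x, original_system(x)) for x in range(-5, 6)]
    clone = evolve(cases)
    print(clone, fitness(clone, cases))
```

A fitness of 0 only means that the clone agrees with the original on the recorded inputs; it says nothing about inputs that were never observed, which is one reason the choice of interactions matters for any approach of this kind.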

Author information

Correspondence to Marek Reformat.

About this article

Cite this article

Reformat, M., Chai, X. & Miller, J. On the possibilities of (pseudo-) software cloning from external interactions. Soft Comput 12, 29–49 (2008). https://doi.org/10.1007/s00500-007-0215-6

  • DOI: https://doi.org/10.1007/s00500-007-0215-6
