Generation of refactoring algorithms by grammatical evolution

Empirical Software Engineering

Abstract

Recent machine learning studies report accurate prediction models that identify refactoring operations for a program. However, such works are limited to prediction: they learn refactoring operations strictly as applied by developers, and so miss possibilities that developers might not think of. On the other hand, the Search-Based Software Refactoring (SBR) field applies search algorithms to find refactoring operations in a vast space of possibilities to improve diverse quality attributes. Nevertheless, existing SBR approaches do not generate a model as machine learning studies do, and therefore they need to be reapplied individually to each program that needs refactoring. To mitigate this limitation, this work introduces a novel SBR learning approach that generates refactoring algorithms capable of providing refactoring operations for several programs. These algorithms are composed of procedures that use rules to determine the refactoring operations. To create the algorithms, a learning process first extracts refactoring patterns from programs by grouping their elements that were refactored in similar ways. After that, Grammatical Evolution (GE) is applied to generate the algorithms based on a grammar encompassing details of the extracted patterns. GE works to generate an algorithm that provides refactoring operations similar to those applied in practice while improving quality attributes, such as modularity. The approach is evaluated using refactoring data from 40 Java programs from GitHub repositories. The algorithms are tested against different programs, obtaining an overall average of 60% modularity improvement and 50% similarity with actual refactoring operations.
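
To make the role of the grammar more concrete, the following is a minimal sketch of the standard genotype-to-phenotype mapping used in Grammatical Evolution, where each integer codon selects a production by taking it modulo the number of alternatives. The grammar, the metric names, the thresholds, and the function `map_genotype` are hypothetical illustrations and are not taken from the article; the grammar actually used by the authors is derived from the extracted refactoring patterns and is not reproduced here.

```python
# Toy grammar for illustration only. All symbols and values are assumptions,
# not the grammar described in the article.
GRAMMAR = {
    "<rule>": [["if ", "<cond>", " then ", "<op>"]],
    "<cond>": [["<metric>", " > ", "<threshold>"],
               ["<cond>", " and ", "<cond>"]],
    "<metric>": [["coupling"], ["cohesion"], ["size"]],
    "<threshold>": [["low"], ["medium"], ["high"]],
    "<op>": [["MoveMethod"], ["ExtractClass"], ["MoveField"]],
}


def map_genotype(codons, start="<rule>", max_wraps=2):
    """Map a list of integer codons to a rule string.

    The leftmost non-terminal is expanded first. A codon is consumed only
    when the non-terminal has more than one production, and the chosen
    production is codon % number_of_productions. The codon list may be
    reused (wrapped) up to max_wraps times; if the derivation still has
    pending choices after that, the individual is invalid and None is
    returned.
    """
    symbols = [start]
    output = []
    used = 0
    limit = len(codons) * (max_wraps + 1)
    while symbols:
        symbol = symbols.pop(0)
        if symbol not in GRAMMAR:
            output.append(symbol)  # terminal: emit as-is
            continue
        options = GRAMMAR[symbol]
        if len(options) == 1:
            choice = options[0]
        else:
            if used >= limit:
                return None  # ran out of codons, even with wrapping
            codon = codons[used % len(codons)]
            choice = options[codon % len(options)]
            used += 1
        symbols = list(choice) + symbols
    return "".join(output)


if __name__ == "__main__":
    print(map_genotype([0, 1, 2, 4]))
```

In a complete GE setup, an evolutionary algorithm would evolve the codon lists and a fitness function would score each decoded rule; in the article's setting that score would reflect quality improvement and similarity with refactoring operations applied in practice, which this sketch does not attempt to model.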

Acknowledgements

The authors would like to thank CAPES for supporting Thainá Mariani through the PDSE program, associated with process 88881.135198/2016-01. Silvia R. Vergilio is supported by CNPq (Grant: 305968/2018).

Author information

Corresponding author

Correspondence to Thainá Mariani.

Ethics declarations

Conflict of Interests

We declare that we have no conflict of interests.

Additional information

Communicated by: Aldeida Aleti, Annibale Panichella and Shin Yoo

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection: Advances in Search-Based Software Engineering (SSBSE)

About this article

Cite this article

Mariani, T., Kessentini, M. & Vergilio, S.R. Generation of refactoring algorithms by grammatical evolution. Empir Software Eng 27, 110 (2022). https://doi.org/10.1007/s10664-022-10151-4

  • DOI: https://doi.org/10.1007/s10664-022-10151-4
