A fuzzy genetic automatic refactoring approach to improve software maintainability and flexibility

  • Methodologies and Application
Soft Computing

Abstract

The creation of high-quality software is of great importance in current enterprise systems. High-quality software should exhibit certain features, including flexibility, maintainability, and a well-designed structure. Correctly adhering to object-oriented principles is a primary way to make code more flexible. Developers usually try to apply these principles but often neglect them because of a lack of time and the extra costs involved. As a result, they sometimes create confusing, complex, and problematic structures in code, known as code smells. Code smells are specific, well-known anti-patterns that, once identified, can be corrected with refactoring techniques. This process can be performed either manually by developers or automatically. In this paper, an automated method for identifying and refactoring a series of code smells in Java programs is introduced. The primary mechanism used for this automated refactoring is a fuzzy genetic method. In addition, a graph model is used as the core representation scheme, along with measures such as betweenness, load, in-degree, out-degree, and closeness centrality, to identify the code smells in the programs. The fuzzy approach is then combined with the genetic algorithm to refactor the code using the graph-related features. The proposed method is evaluated using Freemind, Jag, JGraph, and JUnit as sample projects, and the results are compared against the Fontana dataset, which contains results from IPlasma, FluidTool, Anti-patternScanner, PMD, and Marinescu. It is shown that the proposed approach can identify on average 68.92% of the bad classes in agreement with the Fontana dataset and can also refactor 77% of the classes correctly with respect to the coupling measures. This is a noteworthy result among existing refactoring mechanisms and among studies that consider both the identification and the refactoring of bad smells.

Funding

This study has received no funding from any organization.

Author information

Corresponding author

Correspondence to Mehrdad Ashtiani.

Ethics declarations

Conflict of interest

All of the authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Communicated by V. Loia.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A—time complexity

In Sect. 4.2, the authors mentioned that the use of an evolutionary algorithm can potentially reduce the performance of the process. Execution time was not the focus of the current study but can be optimized in future work. More detailed information about the performance of the proposed work is given below:

  1.

    Performance of the network measures calculation. Calculating the network measures involves constructing the graph and computing the shortest paths between all pairs of vertices in the graph. The process of constructing the graph and finding the weights of the edges is shown in Algorithm 1. Based on the iterations in the process, the time complexity of Algorithm 1 is \( O(n^{2}) \), where \( n \) is the number of classes in the project. With respect to the formulation given in Sect. 2.2, finding all the shortest paths in the graph needs \( O(n^{3}) \) time. This is a preprocessing procedure that is needed to create the network.

  2.

    Performance of the refactoring process. The refactoring process works differently for each code smell. The main difference is the priority given to the relations whose weight is to be reduced. For example, in shotgun surgery and feature envy, relations with a higher weight also have a higher priority to be reduced or removed. This is different for lazy class and god class: for refactoring these bad smells, relations with lower weights have a higher priority. For the lazy class and god class smells, the linguistic term “very high” from the fuzzy/genetic system causes the deletion of all the relations from a class and, finally, of the class itself. Another difference between the refactoring processes for the bad smells is the direction of the relations. For classes with feature envy or god class, relations “from the class” are considered. For classes with shotgun surgery, relations “to the class” are considered, and for lazy class, relations in both directions are considered. Algorithm 3 shows the process of removing a relation in feature envy. The other bad smells follow a similar scenario. In this algorithm, relations that are the only connections between packages are not removed. In other words, if a class relation is also the only package relation between two packages, this relation is kept.

As mentioned above, the refactoring of every type of bad smell in this paper follows a scenario similar to the one given in Algorithm 3. Therefore, they all have the same time complexity. It can be deduced from the algorithm that the time complexity of refactoring a relation is \( O(l + wn + m + n\log n) \), where \( l \) is the total number of relations, \( n \) is the total number of relations for a class, \( w \) is the weight of each relation, and \( m \) is the number of packages in the project. This value can be calculated by considering the loops in the algorithm.
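The selection rules described above can be illustrated with a minimal Python sketch. It is not Algorithm 3 itself: the names `Relation`, `SMELL_RULES`, and `candidate_relations` are hypothetical, and treating "the only connection between packages" as the only relation for an ordered pair of distinct packages is an assumption made for illustration. The sketch only orders the candidate relations of a smelly class; it does not perform any refactoring.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    src: str      # class the relation starts from
    dst: str      # class the relation points to
    weight: int   # strength of the relation


# Direction and weight priority per smell, following the description in the text.
SMELL_RULES = {
    "feature_envy":    {"directions": {"out"},       "heaviest_first": True},
    "god_class":       {"directions": {"out"},       "heaviest_first": False},
    "shotgun_surgery": {"directions": {"in"},        "heaviest_first": True},
    "lazy_class":      {"directions": {"in", "out"}, "heaviest_first": False},
}


def candidate_relations(cls, smell, relations, package_of):
    """Return the relations of `cls` that may be reduced, in priority order."""
    rule = SMELL_RULES[smell]

    # Count class-level relations for each ordered pair of packages (assumption).
    links = {}
    for r in relations:
        key = (package_of[r.src], package_of[r.dst])
        links[key] = links.get(key, 0) + 1

    picked = []
    for r in relations:
        outgoing = r.src == cls and "out" in rule["directions"]
        incoming = r.dst == cls and "in" in rule["directions"]
        if not (outgoing or incoming):
            continue
        key = (package_of[r.src], package_of[r.dst])
        if key[0] != key[1] and links[key] == 1:
            continue  # sole connection between two packages: keep it
        picked.append(r)

    return sorted(picked, key=lambda r: r.weight, reverse=rule["heaviest_first"])
```

One pass over all relations plus a sort of the relations of a single class gives \( O(l + n\log n) \) for this simplified selection step, which corresponds to two of the terms in the complexity stated above.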

  3.

    The execution time of the algorithm. Evolutionary algorithms, especially the genetic algorithm, usually require a long execution time. Table 9 gives the execution time of the proposed work for each system compared with other approaches. The execution time depends mostly on the generalization relations that have bad smells but cannot be refactored. As described in the Discussion section, only the association relations are considered in the refactoring process. Therefore, to prevent the fuzzy genetic system from getting stuck, a time limit is imposed on the identification and refactoring process. This is one of the reasons why some classes have longer identification and refactoring times.
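As an illustration of the preprocessing step in item 1, the following Python sketch builds a weighted class graph and computes all-pairs shortest paths in \( O(n^{3}) \) time. It is not Algorithm 1 from the paper. The function names are hypothetical, the Floyd-Warshall algorithm is chosen here only as one well-known \( O(n^{3}) \) method, every edge is assumed to cost one hop, and closeness is taken as the number of reachable classes divided by the sum of the distances to them, which is one of several common definitions.

```python
import math


def build_graph(dependencies):
    """Aggregate (src, dst) class references into weighted directed edges."""
    weights = {}
    for src, dst in dependencies:
        if src != dst:  # ignore self-references
            weights[(src, dst)] = weights.get((src, dst), 0) + 1
    return weights


def degrees(nodes, weights):
    """In-degree and out-degree of every class (each edge counted once)."""
    in_deg = {v: 0 for v in nodes}
    out_deg = {v: 0 for v in nodes}
    for src, dst in weights:
        out_deg[src] += 1
        in_deg[dst] += 1
    return in_deg, out_deg


def all_pairs_shortest_paths(nodes, weights):
    """Floyd-Warshall over hop counts; runs in O(n^3) for n classes."""
    dist = {u: {v: (0 if u == v else math.inf) for v in nodes} for u in nodes}
    for src, dst in weights:
        dist[src][dst] = 1  # assumption: every edge costs one hop
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


def closeness(nodes, dist):
    """Outgoing closeness: reachable classes / total distance to them."""
    result = {}
    for u in nodes:
        reach = [d for v, d in dist[u].items() if v != u and d < math.inf]
        result[u] = len(reach) / sum(reach) if reach else 0.0
    return result
```

Because the distance table is computed once and reused by every path-based measure, this cost is paid a single time per project, which matches its role as a preprocessing step.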


Appendix B—evaluating different thresholds values

In Sect. 4.2, the authors used the mean value of each network measure for the identification of bad smells and considered a class to be a bad class if the difference between one of its network measures and the mean is more than 10% of the mean value. This 10% threshold is flexible and can be changed from one project to another according to the developer’s requirements. In this section, the results for the Jag project with 5% and 15% threshold values are presented. Figure 27 shows which bad smells were found in the two runs. The differences in the percentages of the bad smell types are not due to the threshold values. They exist because the genetic algorithm finds and refactors bad smells in a different order in each run. The threshold is related to the number of bad smells.

Fig. 27 Evaluation of the bad smell identification in Jag with 5% and 15% threshold values

A threshold of 5% identified 26 classes as bad classes, and a threshold of 15% identified 29. Generally, the method finds more bad smells when the threshold is increased and fewer when it is decreased. In other words, with a small threshold, the most important bad smells are found. Increasing the threshold gives other bad smells a higher chance of being detected, although the false positive rate increases as well.

Figure 28 shows the evaluation of the refactoring on Jag with 5% and 15% threshold values. Here, the thresholds of 15% and 5% refactored 72.5% and 70% of the classes correctly, respectively. Figures 29 and 30 show the precision of the identified bad smells, the bad smell types, and the accuracy of correct refactoring for Jag with the thresholds of 5% and 15%, respectively. As shown in the figures, the precision of the correctly identified bad smells in the first run, with a 5% threshold, is higher than in the run with 15%. With the 5% and 15% thresholds, precisions of about 80% and 72% are achieved, respectively. Lower threshold values seem to be more effective in finding the most important classes with bad smells.

Fig. 28 Evaluation of the refactoring approach on Jag with 5% and 15% threshold values

Fig. 29 Evaluation results of the bad smell identification and refactoring on Jag with a 5% threshold

Fig. 30 Evaluation results of the bad smell identification and refactoring on Jag with a 15% threshold

Cite this article

Saheb Nasagh, R., Shahidi, M. & Ashtiani, M. A fuzzy genetic automatic refactoring approach to improve software maintainability and flexibility. Soft Comput 25, 4295–4325 (2021). https://doi.org/10.1007/s00500-020-05443-0

  • DOI: https://doi.org/10.1007/s00500-020-05443-0
