DOI: 10.1145/3472674.3473978

Comparing within- and cross-project machine learning algorithms for code smell detection

Published: 23 August 2021


Code smells represent a well-known problem in software engineering, since they are a notorious cause of loss of comprehensibility and maintainability. The most recent efforts in devising automatic machine learning-based code smell detection techniques have so far achieved unsatisfactory results. This could be explained by the fact that all these approaches follow a within-project classification, i.e., training and test data are taken from the same source project, which, combined with the imbalanced nature of the problem, produces datasets with a very low number of instances belonging to the minority class (i.e., smelly instances). In this paper, we propose a cross-project machine learning approach and compare its performance with a within-project alternative. The core idea is to use transfer learning to increase the overall number of smelly instances in the training datasets. Our results show that cross-project classification performs very similarly to within-project classification. Although this finding does not yet represent a step forward in improving the performance of ML techniques for code smell detection, it lays the groundwork for further investigation.
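
To make the difference between the two setups concrete, the following is a minimal sketch and not the authors' pipeline. It assumes the simplest reading of the cross-project idea, namely pooling instances from all other projects into the training set. The project names, metric values, labels, and helper functions are all hypothetical and exist only for illustration.

```python
# Hypothetical toy records: (project, metric_vector, is_smelly).
# The values are invented for illustration, not taken from the paper.
SAMPLES = [
    ("A", [120, 14], True),
    ("A", [30, 3], False),
    ("A", [25, 2], False),
    ("A", [40, 4], False),
    ("B", [200, 20], True),
    ("B", [35, 5], False),
    ("B", [28, 3], False),
    ("C", [150, 17], True),
    ("C", [22, 2], False),
    ("C", [31, 4], False),
]


def within_project_split(samples, project, n_test=1):
    """Within-project: training and test data come from the same project.

    The last n_test instances of the project are held out for testing.
    """
    own = [s for s in samples if s[0] == project]
    return own[:-n_test], own[-n_test:]


def cross_project_split(samples, target):
    """Cross-project: train on every other project, test on the target."""
    train = [s for s in samples if s[0] != target]
    test = [s for s in samples if s[0] == target]
    return train, test


def count_smelly(rows):
    """Number of minority-class (smelly) instances in a set of rows."""
    return sum(1 for _, _, smelly in rows if smelly)


if __name__ == "__main__":
    train_w, _ = within_project_split(SAMPLES, "A")
    train_c, _ = cross_project_split(SAMPLES, "A")
    print("smelly training instances, within-project:", count_smelly(train_w))
    print("smelly training instances, cross-project:", count_smelly(train_c))
```

In this toy data, the cross-project training set for project A contains more smelly instances than the within-project one, which is the effect the abstract describes. Whether that translates into better detection is exactly what the paper evaluates.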





    Published In

    MaLTESQuE 2021: Proceedings of the 5th International Workshop on Machine Learning Techniques for Software Quality Evolution
    August 2021
    36 pages



    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. Code smells
    2. Empirical Software Engineering
    3. Transfer Learning


    • Research-article

    Funding Sources

    • Swiss National Science Foundation


    ESEC/FSE '21




