Abstract
Code smell detection aims to identify sub-optimal programming structures in code entities that may indicate problems requiring attention, and it plays a crucial role in improving software quality. Numerous automatic and semi-automatic detection methods have been proposed. However, these methods either depend on manually set detection rules and thresholds, which makes their determinations subjective, or require large-scale labeled datasets for model training. In addition, they perform poorly when applied across different projects. Related studies have revealed that different types of code smells co-occur. We therefore propose a smart code smell detection method based on code smell co-occurrences, termed BMCo-O. The key insight is that code smell co-occurrences can help improve code smell detection. We introduce a code smell co-occurrence impact factor set, a code smell pre-filter mechanism, and a possibility mechanism, which together enable BMCo-O to achieve strong detection performance. To reduce manual intervention, we propose an adaptive detection mechanism that automatically adjusts parameters to detect different types of code smells in various software projects. As an initial attempt, we applied the proposed method to seven classical high-criticality code smells: Message Chain, Feature Envy, Spaghetti Code, Large Class, Complex Class, Refused Bequest, and Long Method. Evaluation on benchmarks composed of open-source software projects showed that BMCo-O significantly outperforms well-known and widely used methods in detecting these seven code smells, especially in F1, with improvements of 137%, 155%, 23%, 195%, 364%, 552%, and 35%, respectively. To further verify its effectiveness in practical detection across different software projects, we also implemented a prototype code smell detector based on BMCo-O.
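To make the notion of a code smell co-occurrence concrete, the following minimal sketch counts how often one smell appears in an entity that also carries another. It is an illustration only and not the BMCo-O method: the function name `cooccurrence_rates`, the class names, and the labels are hypothetical, and the ratio computed here is a plain conditional frequency rather than the impact factor set described in the abstract.

```python
from collections import Counter
from itertools import permutations


def cooccurrence_rates(smells_by_entity):
    """Map (a, b) to the fraction of entities with smell a that also have smell b.

    This is a simple conditional frequency, used here only to illustrate
    what a co-occurrence is (an assumption, not the paper's definition).
    """
    single = Counter()  # number of entities affected by each smell
    pair = Counter()    # number of entities affected by both a and b
    for smells in smells_by_entity.values():
        unique = set(smells)
        single.update(unique)
        # Ordered pairs, so (a, b) and (b, a) are tracked separately.
        pair.update(permutations(sorted(unique), 2))
    return {(a, b): n / single[a] for (a, b), n in pair.items()}


# Made-up labels for four classes (hypothetical data, not from the paper).
example = {
    "OrderManager": {"Large Class", "Complex Class"},
    "ReportBuilder": {"Large Class", "Complex Class", "Long Method"},
    "CsvParser": {"Long Method"},
    "Session": {"Large Class"},
}

rates = cooccurrence_rates(example)
print(rates[("Large Class", "Complex Class")])
print(rates[("Complex Class", "Large Class")])
```

In this toy data, two of the three Large Class entities are also Complex Class, while every Complex Class entity is also a Large Class. The asymmetry shows why the presence of one smell can be stronger evidence for a second smell than the reverse.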








Data availability
https://github.com/SilverMustang/BMCo-O
Notes
CppDepend. 2021. https://www.cppdepend.com/
Designite. 2021. https://www.designite-tools.com/
NDepend. 2021. https://www.ndepend.com/
Sonarsource. 2021. https://www.sonarsource.com/
PMD. 2021. https://pmd.github.io/pmd-6.29.0/index.html
SDMetrics. 2021. https://www.sdmetrics.com/index.html
optimized Palambo dataset.zip, https://github.com/SilverMustang/BMCo-O
Acknowledgements
The authors would like to thank the anonymous reviewers for their insightful and constructive comments. This work was supported by the National Natural Science Foundation of China (Number 62176164), the Natural Science Foundation of Guangdong Province (Grant 2023A1515010992), and the Shenzhen Science and Technology Foundation (JCYJ20220531101217039 and JCYJ20210324093212034).
Author information
Contributions
Feiqiao Mao made substantial contributions to the conception and design of the work and critically revised the draft for important intellectual content. Kaihang Zhong carried out the acquisition, analysis, and interpretation of data, created the new software used in the work, and drafted the work. Feiqiao Mao and Kaihang Zhong wrote the main manuscript text, and Long Cheng prepared Tables 1-4 and 21-23, Figures 2-3, and the LaTeX source files of the manuscript. All authors reviewed the manuscript. Feiqiao Mao and Long Cheng revised the manuscript according to the review comments.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
About this article
Cite this article
Mao, F., Zhong, K. & Cheng, L. BMCo-O: a smart code smell detection method based on co-occurrences. Autom Softw Eng 32, 24 (2025). https://doi.org/10.1007/s10515-025-00486-9
DOI: https://doi.org/10.1007/s10515-025-00486-9