BMCo-O: a smart code smell detection method based on co-occurrences

Automated Software Engineering

Abstract

Code smell detection aims to identify sub-optimal programming structures within code entities that may indicate problems requiring attention. It plays a crucial role in improving software quality. Numerous automatic or semi-automatic methods for code smell detection have been proposed. However, these methods either depend on manually set detection rules and thresholds, which leads to subjective determinations, or require large-scale labeled datasets for model training. In addition, they exhibit poor detection performance across different projects. Related studies have revealed the existence of co-occurrences among different types of code smells. Therefore, we propose a smart code smell detection method based on code smell co-occurrences, termed BMCo-O. The key insight is that code smell co-occurrences can help improve code smell detection. We introduce and use a code smell co-occurrence impact factor set, a code smell pre-filter mechanism, and a possibility mechanism, which together enable BMCo-O to achieve outstanding detection performance. To reduce manual intervention, we propose an adaptive detection mechanism that automatically adjusts parameters to detect different types of code smells in various software projects. As an initial attempt, we applied the proposed method to seven classical high-criticality code smells: Message Chain, Feature Envy, Spaghetti Code, Large Class, Complex Class, Refused Bequest, and Long Method. The evaluation results on benchmarks composed of open source software projects demonstrated that BMCo-O significantly outperforms well-known and widely used methods in detecting these seven classical code smells, especially in F1 score, with improvements of 137%, 155%, 23%, 195%, 364%, 552%, and 35%, respectively. To further verify its effectiveness in actual detection across different software projects, we also implemented a prototype of a new code smell detector using BMCo-O.
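To make the notion of code smell co-occurrence concrete, the following minimal Python sketch estimates, from a set of labeled code entities, how often one smell type appears given that another is present. It is an illustration only and is not the BMCo-O impact factor, whose definition is not given in the abstract; the function name `cooccurrence_rates`, the conditional-frequency formula, and the toy labels are all assumptions made for this example.

```python
from collections import Counter
from itertools import permutations


def cooccurrence_rates(entity_smells):
    """Estimate P(smell b present | smell a present) for every ordered pair.

    `entity_smells` maps a code entity name to the set of smell types
    labeled on it. This is a hypothetical illustration, not the
    impact factor defined by the BMCo-O authors.
    """
    single = Counter()  # number of entities exhibiting each smell
    pair = Counter()    # number of entities exhibiting both smells of a pair
    for smells in entity_smells.values():
        unique = set(smells)
        single.update(unique)
        # Count both (a, b) and (b, a), since the rate is not symmetric.
        pair.update(permutations(sorted(unique), 2))
    return {(a, b): pair[(a, b)] / single[a] for (a, b) in pair}


# Hypothetical toy labels (not taken from the paper's benchmarks).
labels = {
    "OrderService": {"Large Class", "Complex Class"},
    "ReportBuilder": {"Large Class", "Complex Class", "Long Method"},
    "UserDao": {"Large Class"},
    "Parser": {"Long Method"},
}

rates = cooccurrence_rates(labels)
for (a, b), rate in sorted(rates.items()):
    print(f"P({b} | {a}) = {rate:.2f}")
```

In this toy data every entity labeled Complex Class is also labeled Large Class, so a detector could plausibly treat a Complex Class finding as evidence that raises the likelihood of Large Class on the same entity. That is the kind of signal a co-occurrence-based method can exploit.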


Data availability

https://github.com/SilverMustang/BMCo-O

Notes

  1. CppDepend. 2021. https://www.cppdepend.com/

  2. Designite. 2021. https://www.designite-tools.com/

  3. NDepend. 2021. https://www.ndepend.com/

  4. Sonarsource. 2021. https://www.sonarsource.com/

  5. PMD. 2021. https://pmd.github.io/pmd-6.29.0/index.html

  6. SDMetrics. 2021. https://www.sdmetrics.com/index.html

  7. https://github.com/SilverMustang/BMCo-O

  8. optimized Palambo dataset.zip, https://github.com/SilverMustang/BMCo-O


Acknowledgements

The authors would like to thank the anonymous reviewers for their insightful and constructive comments. This work was supported by the National Natural Science Foundation of China (Number 62176164), the Natural Science Foundation of Guangdong Province (Grant 2023A1515010992), and the Shenzhen Science and Technology Foundation (JCYJ20220531101217039 and JCYJ20210324093212034).

Author information

Contributions

Feiqiao Mao made substantial contributions to the conception and design of the work and critically revised the draft for important intellectual content. Kaihang Zhong carried out the acquisition, analysis, and interpretation of data, created the new software used in the work, and drafted the work. Feiqiao Mao and Kaihang Zhong wrote the main manuscript text, and Long Cheng prepared Tables 1-4 and 21-23, Figures 2-3, and the LaTeX source files of the manuscript. All authors reviewed the manuscript. Feiqiao Mao and Long Cheng revised the manuscript according to the review comments.

Corresponding author

Correspondence to Feiqiao Mao.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Mao, F., Zhong, K. & Cheng, L. Bmco-o: a smart code smell detection method based on co-occurrences. Autom Softw Eng 32, 24 (2025). https://doi.org/10.1007/s10515-025-00486-9

  • DOI: https://doi.org/10.1007/s10515-025-00486-9
