Efficient feature envy detection and refactoring based on graph neural network

Automated Software Engineering

Abstract

Feature envy is a frequently occurring code smell that reduces class cohesion, increases coupling between classes, and thus hampers software maintainability. Although feature envy detection has progressed, two challenges persist. First, existing approaches often underutilize method call relationships, which leads to suboptimal detection efficiency. Second, they place little emphasis on refactoring, even though refactoring is the ultimate goal of feature envy detection. To address these challenges, we propose two approaches: SCG (SMOTE Call Graph) and SFFL (Symmetric Feature Fusion Learning). SCG casts feature envy detection as a binary classification task on a method call graph. It predicts edge weights, termed calling strength, to capture the strength of method invocations. It then converts the method-method call graph into a method-class call graph and recommends moving each smelly method to the external class with the highest calling strength. SFFL is a holistic approach that targets refactoring directly: it uses four heterogeneous graphs to represent method-class relationships and, through Symmetric Feature Fusion Learning, obtains representations for methods and classes. Link prediction is then used to generate the refactored method-class ownership graph, which is taken as the refactoring result. Moreover, because existing metrics do not accurately evaluate refactoring performance, we introduce three new metrics: \(\textit{precision}_2\), \(\textit{recall}_2\) and \(\textit{F}_1\text{-score}_2\). Extensive experiments on five open-source projects demonstrate the superiority of SCG and SFFL. The code and dataset used in our study are available at https://github.com/HduDBSI/SCG-SFFL.
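The recommendation step that the abstract attributes to SCG (collapsing the method-method call graph into a method-class call graph, then picking the external class with the highest calling strength) can be illustrated with a minimal sketch. This is not the authors' implementation: the calling strengths below are hand-written numbers rather than model predictions, summation is assumed as the way to aggregate method-level strengths per class, and the names `recommend_moves`, `edge_strength`, and `owner`, along with the example methods, are hypothetical.

```python
from collections import defaultdict


def recommend_moves(smelly_methods, edge_strength, owner):
    """Suggest a target class for each method flagged as smelly.

    edge_strength: {(caller, callee): calling strength}, method-method graph.
    owner: {method: class that currently owns it}.
    Aggregation by summation is an assumption made for this sketch.
    """
    recommendations = {}
    for method in smelly_methods:
        # Collapse method->method edges into method->class strengths,
        # keeping only classes other than the method's current owner.
        per_class = defaultdict(float)
        for (caller, callee), strength in edge_strength.items():
            if caller == method and owner[callee] != owner[method]:
                per_class[owner[callee]] += strength
        # A method with no external calls gets no recommendation.
        if per_class:
            recommendations[method] = max(per_class, key=per_class.get)
    return recommendations


# Hypothetical example data: Order.report mostly calls into Customer.
owner = {
    "Order.total": "Order",
    "Order.report": "Order",
    "Customer.name": "Customer",
    "Customer.address": "Customer",
    "Printer.print": "Printer",
}
edge_strength = {
    ("Order.report", "Customer.name"): 0.9,
    ("Order.report", "Customer.address"): 0.8,
    ("Order.report", "Printer.print"): 0.6,
    ("Order.report", "Order.total"): 0.7,
}

moves = recommend_moves(["Order.report"], edge_strength, owner)
print(moves)  # {'Order.report': 'Customer'}
```

In this toy graph the aggregated strength toward `Customer` (0.9 + 0.8) exceeds that toward `Printer` (0.6), so `Customer` is suggested as the target class. Calls into the method's own class are ignored because only external classes are candidates for the move.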

Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grants 62372145 and 61902096, and by the Key Research and Development Program of Zhejiang Province under Grants 2023C03200 and 2023C03179. The authors would also like to thank the anonymous reviewers for their insightful comments.

Author information

Corresponding author

Correspondence to Dongjin Yu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (pdf 291 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yu, D., Xu, Y., Weng, L. et al. Efficient feature envy detection and refactoring based on graph neural network. Autom Softw Eng 32, 7 (2025). https://doi.org/10.1007/s10515-024-00476-3

  • DOI: https://doi.org/10.1007/s10515-024-00476-3
