
Empirical Analysis on Effectiveness of NLP Methods for Predicting Code Smell

  • Conference paper
Computational Science and Its Applications – ICCSA 2021 (ICCSA 2021)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12957)


Abstract

A code smell is a surface indicator of an inherent problem in the system, most often caused by a developer's deviation from standard coding practices during the development phase. Studies observe that code containing smells is more likely to require modifications and corrections than code without them. Restructuring the code at an early stage of development saves the exponentially increasing effort that would otherwise be needed to address issues stemming from these code smells. Instead of using traditional source-code features to detect code smells, we manually construct features from user comments (given on the packages' repositories) and use them to predict code smells. We apply three extreme learning machine (ELM) kernels over 629 packages to identify eight code smells, leveraging feature engineering and sampling techniques to handle class imbalance. Our findings indicate that the radial basis function kernel performs best of the three kernel methods, with a mean accuracy of 98.52%.
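The pipeline outlined above has three stages: build numeric features from repository user comments, rebalance the training data, and classify with a kernel extreme learning machine. The snippet below is a minimal sketch of the final classification stage only, using the standard closed-form kernel-ELM solution with an RBF kernel; the synthetic feature matrix, the regularization constant C, and the kernel width gamma are illustrative placeholders, not the authors' actual setup.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.1):
    """Gaussian (radial basis function) kernel matrix between rows of X and Y."""
    d2 = (np.sum(X**2, axis=1)[:, None]
          + np.sum(Y**2, axis=1)[None, :]
          - 2.0 * X @ Y.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

class KernelELM:
    """Kernel extreme learning machine: regularized least squares in kernel
    space, beta = (I/C + K)^(-1) T, with one-hot target matrix T."""

    def __init__(self, kernel=rbf_kernel, C=1.0):
        self.kernel = kernel
        self.C = C

    def fit(self, X, y):
        self.X_train = X
        T = np.eye(int(y.max()) + 1)[y]        # one-hot encode labels 0..k-1
        K = self.kernel(X, X)                  # n x n kernel matrix
        n = K.shape[0]
        self.beta = np.linalg.solve(np.eye(n) / self.C + K, T)
        return self

    def predict(self, X):
        scores = self.kernel(X, self.X_train) @ self.beta
        return np.argmax(scores, axis=1)

# Illustrative run on synthetic data. In the paper's setting, X would hold
# comment-derived feature vectors (e.g. aggregated word embeddings) for the
# 629 packages, rebalanced with a SMOTE-style oversampler before fitting,
# and y would flag the presence of one of the eight code smells.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
y = (X[:, 0] + 0.5 * X[:, 1] > 0.8).astype(int)   # imbalanced binary label
model = KernelELM(C=10.0).fit(X[:150], y[:150])
accuracy = float((model.predict(X[150:]) == y[150:]).mean())
print(f"holdout accuracy: {accuracy:.2f}")
```

Swapping `rbf_kernel` for a linear or polynomial kernel reproduces the three-kernel comparison described in the abstract; only the kernel function changes, the closed-form solve stays the same.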

H. Gupta and A. A. Gulanikar—The research associated with this paper was completed during the authors' undergraduate studies at BITS Pilani, Hyderabad Campus.




Author information

Correspondence to Himanshu Gupta, Abhiram Anand Gulanikar, Lov Kumar or Lalita Bhanu Murthy Neti.



Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Gupta, H., Gulanikar, A.A., Kumar, L., Neti, L.B.M. (2021). Empirical Analysis on Effectiveness of NLP Methods for Predicting Code Smell. In: Gervasi, O., et al. Computational Science and Its Applications – ICCSA 2021. ICCSA 2021. Lecture Notes in Computer Science, vol. 12957. Springer, Cham. https://doi.org/10.1007/978-3-030-87013-3_4


  • DOI: https://doi.org/10.1007/978-3-030-87013-3_4


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-87012-6

  • Online ISBN: 978-3-030-87013-3

  • eBook Packages: Computer Science, Computer Science (R0)
