Investigating BERT Layer Performance and SMOTE Through MLP-Driven Ablation on Gittercom

  • Conference paper
Advanced Information Networking and Applications (AINA 2024)

Abstract

For software development teams, communication is necessary to maintain development awareness, streamline project management, and avoid misunderstandings. Chat rooms meet the interaction needs of software development groups with instant personal messaging, group conversations, and the exchange of code knowledge, all in real time. As a result, developers are increasingly using chat rooms. Gitter is one such prominent forum, and its chats may be a goldmine of information for researchers studying open-source software systems. This study uses the GitterCom dataset, the largest repository of meticulously labeled and organized Gitter developer messages, to perform multi-label classification of the dataset’s Purpose Category. Nine MLP classifiers, six feature selection methods, and the layered architecture of the BERT transformer are subjected to a thorough empirical evaluation. The resulting pipeline reaches a maximum AUC score of 0.97 with the MLP variants that use the Adam optimizer (MLP2 and MLP3). The same research process could also be applied to text data from other software development forums for general multi-label text classification. The insights of this research, which give a holistic view of the BERT-driven machine learning pipeline, should help the research community choose feature selection techniques, BERT layers, and classification models, among other components, for text classification.
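The abstract reports results as an AUC score for a multi-label task. As a minimal sketch of how such a score can be computed, the block below evaluates the area under the ROC curve per label by pairwise comparison and then takes the unweighted mean across labels. The function names `binary_auc` and `macro_auc`, the toy data, and the choice of macro averaging are assumptions made for illustration; the abstract does not state which averaging scheme the authors used.

```python
from typing import List, Sequence


def binary_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC for one label: the fraction of (positive, negative) pairs in
    which the positive example gets the higher score (ties count half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC is undefined unless both classes are present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_auc(labels: List[List[int]], scores: List[List[float]]) -> float:
    """Unweighted mean of per-label AUCs (assumed averaging scheme).
    Rows are messages, columns are labels."""
    n_labels = len(labels[0])
    per_label = [
        binary_auc([row[j] for row in labels], [row[j] for row in scores])
        for j in range(n_labels)
    ]
    return sum(per_label) / n_labels


# Toy data (hypothetical): four messages, two labels.
toy_labels = [[1, 0], [0, 1], [1, 1], [0, 0]]
toy_scores = [[0.9, 0.2], [0.3, 0.7], [0.6, 0.4], [0.1, 0.5]]
print(macro_auc(toy_labels, toy_scores))  # prints 0.875
```

In the toy data the first label is ranked perfectly (AUC 1.0) and the second has one misordered pair out of four (AUC 0.75), so the macro average is 0.875. The pairwise form is quadratic in the number of examples; it is used here only because it makes the definition explicit.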

Notes

  1. https://gitter.im/

Author information

Corresponding author

Correspondence to Bathini Sai Akash.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Akash, B.S., Singh, V., Krishna, A., Murthy, L.B., Kumar, L. (2024). Investigating BERT Layer Performance and SMOTE Through MLP-Driven Ablation on Gittercom. In: Barolli, L. (eds) Advanced Information Networking and Applications. AINA 2024. Lecture Notes on Data Engineering and Communications Technologies, vol 200. Springer, Cham. https://doi.org/10.1007/978-3-031-57853-3_25
