Abstract
For software development teams, communication is necessary to maintain awareness of ongoing development, streamline project management, and avoid misunderstandings. Chat rooms support the interaction needs of software development groups with features such as instant personal messaging, group conversations, and the exchange of code knowledge, all in real time. As a result, developers are increasingly using chat rooms. One prominent platform is Gitter, and the chats it hosts can be a goldmine of information for researchers studying open-source software systems. This study uses the GitterCom dataset, the largest collection of manually labeled and organized Gitter developer messages, to perform multi-label classification of the dataset’s Purpose Category. Nine MLP classifiers, six feature selection methods, and the layered architecture of the BERT transformer are evaluated in a thorough empirical study. Our pipeline achieves competitive results, with a maximum AUC score of 0.97 for the MLP variants that use the Adam optimizer (MLP2 and MLP3). The pipeline could also be applied to text data from other software development forums for general multi-label text classification. The insights of this research, which give a holistic understanding of a machine learning pipeline driven by BERT, should help the research community choose feature selection techniques, BERT layers, and classification models, among other design decisions, for text classification.
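The abstract reports results as an AUC score without saying how the score is aggregated across labels. As a minimal sketch, assuming a per-label ROC AUC that is then macro-averaged (an assumption, not the authors' stated procedure), the metric can be computed as below. The function names `roc_auc` and `macro_auc` and the toy data are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """ROC AUC as the probability that a random positive outranks a random negative."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC is undefined unless both classes are present")
    # Count pairwise wins of positives over negatives; a tie counts as half a win.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def macro_auc(
    Y_true: Sequence[Sequence[int]], Y_score: Sequence[Sequence[float]]
) -> float:
    """Unweighted mean of the per-label AUC (assumed aggregation, see lead-in)."""
    n_labels = len(Y_true[0])
    per_label = [
        roc_auc([row[j] for row in Y_true], [row[j] for row in Y_score])
        for j in range(n_labels)
    ]
    return sum(per_label) / n_labels


# Hypothetical toy data: 4 messages, 2 labels (rows are messages, columns are labels).
Y_true = [[1, 0], [0, 1], [1, 1], [0, 0]]
Y_score = [[0.9, 0.2], [0.3, 0.8], [0.6, 0.4], [0.1, 0.5]]

print(macro_auc(Y_true, Y_score))
```

The pairwise formulation is quadratic in the number of messages, so it suits small illustrations; a rank-based implementation from an established library would be the usual choice on a dataset of GitterCom's size.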
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Akash, B.S., Singh, V., Krishna, A., Murthy, L.B., Kumar, L. (2024). Investigating BERT Layer Performance and SMOTE Through MLP-Driven Ablation on Gittercom. In: Barolli, L. (eds) Advanced Information Networking and Applications. AINA 2024. Lecture Notes on Data Engineering and Communications Technologies, vol 200. Springer, Cham. https://doi.org/10.1007/978-3-031-57853-3_25
DOI: https://doi.org/10.1007/978-3-031-57853-3_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-57852-6
Online ISBN: 978-3-031-57853-3
eBook Packages: Intelligent Technologies and Robotics (R0)