Abstract
Modern Code Review (MCR) is a widely used quality-assurance mechanism in continuous integration and deployment. Unfortunately, in medium-sized and large projects, the number of changes that need to be integrated, and consequently the number of comments triggered during MCRs, can be overwhelming. There is therefore a need to quickly recognize which comments concern issues that need prompt attention, to guide the focus of code authors, reviewers, and quality managers. The goal of this study is to design a method for the automated classification of review comments that identifies the needed change faster and with higher accuracy. We conducted a Design Science Research study on three open-source systems. We designed a method (CommentBERT) for the automated classification of code-review comments based on the BERT (Bidirectional Encoder Representations from Transformers) language model and a new taxonomy of comments. When applied to 2,672 comments from the Wireshark, Mono Framework, and Open Network Automation Platform (ONAP) projects, the method achieved a classification performance, measured using the Matthews Correlation Coefficient (MCC), of 0.46–0.82 (Wireshark), 0.12–0.80 (ONAP), and 0.48–0.85 (Mono). Based on these results, we conclude that the proposed method is promising and could be used to build machine-learning-based tools to support MCRs, provided that a sufficient number of historical code-review comments is available to train the model.
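The abstract reports classification performance as the Matthews Correlation Coefficient (MCC). For a binary confusion matrix, MCC can be computed directly from the four cell counts; the following is a minimal, self-contained sketch (the example counts are illustrative and not taken from the study):

```python
import math

def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews Correlation Coefficient for a binary confusion matrix.

    Ranges from -1 (total disagreement) through 0 (chance level)
    to +1 (perfect prediction).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # degenerate case: one class never predicted or observed
    return (tp * tn - fp * fn) / denom

# Illustrative counts for one hypothetical comment category.
score = mcc(tp=6, tn=3, fp=1, fn=2)
```

Unlike plain accuracy, MCC remains informative under imbalanced category distributions, which is one reason it is a common choice for evaluating comment-classification models.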
Notes
- 2. The input length of 128 tokens corresponds to the 98th percentile of the comment-length distribution in the dataset of code reviews under study.
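A token cap such as the 128 tokens mentioned in note 2 can be chosen empirically by computing a high percentile of the token-length distribution. The sketch below uses the nearest-rank percentile method; the sample lengths are hypothetical stand-ins for the per-comment token counts a BERT tokenizer would produce on the real dataset:

```python
import math

def percentile(values, pct):
    """Return the pct-th percentile of values (nearest-rank method)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]

# Hypothetical per-comment token counts (illustrative only).
lengths = [12, 30, 45, 8, 120, 60, 25, 15, 200, 33]

# Cap covering 98% of comments; longer ones would be truncated.
cap = percentile(lengths, 98)
```

Picking a high percentile rather than the maximum keeps the model input short (and training cheap) while truncating only a small fraction of unusually long comments.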
Copyright information
© 2022 Springer Nature Switzerland AG
Cite this paper
Ochodek, M., Staron, M., Meding, W., Söder, O. (2022). Automated Code Review Comment Classification to Improve Modern Code Reviews. In: Mendez, D., Wimmer, M., Winkler, D., Biffl, S., Bergsmann, J. (eds) Software Quality: The Next Big Thing in Software Engineering and Quality. SWQD 2022. Lecture Notes in Business Information Processing, vol 439. Springer, Cham. https://doi.org/10.1007/978-3-031-04115-0_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-04114-3
Online ISBN: 978-3-031-04115-0