Automated Code Review Comment Classification to Improve Modern Code Reviews

Ochodek, Miroslaw; Staron, Miroslaw; Meding, Wilhelm; Söder, Ola

doi:10.1007/978-3-031-04115-0_3

Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 439)

Included in the following conference series: International Conference on Software Quality

Abstract

Modern Code Reviews (MCRs) are a widely used quality assurance mechanism in continuous integration and deployment. Unfortunately, in medium and large projects, the number of changes that need to be integrated, and consequently the number of comments triggered during MCRs, can be overwhelming. Therefore, there is a need to quickly recognize which comments concern issues that require prompt attention, in order to guide the focus of code authors, reviewers, and quality managers. The goal of this study is to design a method for automated classification of review comments so that needed changes can be identified faster and with higher accuracy. We conduct a Design Science Research study on three open-source systems. We designed a method (CommentBERT) for automated classification of code-review comments based on the BERT (Bidirectional Encoder Representations from Transformers) language model and a new taxonomy of comments. When applied to 2,672 comments from the Wireshark, The Mono Framework, and Open Network Automation Platform (ONAP) projects, the method achieved accuracy, measured using the Matthews Correlation Coefficient, of 0.46–0.82 (Wireshark), 0.12–0.8 (ONAP), and 0.48–0.85 (Mono). Based on the results, we conclude that the proposed method seems promising and could potentially be used to build machine-learning-based tools to support MCRs, as long as there is a sufficient number of historical code-review comments to train the model.
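The abstract reports results as Matthews Correlation Coefficient (MCC) values. As a minimal sketch of how such a value is obtained, the snippet below computes MCC for a single comment category treated as a binary decision (1 = comment belongs to the category, 0 = it does not). The binary per-category framing, the function name `matthews_corrcoef_binary`, and the sample labels are assumptions made for illustration; they are not taken from the paper.

```python
import math


def matthews_corrcoef_binary(y_true, y_pred):
    """Compute MCC for binary labels (0/1) from the confusion-matrix counts."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:
            fn += 1

    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: return 0.0 when a row or column of the confusion
    # matrix is empty and the coefficient is otherwise undefined.
    return numerator / denominator if denominator else 0.0


# Hypothetical labels for one category: 3 positive and 5 negative comments.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
mcc = matthews_corrcoef_binary(y_true, y_pred)
print(f"MCC = {mcc:.3f}")
```

MCC ranges from -1 to 1, with 0 corresponding to chance-level agreement. Unlike plain accuracy, it uses all four confusion-matrix cells, which makes it less misleading when a category is rare.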

Notes

1. https://github.com/mochodek/bertcomments.
2. The input length of 128 tokens corresponds to the 98th percentile of the comment-length distribution in the dataset of code reviews under study.
3. https://www.wireshark.org/docs/wsdg_html/.
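Note 2 ties the model's input length to a percentile of the comment-length distribution. A minimal sketch of that kind of selection is shown below, under stated assumptions: whitespace splitting stands in for the model's subword tokenizer (which would usually produce more tokens per comment), the nearest-rank percentile definition is one of several in use, and the sample comments and helper names are hypothetical.

```python
import math


def nearest_rank_percentile(values, pct):
    """Return the smallest value such that at least pct% of values are <= it."""
    if not values:
        raise ValueError("values must be non-empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based nearest rank
    return ordered[rank - 1]


def token_lengths(comments):
    # Assumption: whitespace tokens approximate comment length; a subword
    # tokenizer would be used in practice and would give larger counts.
    return [len(comment.split()) for comment in comments]


# Hypothetical review comments, for illustration only.
comments = [
    "Please rename this variable.",
    "Looks good to me.",
    "This loop can overflow when len is zero; add a guard before the memcpy call.",
    "nit: trailing whitespace",
]
lengths = token_lengths(comments)
max_len = nearest_rank_percentile(lengths, 98)
print(f"lengths={lengths}, 98th percentile={max_len}")
```

Choosing a high percentile rather than the maximum keeps the input short for nearly all comments, at the cost of truncating only the few longest ones.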

Author information

Authors and Affiliations

Computer Science, Poznan University of Technology, Poznań, Poland
Miroslaw Ochodek
Computer Science and Engineering, Chalmers University of Gothenburg, Gothenburg, Sweden
Miroslaw Staron
Ericsson AB, Stockholm, Sweden
Wilhelm Meding
Axis Communications, Lund, Sweden
Ola Söder

Corresponding author

Correspondence to Miroslaw Ochodek.

Editor information

Editors and Affiliations

Blekinge Institute of Technology, Karlskrona, Sweden
Daniel Mendez
Johannes Kepler University of Linz, Linz, Austria
Manuel Wimmer
TU Wien, Vienna, Austria
Dietmar Winkler
TU Wien, Vienna, Austria
Stefan Biffl
Software Quality Lab GmbH, Linz, Austria
Johannes Bergsmann

About this paper

Cite this paper

Ochodek, M., Staron, M., Meding, W., Söder, O. (2022). Automated Code Review Comment Classification to Improve Modern Code Reviews. In: Mendez, D., Wimmer, M., Winkler, D., Biffl, S., Bergsmann, J. (eds) Software Quality: The Next Big Thing in Software Engineering and Quality. SWQD 2022. Lecture Notes in Business Information Processing, vol 439. Springer, Cham. https://doi.org/10.1007/978-3-031-04115-0_3

DOI: https://doi.org/10.1007/978-3-031-04115-0_3
Published: 12 April 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-04114-3
Online ISBN: 978-3-031-04115-0
eBook Packages: Computer Science, Computer Science (R0)
