DOI: 10.1145/3324884.3416581

Automating just-in-time comment updating

Published: 27 January 2021

ABSTRACT

Code comments are valuable for program comprehension and software maintenance, and they also require maintenance as code evolves. However, when changing code, developers sometimes neglect to update the related comments, introducing inconsistent or obsolete comments (a.k.a. bad comments). Such comments are detrimental because they may mislead developers and lead to future bugs. Therefore, it is necessary to fix and avoid bad comments. In this work, we argue that bad comments can be reduced and even avoided by automatically updating comments along with code changes. We refer to this task as "Just-In-Time (JIT) Comment Updating" and propose an approach named CUP (Comment UPdater) to automate it. CUP can be used to assist developers in updating comments during code changes and can consequently help avoid the introduction of bad comments. Specifically, CUP leverages a novel neural sequence-to-sequence model to learn comment update patterns from extant code-comment co-changes, and it can automatically generate a new comment based on the corresponding old comment and code change. We introduce several customized enhancements into CUP, such as a special tokenizer and a novel co-attention mechanism, to handle the characteristics of this task. We build a dataset with over 108K comment-code co-change samples and evaluate CUP on it. The evaluation results show that CUP outperforms an information-retrieval-based baseline and a rule-based baseline by substantial margins, and can reduce the edits developers need for JIT comment updating. In addition, the comments generated by our approach are identical to those updated by developers in 1612 (16.7%) of the test samples, 7 times more than the best-performing baseline.
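
The abstract names two customized components, a special tokenizer and a co-attention mechanism, and reports that CUP reduces the edits developers need. As an illustrative aid only (the helpers below are our own assumptions, not CUP's implementation), the following Python sketch shows one plausible sub-token tokenizer for comments that mix code identifiers with natural language, and a word-level edit distance of the kind that could measure how many edits remain between a generated comment and the developer-updated one.

# Illustrative sketch only: these helpers are assumptions for exposition,
# not the tokenizer or the metrics used by CUP itself.
import re


def split_identifier(token):
    """Split a camelCase or snake_case identifier into lowercase sub-tokens.

    Example: 'getFileName' -> ['get', 'file', 'name'].
    """
    sub_tokens = []
    for part in re.split(r"[_\W]+", token):  # split on underscores and punctuation
        # Break camelCase/PascalCase boundaries; keep digit runs and acronyms intact.
        sub_tokens += re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part)
    return [s.lower() for s in sub_tokens if s]


def tokenize(text):
    """Whitespace-tokenize a comment, then split each token into sub-tokens."""
    return [sub for tok in text.split() for sub in split_identifier(tok)]


def word_edit_distance(generated, reference):
    """Levenshtein distance over token sequences (1 per insert/delete/substitute)."""
    m, n = len(generated), len(reference)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if generated[i - 1] == reference[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # delete
                           dp[i][j - 1] + 1,         # insert
                           dp[i - 1][j - 1] + cost)  # substitute
    return dp[m][n]


if __name__ == "__main__":
    generated = tokenize("Returns the fileName of the input.")
    reference = tokenize("Returns the filePath of the input.")
    # A distance of 0 would correspond to an exact match with the developer's update.
    print(word_edit_distance(generated, reference))  # -> 1

Under a measure of this kind, the exact-match cases reported in the abstract (1612 test samples, 16.7%) correspond to a distance of zero between the generated comment and the developer-updated one.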

Published in

ASE '20: Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering, December 2020, 1449 pages.
ISBN: 9781450367684
DOI: 10.1145/3324884
Copyright © 2020 ACM
Publisher: Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall acceptance rate: 82 of 337 submissions, 24%
