ABSTRACT
Code comments are valuable for program comprehension and software maintenance, and they must themselves be maintained as code evolves. However, when changing code, developers sometimes neglect to update the related comments, introducing inconsistent or obsolete comments (a.k.a. bad comments). Such comments are detrimental because they may mislead developers and lead to future bugs. Therefore, it is necessary to fix and avoid bad comments. In this work, we argue that bad comments can be reduced and even avoided by automatically performing comment updates with code changes. We refer to this task as "Just-In-Time (JIT) Comment Updating" and propose an approach named CUP (Comment UPdater) to automate this task. CUP can be used to assist developers in updating comments during code changes and can consequently help avoid the introduction of bad comments. Specifically, CUP leverages a novel neural sequence-to-sequence model to learn comment update patterns from existing code-comment co-changes and can automatically generate a new comment based on its corresponding old comment and code change. We introduce several customized enhancements in CUP, such as a special tokenizer and a novel co-attention mechanism, to handle the characteristics of this task. We build a dataset with over 108K comment-code co-change samples and evaluate CUP on it. The evaluation results show that CUP outperforms an information-retrieval-based baseline and a rule-based baseline by substantial margins, and can reduce the edits developers need to make for JIT comment updating. In addition, the comments generated by our approach are identical to those updated by developers in 1612 (16.7%) test samples, 7 times more than the best-performing baseline.
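To make the notion of "developers' edits" concrete, the following minimal sketch counts the word-level edits needed to turn one comment into another. It assumes a word-level Levenshtein distance, which the abstract does not name as the metric; the function name `word_edit_distance` and all three example comments are hypothetical illustrations, not data or code from CUP.

```python
def word_edit_distance(source: str, target: str) -> int:
    """Word-level Levenshtein distance between two comment strings.

    Assumption: whitespace tokenization stands in for the paper's
    special tokenizer, which is not described in the abstract.
    """
    a, b = source.split(), target.split()
    # prev[j] holds the distance between the words processed so far in a
    # and the first j words of b.
    prev = list(range(len(b) + 1))
    for i, word_a in enumerate(a, start=1):
        curr = [i]
        for j, word_b in enumerate(b, start=1):
            substitution_cost = 0 if word_a == word_b else 1
            curr.append(min(
                prev[j] + 1,                      # delete word_a
                curr[j - 1] + 1,                  # insert word_b
                prev[j - 1] + substitution_cost,  # keep or substitute
            ))
        prev = curr
    return prev[-1]


# Hypothetical example comments (not taken from the dataset).
old_comment = "Returns the number of active users"
generated = "Returns the number of active sessions"   # hypothetical model output
reference = "Returns the count of active sessions"    # hypothetical developer update

# Edits a developer would make starting from the old comment
# versus starting from the generated comment.
edits_without_tool = word_edit_distance(old_comment, reference)
edits_with_tool = word_edit_distance(generated, reference)
print(edits_without_tool, edits_with_tool)
```

Under this assumed metric, an updater is helpful on a sample when the generated comment is closer to the developer-written comment than the old comment is.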