extended-abstract

Understanding source code comments at large-scale

Author: Hao He

ESEC/FSE 2019: Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering

Pages 1217 - 1219

https://doi.org/10.1145/3338906.3342494

Published: 12 August 2019

Abstract

Source code comments are important for any software, but the basic patterns of writing comments across domains and programming languages remain unclear. In this paper, we take a first step toward understanding differences in commenting practices by analyzing the comment density of 150 projects in 5 different programming languages. We have found that there are noticeable differences in comment density, which may be related to the programming language used in the project and the purpose of the project.
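The abstract does not say how comment density is computed. As a rough illustration only, the sketch below assumes a common definition: comment lines divided by all non-blank lines. The function name `comment_density`, the `line_prefix` parameter, and the sample input are hypothetical and not taken from the paper. The sketch counts only full-line comments that start with a single prefix, so block comments and trailing comments are not handled.

```python
def comment_density(source: str, line_prefix: str = "#") -> float:
    """Return the fraction of non-blank lines that are full-line comments.

    Assumption: density = comment lines / (comment lines + code lines).
    The paper's own definition may differ.
    """
    comment_lines = 0
    code_lines = 0
    for raw in source.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines count toward neither total
        if line.startswith(line_prefix):
            comment_lines += 1
        else:
            code_lines += 1
    total = comment_lines + code_lines
    # An empty file has no meaningful density; report 0.0 by convention.
    return comment_lines / total if total else 0.0


# Hypothetical sample: 2 comment lines and 3 code lines.
sample = """\
# add two numbers
def add(a, b):
    # trivial
    return a + b

print(add(1, 2))
"""
density = comment_density(sample)
```

Passing a different `line_prefix` (for example `"//"`) gives a crude count for other languages. A real cross-language study would likely need a proper lexer for each language rather than prefix matching.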

Index Terms

1. Software and its engineering
  1. Software creation and management

Published In

ESEC/FSE 2019: Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering

August 2019

1264 pages

ISBN: 9781450355728

DOI: 10.1145/3338906

General Chairs: Marlon Dumas (University of Tartu, Estonia), Dietmar Pfahl (University of Tartu, Estonia)

Program Chairs: Sven Apel (Saarland University, Germany), Alessandra Russo (Imperial College, UK)

Copyright © 2019 Owner/Author.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Sponsors

SIGSOFT: ACM Special Interest Group on Software Engineering

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Extended-abstract

Conference

ESEC/FSE '19

ESEC/FSE '19: 27th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering

August 26 - 30, 2019

Tallinn, Estonia

Acceptance Rates

Overall Acceptance Rate 112 of 543 submissions, 21%

Cited By

Nyenah E, Döll P, Katz D, Reinecke R (2024). Software sustainability of global impact models. Geoscientific Model Development 17:23, 8593-8611. Online publication date: 5-Dec-2024. https://doi.org/10.5194/gmd-17-8593-2024

Wang Y, Zhao Q, Xu D, Liu X, Larson K (2024). Purpose enhanced reasoning through iterative prompting. Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 6513-6521. Online publication date: 3-Aug-2024. https://dl.acm.org/doi/10.24963/ijcai.2024/720

Endres M, Fakhoury S, Chakraborty S, Lahiri S (2024). Can Large Language Models Transform Natural Language Intent into Formal Method Postconditions? Proceedings of the ACM on Software Engineering 1:FSE, 1889-1912. Online publication date: 12-Jul-2024. https://dl.acm.org/doi/10.1145/3660791

Ding X, Peng R, Chen X, Huang Y, Bian J, Zheng Z (2024). Do Code Summarization Models Process Too Much Information? Function Signature May Be All That Is Needed. ACM Transactions on Software Engineering and Methodology 33:6, 1-35. Online publication date: 27-Jun-2024. https://dl.acm.org/doi/10.1145/3652156

Boll A, Rani P, Schultheiß A, Kehrer T (2024). Beyond code: Is there a difference between comments in visual and textual languages? Journal of Systems and Software 215, 112087. Online publication date: Sep-2024. https://doi.org/10.1016/j.jss.2024.112087

Shen Y, Ju X, Chen X, Yang G (2024). Bash comment generation via data augmentation and semantic-aware CodeBERT. Automated Software Engineering 31:1. Online publication date: 26-Mar-2024. https://doi.org/10.1007/s10515-024-00431-2

Sridharan M, Rantala L, Mäntylä M (2023). PENTACET data - 23 Million Contextual Code Comments and 250,000 SATD comments. 2023 IEEE/ACM 20th International Conference on Mining Software Repositories (MSR), 412-416. Online publication date: May-2023. https://doi.org/10.1109/MSR59073.2023.00063

Park Y, Park A, Kim C (2023). ALSI-Transformer: Transformer-Based Code Comment Generation With Aligned Lexical and Syntactic Information. IEEE Access 11, 39037-39047. Online publication date: 2023. https://doi.org/10.1109/ACCESS.2023.3268638

Yang C, Wu J (2023). An Approach of Code Summary Generation Using Multi-Feature Fusion Based on Transformer. Web Information Systems and Applications, 271-283. Online publication date: 9-Sep-2023. https://doi.org/10.1007/978-981-99-6222-8_23

Wang C, He H, Pal U, Marinov D, Zhou M (2022). Suboptimal Comments in Java Projects: From Independent Comment Changes to Commenting Practices. ACM Transactions on Software Engineering and Methodology 32:2, 1-33. Online publication date: 8-Jul-2022. https://dl.acm.org/doi/10.1145/3546949
