Automatically detecting the scopes of source code comments

https://doi.org/10.1016/j.jss.2019.03.010
Under a Creative Commons license
open access

Highlights

  • This work proposes the first method to detect the scopes of block/line comments.

  • The detection of comment scopes can be modeled as a binary classification problem.

  • We propose discriminative features in code and comments to characterize the scopes of comments.

  • We apply our method to two existing approaches in software engineering tasks and improve their performance.

Abstract

Comments convey useful information about system functionality, and many approaches to software engineering tasks, such as code semantic analysis and code reuse, take comments as an important source of information. However, unlike for structural doc comments, it is challenging to identify the relationship between the functional semantics of code and the corresponding textual descriptions nested inside it, and to apply that relationship efficiently in automatic analysis and mining approaches for software engineering tasks.

In this paper, we propose a general method for the detection of source code comment scopes. Based on machine learning, our method uses features of code snippets and comments to detect the scopes of source code comments automatically in Java programs. On a dataset of comment-statement pairs from four popular open source projects, our method achieved an accuracy of 81.45% in detecting the scopes of comments. Furthermore, the results demonstrated the feasibility and effectiveness of our comment scope detection method on new projects.
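The binary classification framing can be illustrated with a minimal sketch. It is not the authors' implementation: the two features (`distance`, `shared_tokens`), the rule that a scope is the contiguous run of statements following the comment, and the hand-written `toy_classifier` standing in for a trained model are all assumptions made for illustration only.

```python
import re
from typing import Callable, Dict, List, Set


def tokens(text: str) -> Set[str]:
    """Lowercased word tokens, splitting camelCase identifiers."""
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return {t.lower() for t in re.findall(r"[A-Za-z]+", spaced)}


def pair_features(comment: str, statements: List[str], index: int) -> Dict[str, int]:
    """Hypothetical features for one comment-statement pair."""
    shared = tokens(comment) & tokens(statements[index])
    return {
        "distance": index,             # statements between comment and candidate
        "shared_tokens": len(shared),  # lexical overlap of comment and statement
    }


def detect_scope(
    comment: str,
    statements: List[str],
    in_scope: Callable[[Dict[str, int]], bool],
) -> List[str]:
    """Classify each following statement as in or out of scope.

    Assumption: the scope is contiguous, so it ends at the first
    statement that the classifier labels as out of scope.
    """
    end = 0
    for i in range(len(statements)):
        if not in_scope(pair_features(comment, statements, i)):
            break
        end = i + 1
    return statements[:end]


def toy_classifier(features: Dict[str, int]) -> bool:
    """Hand-written stand-in for a trained binary classifier."""
    return features["distance"] == 0 or features["shared_tokens"] > 0


comment = "// read the config file"
statements = [
    "File configFile = new File(path);",
    "String config = read(configFile);",
    "int retries = 3;",
]
scope = detect_scope(comment, statements, toy_classifier)
```

In this sketch the first two statements share tokens with the comment and fall inside its scope, while the third does not. A learned model trained on labeled comment-statement pairs would take the place of `toy_classifier`.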

Moreover, we applied our method to two specific software engineering tasks: analyzing software repositories for outdated comment detection and mining software repositories for comment generation. As a general approach, our method provided a solution to comment-code mapping. It improved the performance of the baseline methods in both tasks, demonstrating that it benefits automatic analysis and mining approaches on software repositories.

Keywords

Comment scope detection
Machine learning
Software repositories

Huanchao Chen is a postgraduate student at Sun Yat-sen University. His research interests include code analysis and comprehension, and mining software repositories.

Yuan Huang received the Ph.D. degree in computer science from Sun Yat-sen University in 2017. He is an associate research fellow in the School of Data and Computer Science, Sun Yat-sen University. He is particularly interested in software evolution and maintenance, and code analysis.

Zhiyong Liu is a postgraduate student at Sun Yat-sen University. His research interests include software engineering, and code analysis and comprehension.

Xiangping Chen is an associate professor in the School of Communication and Design, Sun Yat-sen University. She received her Ph.D. degree from Peking University in 2010. Her research interests include software engineering and mining software repositories.

Fan Zou is a professor in the School of Data and Computer Science, Sun Yat-sen University. His research interests include smart home, software engineering, and image processing.

Xiaonan Luo is a professor in the School of Computer Science and Information Security, Guilin University of Electronic Technology. His research interests include image processing, computer graphics and CAD, and mobile computing.