ABSTRACT
Source control systems permit developers to attach a free-form message to every committed change. The content of these change messages can support software maintenance activities. We present an automated approach to classify a change message as a bug fix, a feature introduction, or a general maintenance change. Researchers can use our classification to study the evolution of a project. For example, researchers can monitor the rate of bug fixes in a project without having access to bug reporting databases like Bugzilla.
A case study using change messages from several open source projects shows that our approach produces results similar to manual classifications performed by professional developers. These findings are similar to those reported by Mockus and Votta for commercial projects.
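As a rough illustration of the idea only, the sketch below labels a change message by keyword matching and then computes the share of bug fixes in a list of messages. The abstract does not describe the actual classification rules, so the keyword lists, the precedence of bug fix over feature, and the function names `classify_change_message` and `bug_fix_rate` are all assumptions made for this example, not the authors' method.

```python
import re

# Hypothetical keyword lists: the abstract does not state the real rules,
# so these are illustrative assumptions only.
BUG_FIX_WORDS = {"fix", "fixes", "fixed", "bug", "crash", "error"}
FEATURE_WORDS = {"add", "adds", "added", "new", "feature", "implement", "support"}


def classify_change_message(message: str) -> str:
    """Label a change message as one of the three categories named in the abstract."""
    # Lowercase the message and split it into alphabetic words.
    words = set(re.findall(r"[a-z]+", message.lower()))
    # Assumed precedence: a message that mentions a fix counts as a bug fix
    # even if it also contains feature keywords.
    if words & BUG_FIX_WORDS:
        return "bug fix"
    if words & FEATURE_WORDS:
        return "feature introduction"
    return "general maintenance"


def bug_fix_rate(messages: list[str]) -> float:
    """Return the fraction of messages classified as bug fixes (0.0 if empty)."""
    if not messages:
        return 0.0
    labels = [classify_change_message(m) for m in messages]
    return labels.count("bug fix") / len(labels)
```

Applied to the change messages of one release or time window, `bug_fix_rate` gives the kind of bug-fix rate the abstract mentions, computed from the commit log alone rather than from a bug reporting database.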
- S. AA. A test for Homogeneity of the Marginal Distributions in a Two-way Classification. Biometrika, 42:412--416, 1955.
- M. AE. Comparing the Classification of Subjects by Two Independent Judges. British Journal of Psychiatry, 116:651--655, 1970.
- E. BS. The Analysis of Contingency Tables. Chapman and Hall, London, 1977.
- A. Chen, E. Chou, J. Wong, A. Y. Yao, Q. Zhang, S. Zhang, and A. Michail. CVSSearch: Searching through source code using CVS comments. In Proceedings of the 17th International Conference on Software Maintenance, pages 364--374, Florence, Italy, 2001.
- K. Chen, S. R. Schach, L. Yu, J. Offutt, and G. Z. Heller. Open-Source Change Logs. Empirical Software Engineering, 9:197--210, 2004.
- Rational ClearCase - Product Overview. Available online at http://www-306.ibm.com/software/awdtools/clearcase/.
- J. Cohen. A Coefficient of Agreement for Nominal Scales. Educational and Psychological Measurement, pages 37--46, Dec. 1960.
- CVS -- Concurrent Versions System. Available online at http://www.cvshome.org.
- E. B. Swanson. The Dimensions of Maintenance. In Proceedings of the 2nd International Conference on Software Engineering, pages 492--497, San Francisco, California, Oct. 1976.
- K. E. Emam. Benchmarking Kappa: Interrater Agreement in Software Process Assessments. Empirical Software Engineering, 4(2):113--133, Dec. 1999.
- A. E. Hassan and R. C. Holt. Studying The Chaos of Code Development. In Proceedings of the 10th Working Conference on Reverse Engineering, Victoria, British Columbia, Canada, Nov. 2003.
- Tests of Marginal Homogeneity. Available online at http://ourworld.compuserve.com/homepages/jsuebersax/margin.htm.
- A. Mockus and L. G. Votta. Identifying reasons for software change using historic databases. In Proceedings of the 16th International Conference on Software Maintenance, pages 120--130, San Jose, California, Oct. 2000.
- Perforce - The Fastest Software Configuration Management System. Available online at http://www.perforce.com.
- W. F. Tichy. RCS - a system for version control. Software - Practice and Experience, 15(7):637--654, 1985.
Index Terms
- Automated classification of change messages in open source projects