DOI: 10.1145/1363686.1363876
research-article

Automated classification of change messages in open source projects

Published: 16 March 2008

ABSTRACT

Source control systems permit developers to attach a free-form message to every committed change. The content of these change messages can support software maintenance activities. We present an automated approach to classify a change message as either a bug fix, a feature introduction, or a general maintenance change. Researchers can study the evolution of a project using our classification. For example, researchers can monitor the rate of bug fixes in a project without having access to bug reporting databases like Bugzilla.
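The abstract does not describe the classification technique itself, so the following is only an illustrative sketch of the idea, not the authors' method. It assumes a simple keyword lookup; the keyword sets and the function name `classify_change_message` are hypothetical.

```python
import re

# Hypothetical keyword sets, chosen for illustration only.
# The abstract does not state which words or rules the approach uses.
BUG_FIX_WORDS = {"fix", "fixed", "fixes", "bug", "crash", "error", "fault"}
FEATURE_WORDS = {"add", "added", "adds", "new", "feature", "implement", "support"}


def classify_change_message(message: str) -> str:
    """Assign a change message to one of the three categories named in the abstract."""
    # Lowercase the message and split it into alphabetic words.
    words = set(re.findall(r"[a-z]+", message.lower()))
    # Bug-fix keywords take precedence when both kinds of keyword appear.
    if words & BUG_FIX_WORDS:
        return "bug fix"
    if words & FEATURE_WORDS:
        return "feature introduction"
    # Anything without a matching keyword falls into the catch-all category.
    return "general maintenance"


def bug_fix_rate(messages: list[str]) -> float:
    """Fraction of messages classified as bug fixes (0.0 for an empty list)."""
    if not messages:
        return 0.0
    fixes = sum(classify_change_message(m) == "bug fix" for m in messages)
    return fixes / len(messages)
```

Under these assumptions, `bug_fix_rate` gives the kind of bug-fix rate the abstract mentions, computed from change messages alone and without a bug database.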

A case study using change messages from several open source projects shows that our approach produces results similar to manual classifications performed by professional developers. These findings are similar to the ones reported by Mockus and Votta for commercial projects.
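The abstract does not say how similarity between automated and manual classifications was measured. One common agreement statistic for nominal labels is Cohen's kappa (reference 7), so the sketch below shows that computation under the assumption that each rater assigns one label per change message. The function name `cohens_kappa` is illustrative.

```python
from collections import Counter


def cohens_kappa(labels_a: list[str], labels_b: list[str]) -> float:
    """Cohen's kappa for two raters labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: computed from each rater's marginal label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label throughout; treat as full agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)
```

A value of 1 means perfect agreement, 0 means agreement no better than chance, and negative values mean agreement worse than chance.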

References

  1. A. A. Stuart. A test for Homogeneity of the Marginal Distributions in a Two-way Classification. Biometrika, 42:412--416, 1955.
  2. A. E. Maxwell. Comparing the Classification of Subjects by Two Independent Judges. British Journal of Psychiatry, 116:651--655, 1970.
  3. B. S. Everitt. The Analysis of Contingency Tables. Chapman and Hall, London, 1977.
  4. A. Chen, E. Chou, J. Wong, A. Y. Yao, Q. Zhang, S. Zhang, and A. Michail. CVSSearch: Searching through source code using CVS comments. In Proceedings of the 17th International Conference on Software Maintenance, pages 364--374, Florence, Italy, 2001.
  5. K. Chen, S. R. Schach, L. Yu, J. Offutt, and G. Z. Heller. Open-Source Change Logs. Empirical Software Engineering, 9:197--210, 2004.
  6. Rational ClearCase - Product Overview. Available online at http://www-306.ibm.com/software/awdtools/clearcase/.
  7. J. Cohen. A Coefficient of Agreement for Nominal Scales. Educational and Psychological Measurement, pages 37--46, Dec. 1960.
  8. CVS -- Concurrent Versions System. Available online at http://www.cvshome.org.
  9. E. B. Swanson. The Dimensions of Maintenance. In Proceedings of the 2nd International Conference on Software Engineering, pages 492--497, San Francisco, California, Oct. 1976.
  10. K. E. Emam. Benchmarking Kappa: Interrater Agreement in Software Process Assessments. Empirical Software Engineering, 4(2):113--133, Dec. 1999.
  11. A. E. Hassan and R. C. Holt. Studying The Chaos of Code Development. In Proceedings of the 10th Working Conference on Reverse Engineering, Victoria, British Columbia, Canada, Nov. 2003.
  12. Tests of Marginal Homogeneity. Available online at http://ourworld.compuserve.com/homepages/jsuebersax/margin.htm.
  13. A. Mockus and L. G. Votta. Identifying reasons for software change using historic databases. In Proceedings of the 16th International Conference on Software Maintenance, pages 120--130, San Jose, California, Oct. 2000.
  14. Perforce - The Fastest Software Configuration Management System. Available online at http://www.perforce.com.
  15. W. F. Tichy. RCS - a system for version control. Software - Practice and Experience, 15(7):637--654, 1985.

Published in

SAC '08: Proceedings of the 2008 ACM Symposium on Applied Computing
March 2008, 2586 pages
ISBN: 9781595937537
DOI: 10.1145/1363686

Copyright © 2008 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Overall acceptance rate: 1,650 of 6,669 submissions, 25%
