An investigation of the effect of module size on defect prediction using static measures

Published: 15 May 2005

Abstract

We used several machine learning algorithms to predict the defective modules in five NASA products: CM1, JM1, KC1, KC2, and PC1. A set of static measures was employed as predictor variables. While doing so, we observed that a large portion of the modules were small, as measured by lines of code (LOC). When we experimented on data subsets created by partitioning according to module size, we obtained higher prediction performance for the subsets that include larger modules. We also performed defect prediction for KC1 using class-level data rather than method-level data, and the class-level data yielded better prediction performance. These findings suggest that quality assurance activities can be guided more effectively if defect prediction is performed using data that belong to larger modules.
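The size-based partitioning described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: the `Module` fields, the LOC threshold of 50, the `flag_complex` stand-in predictor, and the toy records are all assumptions made up to exercise the code. They are not taken from the NASA data sets and say nothing about the paper's results. The abstract does not name the performance measure used, so the sketch reports probability of detection and probability of false alarm as one plausible choice.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Module:
    loc: int          # module size in lines of code
    complexity: int   # hypothetical static measure, e.g. cyclomatic complexity
    defective: bool   # known defect label


def partition_by_size(
    modules: List[Module], loc_threshold: int
) -> Tuple[List[Module], List[Module]]:
    """Split modules into (small, large) subsets around a LOC threshold."""
    small = [m for m in modules if m.loc < loc_threshold]
    large = [m for m in modules if m.loc >= loc_threshold]
    return small, large


def detection_rates(
    modules: List[Module], predict: Callable[[Module], bool]
) -> Tuple[float, float]:
    """Return (probability of detection, probability of false alarm)."""
    tp = sum(1 for m in modules if m.defective and predict(m))
    fn = sum(1 for m in modules if m.defective and not predict(m))
    fp = sum(1 for m in modules if not m.defective and predict(m))
    tn = sum(1 for m in modules if not m.defective and not predict(m))
    pd = tp / (tp + fn) if tp + fn else float("nan")
    pf = fp / (fp + tn) if fp + tn else float("nan")
    return pd, pf


def flag_complex(module: Module) -> bool:
    """Stand-in predictor (assumption): flag modules with complexity >= 5."""
    return module.complexity >= 5


# Invented toy records, used only to run the functions above.
TOY_MODULES = [
    Module(loc=8, complexity=1, defective=False),
    Module(loc=12, complexity=2, defective=True),
    Module(loc=15, complexity=6, defective=False),
    Module(loc=20, complexity=3, defective=False),
    Module(loc=90, complexity=9, defective=True),
    Module(loc=120, complexity=12, defective=True),
    Module(loc=150, complexity=4, defective=False),
    Module(loc=200, complexity=7, defective=False),
]

small, large = partition_by_size(TOY_MODULES, loc_threshold=50)
for name, subset in (("small", small), ("large", large)):
    pd, pf = detection_rates(subset, flag_complex)
    print(f"{name}: n={len(subset)} pd={pd:.2f} pf={pf:.2f}")
```

In a real experiment the stand-in predictor would be replaced by a trained classifier evaluated separately on each subset, and the threshold would be chosen from the observed LOC distribution.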


      • Published in

        ACM SIGSOFT Software Engineering Notes, Volume 30, Issue 4
        July 2005
        1514 pages
        ISSN: 0163-5948
        DOI: 10.1145/1082983

        PROMISE '05: Proceedings of the 2005 workshop on Predictor models in software engineering
        May 2005
        46 pages
        ISBN: 1595931252
        DOI: 10.1145/1083165

        Copyright © 2005 Copyright is held by the owner/author(s)

        Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

        Publisher

        Association for Computing Machinery

        New York, NY, United States
