An investigation of the effect of module size on defect prediction using static measures

Authors:
A. Günes Koru, University of Maryland, Baltimore County - UMBC, Baltimore, MD
Hongfang Liu, University of Maryland, Baltimore County - UMBC, Baltimore, MD

ACM SIGSOFT Software Engineering Notes, Volume 30, Issue 4, July 2005, pp. 1–5. https://doi.org/10.1145/1082983.1083172

Published: 15 May 2005

Abstract

We used several machine learning algorithms to predict the defective modules in five NASA products, namely, CM1, JM1, KC1, KC2, and PC1. A set of static measures was employed as predictor variables. While doing so, we observed that a large portion of the modules were small, as measured by lines of code (LOC). When we experimented on the data subsets created by partitioning according to module size, we obtained higher prediction performance for the subsets that include larger modules. We also performed defect prediction for KC1 using class-level data rather than method-level data; the class-level data yielded better prediction performance. These findings suggest that quality assurance activities can be guided more effectively if defect prediction is performed on data that belong to larger modules.
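
The partition-and-evaluate idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: the record fields (`loc`, `defective`, `predicted`), the size boundaries, and the helper names are all assumptions introduced here, and the toy records are made up to show the mechanics only. They are not NASA data and do not reproduce any reported result.

```python
from collections import defaultdict


def size_bucket(loc, boundaries):
    """Return the index of the first bucket whose upper bound covers loc."""
    for i, upper in enumerate(boundaries):
        if loc <= upper:
            return i
    return len(boundaries)  # larger than every boundary


def partition_by_size(modules, boundaries):
    """Group module records into subsets by lines of code (LOC)."""
    subsets = defaultdict(list)
    for m in modules:
        subsets[size_bucket(m["loc"], boundaries)].append(m)
    return dict(subsets)


def precision_recall(modules):
    """Precision and recall for the defective class within one subset."""
    tp = sum(1 for m in modules if m["predicted"] and m["defective"])
    fp = sum(1 for m in modules if m["predicted"] and not m["defective"])
    fn = sum(1 for m in modules if not m["predicted"] and m["defective"])
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical toy records; "predicted" stands in for any classifier's output.
modules = [
    {"loc": 5, "defective": False, "predicted": False},
    {"loc": 8, "defective": True, "predicted": False},
    {"loc": 40, "defective": True, "predicted": True},
    {"loc": 60, "defective": False, "predicted": True},
    {"loc": 300, "defective": True, "predicted": True},
    {"loc": 450, "defective": True, "predicted": True},
]

# Assumed boundaries: bucket 0 is LOC <= 10, bucket 1 is 11..100, bucket 2 is > 100.
subsets = partition_by_size(modules, boundaries=[10, 100])
results = {b: precision_recall(ms) for b, ms in sorted(subsets.items())}

for bucket, (precision, recall) in results.items():
    print(f"bucket {bucket}: precision={precision:.2f} recall={recall:.2f}")
```

Reporting the metrics per size subset, rather than once over the pooled data, is what makes a size effect visible; in a real study the predictions would come from models trained and validated on each subset.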

Index Terms

1. General and reference
  1. Cross-computing tools and techniques
    1. Metrics
2. Social and professional topics
  1. Professional topics
    1. Management of computing and information systems
      1. System management
        Quality assurance

Published in
ACM SIGSOFT Software Engineering Notes Volume 30, Issue 4
July 2005
1514 pages
ISSN:0163-5948
DOI:10.1145/1082983
PROMISE '05: Proceedings of the 2005 workshop on Predictor models in software engineering
May 2005
46 pages
ISBN:1595931252
DOI:10.1145/1083165
Copyright © 2005. Copyright is held by the owner/author(s).
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
defect prediction
prediction models
software metrics
software quality management
static measures
Qualifiers
- article

