ABSTRACT
Code churn, the amount of code change taking place within a software unit over time, has been correlated with fault-proneness in software systems. We investigate the use of code churn and static metrics collected at regular time intervals during the development cycle to predict faults in an iterative, in-process manner. We collected 159 churn and structure metrics from six four-month snapshots of a 1-million-LOC Microsoft product. The number of software faults fixed during each period was recorded per binary module. Using stepwise logistic regression, we created a prediction model to identify fault-prone binaries using three parameters: code churn (the number of new and changed blocks), and class Fan In and class Fan Out (both normalized by lines of code). The iteratively built model is 80.0% accurate at classifying binaries as fault-prone or non-fault-prone. These fault-prediction models have the advantage of allowing engineers to observe how their fault-prediction profile evolves over time.
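As a rough illustration of the kind of model the abstract describes, the following is a minimal sketch of a logistic regression classifier over the three named parameters. It is not the authors' implementation: it fits a plain logistic regression by gradient descent rather than performing stepwise selection, it does not rebuild the model across snapshots, and the binary names, metric values, and fault labels are invented purely for illustration. The min-max scaling step and all function names are likewise assumptions introduced here.

```python
import math

# Hypothetical per-binary measurements (all values invented for illustration):
# (name, churned_blocks, class_fan_in, class_fan_out, lines_of_code, fault_prone)
binaries = [
    ("a.dll", 900, 300, 450, 10000, 1),
    ("b.dll", 750, 260, 380, 8000, 1),
    ("c.dll", 1200, 500, 640, 15000, 1),
    ("d.dll", 60, 40, 50, 9000, 0),
    ("e.dll", 30, 30, 45, 12000, 0),
    ("f.dll", 100, 45, 70, 11000, 0),
]

def normalize_by_loc(value, loc):
    """Return a metric per line of code (0.0 when the binary has no code)."""
    return value / loc if loc else 0.0

def min_max_scale(rows):
    """Rescale each feature column to [0, 1]; an assumption made here so that
    gradient descent behaves well, not a step taken from the paper."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    return [[(v - lo) / s for v, lo, s in zip(row, lows, spans)] for row in rows]

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def probability(w, x):
    """Predicted probability of fault-proneness; w[0] is the intercept."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights by batch gradient descent on the mean log-loss."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = probability(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def predict(w, x, threshold=0.5):
    """Classify a binary as fault-prone (1) or not (0)."""
    return 1 if probability(w, x) >= threshold else 0

def accuracy(w, X, y):
    """Fraction of binaries classified correctly."""
    return sum(predict(w, xi) == yi for xi, yi in zip(X, y)) / len(y)

# Features: churn (raw block count), Fan In / LOC, Fan Out / LOC.
raw = [
    [churn, normalize_by_loc(fan_in, loc), normalize_by_loc(fan_out, loc)]
    for _, churn, fan_in, fan_out, loc, _ in binaries
]
X = min_max_scale(raw)
y = [row[-1] for row in binaries]

weights = fit_logistic(X, y)
print("training accuracy:", accuracy(weights, X, y))
```

The accuracy printed here is measured on the same toy data used for fitting, so it says nothing about the 80.0% figure reported above, which comes from the authors' own data and evaluation.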