A defect prediction method for software versioning

Abstract

New methodologies and tools have gradually made the software development life cycle less dependent on human effort. Much of the research in this field focuses on defect reduction, defect identification, and defect prediction. Defect prediction is a relatively new research area that draws on methods ranging from artificial intelligence to data mining. Identifying and locating defects in software projects is a difficult task. Measuring software in a continuous and disciplined manner brings many advantages, such as accurate estimation of project costs and schedules and improved product and process quality. This study proposes a model for predicting the number of defects in a new version of a software product relative to the previous stable version. The new version may contain changes related to a new feature, a modification in the algorithm, or bug fixes. The proposed model predicts the defects introduced in the new version by analyzing the types of changes in an objective and formal manner and by taking the change in lines of code (LOC) into account. Defect predictors are helpful tools for both project managers and developers: accurate predictors may help reduce testing time and guide developers towards writing higher-quality code. The proposed model can aid software engineers in assessing the stability of software before it goes into production. Furthermore, such a model may provide useful insight into the effect of a feature, bug fix, or other change on the defect detection process.
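To make the approach concrete, the sketch below shows one plausible way to set up such a predictor: a least-squares regression that maps LOC change and counts of each change type (new feature, algorithm modification, bug fix) to a defect count. The feature set, training data, and linear model are illustrative assumptions for exposition, not the paper's exact formulation.

```python
# Minimal sketch of version-to-version defect prediction, assuming a
# least-squares regression over LOC change and change-type counts.
# All names and data below are hypothetical, not the paper's method.
from dataclasses import dataclass

import numpy as np


@dataclass
class VersionDelta:
    loc_added: int     # lines of code added since the previous stable version
    loc_deleted: int   # lines of code removed
    new_features: int  # number of new-feature changes
    algo_changes: int  # number of algorithm modifications
    bug_fixes: int     # number of bug fixes
    defects: int = 0   # defects later observed in this version (training label)


def features(d: VersionDelta) -> list:
    return [d.loc_added, d.loc_deleted, d.new_features, d.algo_changes, d.bug_fixes]


# Hypothetical history of past version transitions with known defect counts.
history = [
    VersionDelta(1200, 300, 2, 1, 5, defects=14),
    VersionDelta(400, 100, 0, 0, 8, defects=4),
    VersionDelta(2500, 900, 4, 2, 3, defects=27),
    VersionDelta(150, 50, 0, 1, 2, defects=2),
    VersionDelta(900, 250, 1, 0, 6, defects=9),
    VersionDelta(3100, 700, 5, 3, 1, defects=33),
    VersionDelta(600, 150, 1, 1, 4, defects=6),
]

# Fit regression coefficients (with an intercept column) by ordinary least squares.
X = np.array([[1.0] + features(d) for d in history])
y = np.array([d.defects for d in history], dtype=float)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)


def predict_defects(d: VersionDelta) -> float:
    """Predicted defect count for a not-yet-released version."""
    return float(coef @ np.array([1.0] + features(d)))


candidate = VersionDelta(800, 200, 1, 0, 4)  # new version, defects unknown
print(f"predicted defects: {predict_defects(candidate):.1f}")
```

A real deployment would replace the hypothetical history with the project's own version records, and the linear fit could be swapped for any regression learner; the point is only that change types and LOC change together form the feature vector.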

Acknowledgements

This work is supported in part by the Boğaziçi University research fund under grant number BAP–06HA104. Special thanks to our colleague Burak Turhan for his valuable comments on the manuscript. We would also like to thank Ms. Cigdem Aksoy Fromm, who did the final editing of the manuscript.

Author information

Corresponding author

Correspondence to Yomi Kastro.

About this article

Cite this article

Kastro, Y., Bener, A.B. A defect prediction method for software versioning. Software Qual J 16, 543–562 (2008). https://doi.org/10.1007/s11219-008-9053-8
