Abstract
New methodologies and tools have gradually made the software development life cycle less dependent on humans. Much of the research in this field focuses on defect reduction, defect identification, and defect prediction. Defect prediction is a relatively new research area that employs methods ranging from artificial intelligence to data mining. Identifying and locating defects in software projects is a difficult task. Measuring software in a continuous and disciplined manner offers many advantages, such as accurate estimation of project costs and schedules and improved product and process quality. This study proposes a model to predict the number of defects in a new version of a software product relative to the previous stable version. The new version may contain changes related to a new feature, a modification to an algorithm, or bug fixes. Our model predicts the defects introduced into the new version by analyzing the types of changes in an objective and formal manner, as well as the change in lines of code (LOC). Defect predictors are helpful tools for both project managers and developers. Accurate predictors may help reduce test times and guide developers towards writing higher-quality code. Our model can aid software engineers in assessing the stability of software before it goes into production. Furthermore, such a model may provide useful insight into the effects of a feature, bug fix, or other change on defect detection.
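The idea of predicting defects from change size and change type can be illustrated with a minimal sketch. This is not the authors' actual model; the training data, change-type weights, and feature choices below are illustrative assumptions. It fits a least-squares slope relating historical LOC churn to post-release defect counts, then scales the estimate by an assumed multiplier per change type.

```python
# Minimal sketch of a churn-based defect predictor (illustrative only;
# the data and weights are assumptions, not the authors' model).

def fit_slope(loc, defects):
    """Least-squares line through the origin: defects ~ slope * loc."""
    num = sum(x * y for x, y in zip(loc, defects))
    den = sum(x * x for x in loc)
    return num / den

# Hypothetical history: (LOC changed in a version, defects found later).
history = [(120, 3), (450, 11), (80, 2), (900, 25), (300, 7)]
slope = fit_slope([h[0] for h in history], [h[1] for h in history])

# Assumed change-type multipliers: algorithm changes are taken to inject
# more defects per changed line than pure bug fixes.
TYPE_WEIGHT = {"feature": 1.0, "algorithm": 1.3, "bugfix": 0.6}

def predict_defects(loc_changed, change_type):
    """Estimate defects introduced by a change of the given size and type."""
    return slope * loc_changed * TYPE_WEIGHT[change_type]

print(round(predict_defects(500, "feature"), 1))
```

A real model in this spirit would learn both the churn coefficient and the change-type effects from version history rather than fixing the multipliers by hand.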
Acknowledgements
This work is supported in part by the Boğaziçi University research fund under grant number BAP–06HA104. Special thanks to our colleague Burak Turhan for his valuable comments on the manuscript. We also thank Ms. Cigdem Aksoy Fromm, who did the final editing of the manuscript.
Cite this article
Kastro, Y., Bener, A.B. A defect prediction method for software versioning. Software Qual J 16, 543–562 (2008). https://doi.org/10.1007/s11219-008-9053-8