An industrial case study of classifier ensembles for locating software defects

Abstract

As the application layer comes to dominate the hardware in embedded systems, ensuring software quality becomes a real challenge. Software testing is the most time-consuming and costly project phase, particularly in the embedded software domain. Misclassifying defect-free code as defective increases project costs and hence lowers margins. In this research, we present a defect prediction model based on an ensemble of classifiers. We have collaborated with an industrial partner from the embedded systems domain and applied our generic defect prediction models to data from its embedded projects. Embedded systems resemble mission-critical software in that the goal is to catch as many defects as possible, so a predictor is expected to achieve a very high probability of detection (pd). On the other hand, most embedded systems are commercial products, and companies want to lower their costs to remain competitive by keeping their false alarm rates (pf) as low as possible and improving their precision. In our experiments, we used data collected from our industry partner as well as publicly available data. Our results reveal that an ensemble of classifiers significantly decreases pf to 15% while increasing precision by 43%, keeping the balance rate at 74%. The cost-benefit analysis of the proposed model shows that inspecting 23% of the code in the local datasets is enough to detect around 70% of the defects.
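To make the abstract's evaluation criteria concrete, the sketch below is a minimal, hypothetical illustration of the general approach: a majority-voting ensemble of classifiers whose predictions are scored with pd (probability of detection), pf (probability of false alarm), precision, and balance, which the defect prediction literature commonly defines as the normalized Euclidean distance from the ideal ROC point (pd = 1, pf = 0). The base learners (Naive Bayes, a decision tree, k-nearest neighbors), the scikit-learn API, and the synthetic dataset are illustrative assumptions, not the configuration used in the study.

```python
# Hypothetical sketch: majority-voting ensemble scored with pd, pf,
# precision, and balance. Base learners and data are stand-ins, not
# the paper's actual setup.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier


def detection_metrics(y_true, y_pred):
    """Compute pd, pf, precision, and balance for binary predictions.

    pd = TP / (TP + FN); pf = FP / (FP + TN)
    balance = 1 - sqrt(pf^2 + (1 - pd)^2) / sqrt(2), i.e. one minus the
    normalized distance from the ideal ROC point (pd = 1, pf = 0).
    """
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    pd = tp / (tp + fn)
    pf = fp / (fp + tn)
    precision = tp / (tp + fp)
    balance = 1 - np.sqrt(pf ** 2 + (1 - pd) ** 2) / np.sqrt(2)
    return pd, pf, precision, balance


# Stand-in for modules described by static code metrics and labeled
# defective (1) / defect-free (0); defect data is typically imbalanced.
X, y = make_classification(n_samples=1000, n_features=20, weights=[0.8],
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Majority vote over three dissimilar base learners (illustrative choices).
ensemble = VotingClassifier(
    estimators=[("nb", GaussianNB()),
                ("tree", DecisionTreeClassifier(random_state=0)),
                ("knn", KNeighborsClassifier())],
    voting="hard",
)
ensemble.fit(X_train, y_train)

pd_, pf_, prec, bal = detection_metrics(y_test, ensemble.predict(X_test))
print(f"pd={pd_:.2f} pf={pf_:.2f} precision={prec:.2f} balance={bal:.2f}")
```

Under this scoring, lowering pf and raising pd jointly raise balance, which is why the abstract reports the three together: an ensemble can trade a small loss in pd for a large drop in pf, improving precision on imbalanced defect data.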



Acknowledgments

This research is supported in part by the Turkish State Planning Organization (DPT) under project number 2007K120610.

Author information

Correspondence to Ayşe Tosun Mısırlı.


Cite this article

Mısırlı, A.T., Bener, A.B. & Turhan, B. An industrial case study of classifier ensembles for locating software defects. Software Qual J 19, 515–536 (2011). https://doi.org/10.1007/s11219-010-9128-1
