Assessment of a New Three-Group Software Quality Classification Technique: An Empirical Case Study

Khoshgoftaar, Taghi M.; Seliya, Naeem; Gao, Kehan

doi:10.1007/s10664-004-6191-x

Original Article
Published: April 2005

Volume 10, pages 183–218, (2005)
Empirical Software Engineering

Taghi M. Khoshgoftaar, Naeem Seliya & Kehan Gao

Abstract

The primary aim of risk-based software quality classification models is to detect, prior to testing or operations, the components that are most likely to be high-risk. Their practical value as quality assurance tools is gauged by their prediction accuracy and cost-effectiveness. Classifying modules into two risk groups is the more common practice. Such models assume that all modules predicted as high-risk will be subjected to quality improvements. Because reliability improvement resources are always limited and the quality risk factor varies, a more focused classification model may be desired to achieve cost-effective software quality assurance goals. In such cases, calibrating a three-group (high-risk, medium-risk, and low-risk) classification model is more rewarding. We present an innovative method that circumvents the complexities, computational overhead, and difficulties involved in calibrating pure or direct three-group classification models. With the proposed method, practitioners can apply an existing two-group classification algorithm three times to yield the three risk-based classes. An empirical approach is taken to investigate the effectiveness and validity of the proposed technique. Some commonly used classification techniques are studied to demonstrate the proposed methodology: the C4.5 decision tree algorithm, discriminant analysis, and case-based reasoning. For the first two, we compare the three-group model calibrated directly using the respective technique with the one built by applying the proposed method. Any two-group classification technique can be employed by the proposed method, including those that do not provide a direct three-group classification model, e.g., logistic regression and certain binary classification trees, such as CART. Based on a case study of a large-scale industrial software system, it is observed that the proposed method yielded promising results. For a given classification technique, the expected cost of misclassification of the proposed three-group models was generally significantly better than that of the technique’s direct three-group model. In addition, the proposed method is also evaluated against an alternate indirect three-group classification method.
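The abstract does not say how the three two-group models are combined, so the following Python sketch is only one plausible arrangement, not the authors' procedure. It assumes one two-group model per pair of risk classes, a majority vote across the three models, and a fallback to medium-risk when the vote is split. The nearest-centroid learner, the feature values, and the cost table are all hypothetical stand-ins. Any two-group technique could be passed in as `fit`. The cost function is a plain per-module average, which may differ from the paper's definition of expected cost of misclassification.

```python
from collections import Counter
from itertools import combinations

CLASSES = ("low", "medium", "high")  # risk groups named in the abstract


def centroid(rows):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit_two_group(rows, labels, group_a, group_b):
    """Toy two-group learner (an assumption): nearest class centroid.

    Only the modules labelled group_a or group_b are used for training.
    """
    ca = centroid([r for r, l in zip(rows, labels) if l == group_a])
    cb = centroid([r for r, l in zip(rows, labels) if l == group_b])
    return lambda x: group_a if sq_dist(x, ca) <= sq_dist(x, cb) else group_b


def fit_three_group(rows, labels, fit=fit_two_group):
    """Train the two-group learner three times, once per pair of classes."""
    return [fit(rows, labels, a, b) for a, b in combinations(CLASSES, 2)]


def predict_three_group(models, x, tie_break="medium"):
    """Majority vote over the three models; a split vote uses tie_break."""
    votes = Counter(model(x) for model in models)
    winner, count = votes.most_common(1)[0]
    return winner if count >= 2 else tie_break


def expected_cost(actual, predicted, cost):
    """Average misclassification cost per module.

    cost maps (actual, predicted) pairs to a cost; missing pairs cost 0.
    """
    total = sum(cost.get((a, p), 0.0) for a, p in zip(actual, predicted))
    return total / len(actual)


# Hypothetical training modules described by two made-up metrics.
rows = [(1.0, 0.5), (1.5, 1.0), (5.0, 4.0), (5.5, 4.5), (10.0, 9.0), (11.0, 9.5)]
labels = ["low", "low", "medium", "medium", "high", "high"]
models = fit_three_group(rows, labels)
```

Calling `predict_three_group(models, (5.0, 4.2))` on this toy data places the module in the medium-risk group. Pairwise voting is used here only because it needs exactly three two-group models; a one-vs-rest arrangement would be another reasonable reading of the abstract.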

Author information

Authors and Affiliations

Empirical Software Engineering Laboratory, Department of Computer Science and Engineering, Florida Atlantic University, Boca Raton, FL, 33431, USA
Taghi M. Khoshgoftaar, Naeem Seliya & Kehan Gao

Corresponding author

Correspondence to Taghi M. Khoshgoftaar.

About this article

Cite this article

Khoshgoftaar, T.M., Seliya, N. & Gao, K. Assessment of a New Three-Group Software Quality Classification Technique: An Empirical Case Study. Empir Software Eng 10, 183–218 (2005). https://doi.org/10.1007/s10664-004-6191-x

Issue Date: April 2005
DOI: https://doi.org/10.1007/s10664-004-6191-x
