Assessment of a New Three-Group Software Quality Classification Technique: An Empirical Case Study

  • Original Article
Empirical Software Engineering

Abstract

The primary aim of risk-based software quality classification models is to detect, prior to testing or operations, the components that are most likely to be high-risk. Their practical usefulness as quality assurance tools is gauged by the prediction accuracy and cost-effectiveness of the models. Classifying modules into two risk groups is the more common practice. Such models assume that all modules predicted as high-risk will be subjected to quality improvements. Because reliability improvement resources are always limited and the quality risk factor varies, a more focused classification model may be desired to achieve cost-effective software quality assurance goals. In such cases, calibrating a three-group (high-risk, medium-risk, and low-risk) classification model is more rewarding. We present an innovative method that circumvents the complexities, computational overhead, and difficulties involved in calibrating pure or direct three-group classification models. With the proposed method, practitioners can apply an existing two-group classification algorithm three times to yield the three risk-based classes. An empirical approach is taken to investigate the effectiveness and validity of the proposed technique. Some commonly used classification techniques are studied to demonstrate the proposed methodology: the C4.5 decision tree algorithm, discriminant analysis, and case-based reasoning. For the first two, we compare the three-group model calibrated directly using the respective technique with the one built by applying the proposed method. Any two-group classification technique can be employed by the proposed method, including those that do not provide a direct three-group classification model, e.g., logistic regression and certain binary classification trees, such as CART. Based on a case study of a large-scale industrial software system, it is observed that the proposed method yielded promising results. For a given classification technique, the expected cost of misclassification of the proposed three-group models was generally significantly better than that of the technique’s direct three-group model. In addition, the proposed method is also evaluated against an alternate indirect three-group classification method.
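
The abstract does not spell out how the three two-group models are combined, so the following is only an illustrative sketch under stated assumptions, not the authors' procedure. It uses a standard one-vs-rest decomposition (one binary model per risk group, with the highest-scoring group winning) and a toy nearest-neighbour scorer as a stand-in for any two-group technique. The names `fit_binary`, `fit_three_group`, `predict_three_group`, and `expected_cost`, as well as the cost matrix in the usage note, are hypothetical. The cost function implements one common empirical form of the expected cost of misclassification: the average, over all modules, of the cost assigned to each (actual, predicted) pair.

```python
from typing import Callable, Dict, List, Sequence

LOW, MEDIUM, HIGH = 0, 1, 2  # assumed integer coding of the three risk groups

Vector = Sequence[float]
Scorer = Callable[[Vector], float]


def _dist(a: Vector, b: Vector) -> float:
    """Euclidean distance between two metric vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def fit_binary(X: List[Vector], y: List[int]) -> Scorer:
    """Toy two-group learner (stand-in only); y holds 0/1 labels.

    The returned scorer is larger when a module lies closer to a
    positive training case than to any negative one.
    """
    pos = [x for x, t in zip(X, y) if t == 1]
    neg = [x for x, t in zip(X, y) if t == 0]

    def score(x: Vector) -> float:
        return min(_dist(x, n) for n in neg) - min(_dist(x, p) for p in pos)

    return score


def fit_three_group(X: List[Vector], y: List[int]) -> Dict[int, Scorer]:
    """Apply the two-group learner three times, once per risk group."""
    return {
        g: fit_binary(X, [1 if t == g else 0 for t in y])
        for g in (LOW, MEDIUM, HIGH)
    }


def predict_three_group(models: Dict[int, Scorer], x: Vector) -> int:
    """Assumed combination rule: pick the group whose model scores highest."""
    return max(models, key=lambda g: models[g](x))


def expected_cost(actual: List[int], predicted: List[int],
                  cost: List[List[float]]) -> float:
    """Average misclassification cost; cost[i][j] = cost of predicting j for actual i."""
    total = sum(cost[a][p] for a, p in zip(actual, predicted))
    return total / len(actual)
```

As a usage illustration with made-up data, `fit_three_group([(1,), (2,), (5,), (6,), (9,), (10,)], [0, 0, 1, 1, 2, 2])` yields three scorers that `predict_three_group` can apply to a new module's metric vector. A cost matrix such as `[[0, 1, 2], [3, 0, 1], [10, 5, 0]]` would penalize missing a high-risk module most heavily; these values are arbitrary and not taken from the case study.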



Author information

Corresponding author

Correspondence to Taghi M. Khoshgoftaar.

About this article

Cite this article

Khoshgoftaar, T.M., Seliya, N. & Gao, K. Assessment of a New Three-Group Software Quality Classification Technique: An Empirical Case Study. Empir Software Eng 10, 183–218 (2005). https://doi.org/10.1007/s10664-004-6191-x

  • DOI: https://doi.org/10.1007/s10664-004-6191-x