Abstract
The primary aim of risk-based software quality classification models is to detect, prior to testing or operations, the components that are most likely to be high-risk. Their practical usefulness as quality assurance tools is gauged by the models' prediction accuracy and cost-effectiveness. Classifying modules into two risk groups is the more common practice. Such models assume that all modules predicted as high-risk will be subjected to quality improvements. Because reliability improvement resources are always limited and the quality risk factor varies, a more focused classification model may be desired to achieve cost-effective software quality assurance goals. In such cases, calibrating a three-group (high-risk, medium-risk, and low-risk) classification model is more rewarding. We present an innovative method that circumvents the complexities, computational overhead, and difficulties involved in calibrating pure or direct three-group classification models. With the proposed method, practitioners can apply an existing two-group classification algorithm three times to yield the three risk-based classes. An empirical approach is taken to investigate the effectiveness and validity of the proposed technique. Some commonly used classification techniques are studied to demonstrate the proposed methodology: the C4.5 decision tree algorithm, discriminant analysis, and case-based reasoning. For the first two, we compare the three-group model calibrated directly with the respective technique against the one built by applying the proposed method. Any two-group classification technique can be employed by the proposed method, including those that do not provide a direct three-group classification model, e.g., logistic regression and certain binary classification trees, such as CART. In a case study of a large-scale industrial software system, the proposed method yielded promising results. For a given classification technique, the expected cost of misclassification of the proposed three-group model was generally significantly better than that of the technique's direct three-group model. In addition, the proposed method is evaluated against an alternative indirect three-group classification method.
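The abstract does not spell out how the three applications of the two-group algorithm are combined, so the following is only a sketch of one well-known arrangement, pairwise (one-vs-one) decomposition with voting, and not necessarily the authors' procedure. The learner `fit_two_group`, the single software metric, the tie-breaking rule, and the cost values are all hypothetical, chosen for illustration. The last function computes an empirical expected cost of misclassification as the average cost per module.

```python
from collections import Counter
from itertools import combinations
from statistics import mean

LOW, MEDIUM, HIGH = "low", "medium", "high"


def fit_two_group(xs, ys):
    """Hypothetical stand-in for any two-group learner.

    Uses one metric and places the cut at the midpoint of the two group means.
    """
    def group_mean(g):
        return mean(x for x, y in zip(xs, ys) if y == g)

    lower, upper = sorted(set(ys), key=group_mean)
    cut = (group_mean(lower) + group_mean(upper)) / 2
    return lambda x: lower if x < cut else upper


def fit_three_group(xs, ys):
    """Train one two-group model per pair of risk groups (three models)."""
    models = []
    for g1, g2 in combinations((LOW, MEDIUM, HIGH), 2):
        pair = [(x, y) for x, y in zip(xs, ys) if y in (g1, g2)]
        models.append(fit_two_group([x for x, _ in pair], [y for _, y in pair]))

    def predict(x):
        votes = Counter(model(x) for model in models)
        group, count = votes.most_common(1)[0]
        # Assumption: a three-way tie falls back to the middle group.
        return group if count > 1 else MEDIUM

    return predict


def expected_cost(actual, predicted, cost):
    """Average misclassification cost per module.

    `cost` maps (actual_group, predicted_group) to a cost; missing pairs cost 0.
    """
    total = sum(cost.get((a, p), 0) for a, p in zip(actual, predicted))
    return total / len(actual)


if __name__ == "__main__":
    # Illustrative data only: one metric value per module.
    xs = [1, 2, 3, 10, 11, 12, 30, 32, 34]
    ys = [LOW] * 3 + [MEDIUM] * 3 + [HIGH] * 3
    predict = fit_three_group(xs, ys)
    predicted = [predict(x) for x in xs]
    # Hypothetical costs: missing a high-risk module is the most expensive error.
    cost = {(HIGH, LOW): 10, (HIGH, MEDIUM): 5, (MEDIUM, LOW): 3,
            (LOW, MEDIUM): 1, (LOW, HIGH): 2, (MEDIUM, HIGH): 1}
    print(predicted, expected_cost(ys, predicted, cost))
```

Any two-group technique that exposes a fit-and-predict interface could replace `fit_two_group` here, which is the property the abstract emphasizes; the voting step is the part most likely to differ from the published method.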
Cite this article
Khoshgoftaar, T.M., Seliya, N. & Gao, K. Assessment of a New Three-Group Software Quality Classification Technique: An Empirical Case Study. Empir Software Eng 10, 183–218 (2005). https://doi.org/10.1007/s10664-004-6191-x
DOI: https://doi.org/10.1007/s10664-004-6191-x