Three empirical studies on predicting software maintainability using ensemble methods

Elish, Mahmoud O.; Aljamaan, Hamoud; Ahmad, Irfan

doi:10.1007/s00500-014-1576-2


Focus
Published: 08 January 2015

Volume 19, pages 2511–2524, (2015)

Soft Computing


Abstract

More accurate prediction of software maintenance effort contributes to better management and control of software maintenance. Several research studies have recently investigated the use of computational intelligence models for software maintainability prediction. The performance of these models, however, may vary from dataset to dataset. Consequently, ensemble methods have become increasingly popular, as they exploit the capabilities of their constituent computational intelligence models on a given dataset to achieve higher, or at least competitive, prediction accuracy compared to individual models. This paper investigates and empirically evaluates different homogeneous and heterogeneous ensemble methods in predicting software maintenance effort and change proneness. Three major empirical studies were designed and conducted, taking into consideration different design aspects such as the types of ensemble methods investigated, the types of prediction problems, the datasets used, and other aspects of the experimental setup. Overall, the empirical evidence obtained from the three studies confirms that some ensemble methods provide more accurate, or at least competitive, prediction accuracy compared to individual models across datasets, and thus they are more reliable.
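As a rough illustration of the idea in the abstract, the following minimal sketch builds a heterogeneous ensemble by averaging the predictions of three different simple regressors and compares mean absolute error (MAE) against each individual model. Everything here is an assumption made for illustration: the toy data, the single hypothetical size metric, the three learners (`fit_mean`, `fit_linear`, `fit_nearest`), and the plain-average combination rule are not taken from the paper, whose models and datasets are not named in the abstract.

```python
from statistics import mean

def fit_mean(xs, ys):
    """Baseline learner: always predict the training mean."""
    m = mean(ys)
    return lambda x: m

def fit_linear(xs, ys):
    """Ordinary least squares on a single feature."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)

def fit_nearest(xs, ys):
    """1-nearest-neighbour on a single feature."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def fit_average_ensemble(xs, ys, learners):
    """Heterogeneous ensemble: average the predictions of different learners."""
    models = [fit(xs, ys) for fit in learners]
    return lambda x: mean(m(x) for m in models)

def mae(model, xs, ys):
    """Mean absolute error of a fitted model on (xs, ys)."""
    return mean(abs(model(x) - y) for x, y in zip(xs, ys))

# Made-up toy data: hypothetical size metric -> maintenance effort.
train_x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
train_y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2]
test_x = [1.5, 3.5, 5.5]
test_y = [3.0, 7.0, 11.0]

learners = [fit_mean, fit_linear, fit_nearest]
scores = {fit.__name__: mae(fit(train_x, train_y), test_x, test_y)
          for fit in learners}
scores["average_ensemble"] = mae(
    fit_average_ensemble(train_x, train_y, learners), test_x, test_y)

for name, score in scores.items():
    print(f"{name}: MAE = {score:.3f}")
```

For a plain average, the triangle inequality guarantees that the ensemble's MAE cannot exceed the mean of its members' MAEs, so it is never worse than the worst member. It can still be worse than the best one, which is one reason the choice of constituent models and combination rule matters.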



Acknowledgments

The authors wish to acknowledge King Fahd University of Petroleum and Minerals (KFUPM) for the use of its facilities in carrying out this research.

Author information

Authors and Affiliations

Information and Computer Science Department, King Fahd University of Petroleum and Minerals, Dhahran, 31261, Saudi Arabia
Mahmoud O. Elish, Hamoud Aljamaan & Irfan Ahmad


Corresponding author

Correspondence to Mahmoud O. Elish.

Additional information

Communicated by I. R. Ruiz.


About this article

Cite this article

Elish, M.O., Aljamaan, H. & Ahmad, I. Three empirical studies on predicting software maintainability using ensemble methods. Soft Comput 19, 2511–2524 (2015). https://doi.org/10.1007/s00500-014-1576-2

Issue Date: September 2015
DOI: https://doi.org/10.1007/s00500-014-1576-2
