Abstract
Software defect prediction, which aims to build a defect prediction model from historical data, has attracted wide attention among software engineering researchers. Among the techniques used in this field, the extreme learning machine is popular because of its simple structure and fast learning speed. However, its prediction performance is strongly affected by the random selection of parameters and by weak generalization ability. To improve the prediction performance of the model, researchers use swarm intelligence optimization algorithms to find the optimal parameters for it. The sparrow search algorithm is a recent meta-heuristic algorithm that simulates the foraging and anti-predation behavior of a sparrow flock. However, the original sparrow search algorithm is easily trapped in local optima in the later iterations. To improve its global optimization ability, this paper proposes an adaptive variable sparrow search algorithm (AVSSA) based on adaptive hyper-parameters and a variable logarithmic spiral. Experiments with AVSSA on eight benchmark functions yield satisfactory results. In traditional software defect prediction algorithms, imbalanced data distribution is also one of the main factors that degrade model performance. Therefore, this paper proposes a new ensemble learning model for software defect prediction (AVSEB), in which an extreme learning machine optimized by AVSSA serves as the base predictor for Bagging ensemble learning. First, the model uses the unstable cut-points algorithm to preprocess the Bagging sample sets. Then, AVSSA is used to optimize the extreme learning machine that serves as the base predictor of the ensemble. Finally, a voting method outputs the software defect prediction results. In experiments on 15 open software defect datasets, the evaluation indices of the proposed algorithm are significantly better than those of four other advanced comparison algorithms. According to Friedman ranking and Holm’s post hoc test, the difference between the proposed algorithm and the other advanced prediction algorithms is statistically significant.
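The last two stages of the pipeline described in the abstract (extreme learning machines as Bagging base predictors, combined by voting) can be illustrated with a minimal sketch. It is not the authors' implementation: the hidden-layer weights are simply drawn at random where the paper would have AVSSA search for them, the unstable cut-points preprocessing is omitted, and the names `ELM` and `bagging_elm_predict`, the tanh activation, and the default sizes are assumptions made for illustration only.

```python
import numpy as np


class ELM:
    """Single-hidden-layer extreme learning machine for binary labels {0, 1}."""

    def __init__(self, n_hidden=20, rng=None):
        self.n_hidden = n_hidden
        self.rng = rng if rng is not None else np.random.default_rng()

    def _hidden(self, X):
        # Hidden-layer output matrix H (tanh activation is an assumption).
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        # Input weights and biases are drawn at random and never trained.
        # ASSUMPTION: the paper replaces this random draw with an AVSSA search.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        # Output weights: least-squares solution via the Moore-Penrose
        # pseudo-inverse, fitted to targets mapped from {0, 1} to {-1, +1}.
        targets = 2.0 * y - 1.0
        self.beta = np.linalg.pinv(H) @ targets
        return self

    def predict(self, X):
        return (self._hidden(X) @ self.beta > 0).astype(int)


def bagging_elm_predict(X_train, y_train, X_test, n_estimators=11, seed=0):
    """Train ELMs on bootstrap samples and combine them by majority vote."""
    rng = np.random.default_rng(seed)
    votes = np.zeros(len(X_test), dtype=int)
    for _ in range(n_estimators):
        # Bootstrap sample: draw training indices with replacement.
        idx = rng.integers(0, len(X_train), size=len(X_train))
        model = ELM(rng=rng).fit(X_train[idx], y_train[idx])
        votes += model.predict(X_test)
    # Predict "defective" (1) when more than half of the base predictors agree.
    # An odd n_estimators avoids ties.
    return (2 * votes > n_estimators).astype(int)
```

Calling `bagging_elm_predict(X_train, y_train, X_test)` with NumPy arrays of software metrics and 0/1 defect labels returns one 0/1 prediction per test row. Using the pseudo-inverse keeps training to a single linear solve per base predictor, which is the speed advantage the abstract attributes to extreme learning machines.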
Acknowledgements
This work is supported by the National Natural Science Foundation of China under Grant No. 52074126. We would like to thank the editor and anonymous reviewers for their valuable comments and suggestions to improve the paper.
Author information
Contributions
YT: writing-original draft, conceptualization, methodology, and program writing and results collecting. QD: methodology and reviewing. MY: data curation, visualization. TD: reviewing, editing, visualization. LC: reviewing, supervision.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Tang, Y., Dai, Q., Yang, M. et al. Software defect prediction ensemble learning algorithm based on adaptive variable sparrow search algorithm. Int. J. Mach. Learn. & Cyber. 14, 1967–1987 (2023). https://doi.org/10.1007/s13042-022-01740-2
DOI: https://doi.org/10.1007/s13042-022-01740-2