Backpropagation Neural Network optimization and software defect estimation modelling using a hybrid Salp Swarm optimizer-based Simulated Annealing Algorithm

https://doi.org/10.1016/j.knosys.2022.108511

Abstract

Software Defect Estimation (SDE) is a fundamental problem-solving mechanism in the field of software engineering (SE). SDE is the task of identifying software modules that are likely to have defects. It plays a vital role in improving software quality, reducing software development costs and accelerating software development processes. The Backpropagation Neural Network (BPNN) is a popular machine learning (ML) estimator widely utilized in SE estimation problems. Unfortunately, its performance depends on the initial weight and bias values. Metaheuristic optimization algorithms, as an alternative method, have proven to have strengths in parameter optimization. However, population-based metaheuristic algorithms tend to suffer from weak exploitation capabilities. In this paper, a new hybrid metaheuristic algorithm-based BPNN (SSA–SA) is proposed by hybridizing the Salp Swarm Algorithm (SSA) with the Simulated Annealing (SA) algorithm. The main goal of the hybridization is to adjust the balance between exploration and exploitation in SSA. The proposed algorithm is integrated with the BPNN estimator to optimize its parameters and reduce the overall estimation error, which in turn boosts the estimation accuracy. The resulting estimator is applied to the SDE problem. Experimental results demonstrate the superiority of the proposed hybrid algorithm in optimizing BPNN parameters when compared against other estimators and algorithms on most SDE datasets and evaluation criteria.

Introduction

In machine learning (ML) research communities, many models have been proposed to address various software engineering estimation problems [1], [2]. The software defect estimation (SDE) problem [3] is one of the most significant of these. The estimation process is treated as a supervised learning task that discovers patterns in historical data.

In the last decade, a significant amount of SDE research has used ML methods to address the limitations of traditional and parametric estimation methods and to align with existing development and management strategies [4]. These methods include Artificial Neural Networks (ANNs) [5], [6], [7], [8], Bayesian Networks (BN) [9], Multilayer Perceptron (MLP) [9], Fuzzy Clustering (FC) [10], Case-based Reasoning (CR) [11], Logistic Regression (LR) [10], Decision Trees [12], Naïve Bayes [13] and Support Vector Machines (SVMs) [14], [15], [16].

Artificial neural networks (ANNs) are a prominent type of ML technology, and Backpropagation Neural Networks (BPNNs) are among the most popular types of ANNs [17]. They have a reliable training algorithm that enables them to address both common and complex problems [18], and they offer accuracy and self-organizing capabilities for estimation problems [19], [20], [21]. Consequently, BPNN is more broadly used for estimation problems than other ML techniques [22], [23]. Unfortunately, the traditional BPNN training algorithm has some limitations, such as easily falling into local optima [24], [25], the need for extensive parameter settings [26], and a slow convergence rate [24]. Therefore, integrating a metaheuristic optimizer with BPNN to overcome these limitations would be a valuable contribution to the research community.

Recently, several studies have integrated metaheuristic algorithms with ANNs to optimize ANN parameters and boost estimation accuracy [1], [3], [27]. These studies used the Salp Swarm Algorithm to optimize the Backpropagation Neural Network and improve its accuracy when modelling software reliability and software fault estimation problems. In addition, Kassaymeh et al. [28] combined the Salp Swarm Algorithm with an ANN predictor to estimate the size of the software testing team more accurately. These integration attempts succeeded in developing new models that performed better than traditional estimation methods [29], [30]. Furthermore, the integration improved the BPNN estimation accuracy, thus overcoming its performance drawbacks [31].

Recently, by mimicking the navigation and foraging behaviour of salps in the sea environment, a new stochastic, nature-inspired, metaheuristic optimizer, namely, the Salp Swarm Algorithm (SSA), was proposed [32] to address global optimization issues. SSA has been applied to a wide range of optimization problems and has proven its robustness as an effective optimizer. Some of its applications include feature selection [33], [34], [35], estimation and forecasting [36], [37], [38], [39], image thresholding [40], multiobjective problems [34] and constrained engineering optimization problems [41].
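As a rough illustration of the search behaviour described above, the following Python sketch applies one SSA position update. It follows the commonly cited formulation of SSA [32]: the leader salp moves around the best solution found so far (the "food source") with a step scaled by a coefficient c1 that decays over the iterations, and each follower moves to the midpoint between itself and its predecessor in the chain. The function name `ssa_step`, its parameters, the single-leader chain, the 0.5 threshold on the random sign and the clamping to the bounds are assumptions made for this sketch, not details taken from this paper.

```python
import math
import random


def ssa_step(population, food, lower, upper, iteration, max_iterations, rng):
    """Apply one SSA position update in place and return the population.

    population: list of salp positions (lists of floats); index 0 is the leader.
    food: best position found so far.
    lower, upper: per-dimension bounds (lists of floats).
    """
    # c1 decays with the iteration count, shifting the swarm from
    # exploration (large leader steps) to exploitation (small steps).
    c1 = 2.0 * math.exp(-((4.0 * iteration / max_iterations) ** 2))
    dim = len(food)
    for i, salp in enumerate(population):
        for j in range(dim):
            if i == 0:
                # Leader: step away from the food source in a random direction.
                c2, c3 = rng.random(), rng.random()
                step = c1 * ((upper[j] - lower[j]) * c2 + lower[j])
                salp[j] = food[j] + step if c3 >= 0.5 else food[j] - step
            else:
                # Follower: midpoint between itself and the salp ahead of it.
                salp[j] = 0.5 * (salp[j] + population[i - 1][j])
            # Keep the position inside the search bounds (sketch assumption).
            salp[j] = min(max(salp[j], lower[j]), upper[j])
    return population
```

Because the only tunable quantity here is the decay of c1, the sketch also shows why SSA is often described as having few parameters to tune.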

The robustness of the SSA is due to the small number of parameters that must be tuned [42] and its ability to handle a wide range of engineering optimization problems [3]. Nevertheless, the SSA still suffers from some limitations, such as a low exploitation ability, which can lead to slow convergence behaviour [24], [43]. One way to address these limitations is to hybridize SSA with a local search algorithm that has strong exploitation capabilities. One such local search algorithm, Simulated Annealing (SA) [44], has proven its ability in many optimization problems, has been examined in numerous hybridization studies with population-based metaheuristic algorithms, and has recently been hybridized with SSA to balance search exploration and exploitation.
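For readers unfamiliar with SA, its exploitation behaviour comes from the standard Metropolis acceptance rule, written here for a minimization problem. The symbols f (objective), x (current solution), x' (candidate neighbour), T_k (temperature at step k) and α (cooling rate) are notation assumed for this note, and the geometric cooling schedule is only one common choice, not necessarily the one used in this paper.

```latex
% Metropolis acceptance rule for minimization (standard SA; notation assumed here)
P(\text{accept } x') =
\begin{cases}
1, & \Delta \le 0,\\[4pt]
\exp\!\left(-\dfrac{\Delta}{T_k}\right), & \Delta > 0,
\end{cases}
\qquad \Delta = f(x') - f(x)

% Geometric cooling, one common schedule (an assumption of this note)
T_{k+1} = \alpha\, T_k, \qquad 0 < \alpha < 1
```

Improving moves are always accepted, while worsening moves become less likely as the temperature falls, so the search gradually concentrates around a good solution.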

In this paper, a new hybrid metaheuristic algorithm-based BPNN called SSA–SA is proposed by hybridizing the SSA optimizer with the SA algorithm. SSA was selected for several of its characteristics, namely its ease of use [45], superior performance [46], [47], [48], [49], [50], ease of tailoring [45], [46], [51], convergence speed [51], effective optimization [25], [48], [52] and high robustness [25]. The SA algorithm, in turn, has strong exploitation abilities [53], [54] that can boost the local search exploitation of SSA [55], and it can improve the balance between exploitation and exploration when hybridized with population-based metaheuristic algorithms [56], [57]. Therefore, this study combines the SSA and SA algorithms to obtain the benefits of both and mitigate their drawbacks. The hybrid algorithm is integrated with the BPNN estimator to optimize its parameters (weights and biases) and reduce the total estimation error, which in turn boosts the estimation accuracy [58]. The proposed algorithm is then applied to the software defect estimation (SDE) problem. Experimental results demonstrate the superiority of the proposed hybrid algorithm in optimizing BPNN parameters when compared against the traditional BPNN estimator, the standard SSA algorithm and other state-of-the-art algorithms on most SDE datasets and evaluation criteria.
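To make the idea of optimizing BPNN weights and biases with a metaheuristic more concrete, the sketch below shows how a candidate solution (a flat vector of real numbers) could be decoded into a small feedforward network and scored by its estimation error. The single hidden layer, the sigmoid activations, the single output, the mean squared error measure and all function names are assumptions made for illustration; this excerpt does not specify the network architecture or error measure used in the paper.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def decode(vector, n_in, n_hidden):
    """Split a flat vector into hidden weights, hidden biases,
    output weights and the output bias (layout is a sketch assumption)."""
    k = n_in * n_hidden
    w_hidden = [vector[h * n_in:(h + 1) * n_in] for h in range(n_hidden)]
    b_hidden = vector[k:k + n_hidden]
    w_out = vector[k + n_hidden:k + 2 * n_hidden]
    b_out = vector[k + 2 * n_hidden]
    return w_hidden, b_hidden, w_out, b_out


def predict(vector, x, n_in, n_hidden):
    """Forward pass of a one-hidden-layer network with a single output."""
    w_hidden, b_hidden, w_out, b_out = decode(vector, n_in, n_hidden)
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


def mse_fitness(vector, samples, n_in, n_hidden):
    """Mean squared error over (features, target) pairs; lower is better."""
    errors = [(predict(vector, x, n_in, n_hidden) - y) ** 2 for x, y in samples]
    return sum(errors) / len(errors)
```

With `n_in` inputs and `n_hidden` hidden units, the vector length is `n_in * n_hidden + 2 * n_hidden + 1`. An optimizer only needs to call `mse_fitness` on each candidate vector, so the network itself stays independent of the search algorithm.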

This paper is organized as follows: The research background is reviewed in Section 2. The software defect estimation problem is explored in Section 3. The proposed algorithm is described in Section 4. The experimental results and performance evaluation are presented and discussed in Section 5. The conclusion and prospects for future work are provided in Section 6.

Section snippets

Research background

In this section, the Salp Swarm Algorithm, the Backpropagation Neural Network and the Simulated Annealing Algorithm are briefly discussed.

Software defect estimation problem

The field of software engineering engages with various estimation problems. Software Defect Estimation (SDE) is an essential issue in software engineering and software project management, as it plays a critical role in enhancing software quality, reducing software development costs and accelerating software development processes. SDE is defined as the method of predicting the flaws in a software project under construction based on earlier determined metrics or based on historical data extracted

Proposed algorithm

This section will discuss the proposed method that hybridizes the population-based SSA algorithm with the local-search SA algorithm, namely, the SSA–SA. The target of the hybridization designed in this research is to obtain a balance between exploration and exploitation. The SSA algorithm has a robust global search capability [72], [73], [74]. In contrast, the SA algorithm has a robust local search capability [75], so the hybrid algorithm can realize the complementary advantages of the two
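One plausible way to combine the two searches is sketched below in Python: SSA drives the global search, and after each iteration a short SA run refines the best solution found so far. How the paper actually couples SA to SSA is not shown in this excerpt, so the coupling point, the function names `sa_refine` and `ssa_sa`, the scalar bounds and every parameter value are assumptions of this sketch.

```python
import math
import random


def sa_refine(objective, start, start_cost, lower, upper, rng,
              steps=20, temperature=1.0, cooling=0.9, sigma=0.1):
    """Short simulated annealing run around `start` (local exploitation)."""
    current, current_cost = start[:], start_cost
    best, best_cost = start[:], start_cost
    for _ in range(steps):
        # Gaussian neighbour, clamped to the search bounds.
        candidate = [min(max(x + rng.gauss(0.0, sigma), lower), upper)
                     for x in current]
        cost = objective(candidate)
        delta = cost - current_cost
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, cost
            if cost < best_cost:
                best, best_cost = candidate[:], cost
        temperature *= cooling
    return best, best_cost


def ssa_sa(objective, dim, lower, upper, n_salps=20, iterations=50, seed=0):
    """Minimize `objective` with SSA for exploration and SA for exploitation."""
    rng = random.Random(seed)
    salps = [[rng.uniform(lower, upper) for _ in range(dim)]
             for _ in range(n_salps)]
    food = min(salps, key=objective)[:]
    food_cost = objective(food)
    for it in range(1, iterations + 1):
        c1 = 2.0 * math.exp(-((4.0 * it / iterations) ** 2))
        for i, salp in enumerate(salps):
            for j in range(dim):
                if i == 0:
                    # Leader moves around the food source.
                    step = c1 * ((upper - lower) * rng.random() + lower)
                    sign = 1.0 if rng.random() >= 0.5 else -1.0
                    salp[j] = food[j] + sign * step
                else:
                    # Follower moves towards its predecessor.
                    salp[j] = 0.5 * (salp[j] + salps[i - 1][j])
                salp[j] = min(max(salp[j], lower), upper)
            cost = objective(salp)
            if cost < food_cost:
                food, food_cost = salp[:], cost
        # Hybrid step (assumed coupling): refine the best solution with SA.
        food, food_cost = sa_refine(objective, food, food_cost,
                                    lower, upper, rng)
    return food, food_cost
```

For BPNN training, `objective` would be an error function over the flattened weights and biases. A simple test function such as `lambda v: sum(x * x for x in v)` is enough to try the loop out.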

Experimental results and performance evaluation

Several experiments have been performed on different SDE benchmark datasets to assess the proposed hybrid SSA–SA performance. The results obtained from SSA–SA were compared with those of the SSA-BPNN and conventional BPNN. Then, a statistical evaluation was carried out in which the Wilcoxon Mann–Whitney statistical test was used to provide statistical indicators of significant outcomes, and boxplots and convergence behaviour analyses were performed. The results obtained from SSA–SA were
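The Wilcoxon Mann–Whitney test mentioned above compares two independent samples, for example the errors of two estimators over repeated runs, without assuming normality. The Python sketch below computes the U statistic and a two-sided p-value using the normal approximation without tie correction. The function name and the sample values in the usage note are hypothetical, and for small samples or heavy ties an exact routine such as SciPy's `scipy.stats.mannwhitneyu` would normally be preferred.

```python
import math


def mann_whitney_u(a, b):
    """Return (U statistic of sample a, two-sided p-value).

    The p-value uses the large-sample normal approximation with no
    tie correction, so it is only indicative for small samples.
    """
    n1, n2 = len(a), len(b)
    # U counts the pairs in which a value from `a` exceeds one from `b`;
    # tied pairs contribute one half.
    u = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)
    mean = n1 * n2 / 2.0
    std = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean) / std
    # Two-sided tail probability of a standard normal variable.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return u, p
```

A call such as `mann_whitney_u(errors_a, errors_b)`, with two hypothetical lists of per-run errors, returns a small p-value when one list tends to contain systematically smaller values than the other.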

Conclusion and future work

Parameter optimization is considered a complex task in engineering estimation problems. To address this issue, metaheuristic algorithms have been studied and have proven to be successful and appropriate in this field. This paper’s main concern was developing and proposing a new hybrid algorithm-based Backpropagation Neural Network (BPNN) called SSA–SA by hybridizing the population-based Salp Swarm Algorithm and the single-solution Simulated Annealing algorithm. In SSA–SA, the main intent of the

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References (102)

  • Faris, Hossam et al. An efficient binary salp swarm algorithm with crossover scheme for feature selection problems. Knowl.-Based Syst. (2018)
  • Assad, Assif et al. A hybrid harmony search and simulated annealing algorithm for continuous optimization. Inform. Sci. (2018)
  • Mafarja, Majdi M. et al. Hybrid whale optimization algorithm with simulated annealing for feature selection. Neurocomputing (2017)
  • Yu, Caiyang et al. A quantum-behaved simulated annealing algorithm-based moth-flame optimization method. Appl. Math. Modelling (2020)
  • Salama, Mohamed et al. Adaptive neighborhood simulated annealing for sustainability-oriented single machine scheduling with deterioration effect. Appl. Soft Comput. (2021)
  • Jin, Cong et al. Prediction approach of software fault-proneness based on hybrid artificial neural network and quantum particle swarm optimization. Appl. Soft Comput. (2015)
  • Baklacioglu, Tolga et al. Metaheuristic approach for an artificial neural network: exergetic sustainability and environmental effect of a business aircraft. Transp. Res. D (2018)
  • Catal, Cagatay. Software fault prediction: A literature review and current trends. Exp. Syst. Appl. (2011)
  • Turabieh, Hamza et al. Iterated feature selection algorithms with layered recurrent neural network for software fault prediction. Exp. Syst. Appl. (2019)
  • Wang, Kaipu et al. Modeling and optimization of multi-objective partial disassembly line balancing problem considering hazard and profit. J. Clean. Prod. (2019)
  • Miholca, Diana-Lucia et al. A novel approach for software defect prediction through hybridizing gradual relational association rules with artificial neural networks. Inform. Sci. (2018)
  • Li, Weiwei et al. Three-way decisions based software defect prediction. Knowl.-Based Syst. (2016)
  • Qiao, Lei et al. Deep learning based software defect prediction. Neurocomputing (2020)
  • De Carvalho, Andre B. et al. A symbolic fault-prediction model based on multiobjective particle swarm optimization. J. Syst. Softw. (2010)
  • Emary, Eid et al. Binary ant lion approaches for feature selection. Neurocomputing (2016)
  • Mafarja, Majdi et al. Evolutionary population dynamics and grasshopper optimization approaches for feature selection problems. Knowl.-Based Syst. (2018)
  • Wang, Zhichun et al. A multi-objective evolutionary algorithm for feature selection based on mutual information with a new redundancy measure. Inform. Sci. (2015)
  • Kassaymeh, Sofian et al. Salp swarm optimizer for modeling software reliability prediction problems. Neural Process. Lett. (2021)
  • Sheta, Alaa F. et al. Estimating the number of test workers necessary for a software testing process using artificial neural networks. Int. J. Adv. Comput. Sci. Appl. (2014)
  • Kassaymeh, Sofian et al. Salp swarm optimizer for modeling the software fault prediction problem. J. King Saud Univ.-Comput. Inf. Sci. (2021)
  • Shukla, Suyash et al. An extreme learning machine based approach for software effort estimation.
  • Wang, Gai-Ge et al. Self-adaptive extreme learning machine. Neural Comput. Appl. (2016)
  • Yi, Jiao-Hong et al. Improved probabilistic neural networks with self-adaptive strategies for transformer fault diagnosis problem. Adv. Mech. Eng. (2016)
  • Cui, Zhihua et al. Detection of malicious code variants based on deep learning. IEEE Trans. Ind. Inform. (2018)
  • Wang, Gaige et al. Wavelet neural network using multiple wavelet functions in target threat assessment. Sci. World J. (2013)
  • Carrozza, Gabriella et al. Analysis and prediction of mandelbugs in an industrial software system.
  • Yuan, Xiaohong et al. An application of fuzzy clustering to software quality prediction.
  • Alshareef, Almahdi Mohammed et al. A case-based reasoning approach for pattern detection in Malaysia rainfall data. Int. J. Big Data Intell. (2015)
  • Wan, Alvin et al. NBDT: Neural-backed decision trees. (2020)
  • Bowes, David et al. Software defect prediction: do different classifiers find the same defects? Softw. Qual. J. (2018)
  • Rashid, Junaid et al. Study of software development cost estimation techniques and models. Mehran Univ. Res. J. Eng. Technol. (2020)
  • Abdullah, Dahlan et al. Drug users prediction using backpropagation educational method.
  • Chaudhary, Rashmi et al. A survey on backpropagation algorithm for neural networks. Int. J. Technol. Res. Eng. (2015)
  • Vora, Kuldip et al. A survey on backpropagation algorithms for feedforward neural networks. Int. J. Eng. Dev. Res. (2014)
  • Ever, Yoney Kirsal et al. Comparison of machine learning techniques for prediction problems.
  • Sekeroglu, Boran et al. Student performance prediction and classification using machine learning algorithms.
  • Ogidan, Ezekiel T. et al. Machine learning for expert systems in data analysis.
  • Abusnaina, Ahmed A. et al. Training neural networks using salp swarm algorithm for pattern classification.
  • Wu, Jun et al. Improved salp swarm algorithm based on weight factor and adaptive mutation. J. Exp. Theor. Artif. Intell. (2019)
  • Khazaiepoor, Mahdi et al. A hybrid approach for software development effort estimation using neural networks, genetic algorithm, multiple linear regression and imperialist competitive algorithm. Int. J. Nonlinear Anal. Appl. (2020)