1 Introduction

Particle swarm optimization (PSO) is a meta-heuristic search technique inspired by the behavior of bird flocks (Kennedy and Eberhart 1995). Owing to advantages such as its simplicity, fast convergence and population-based nature, PSO has attracted considerable attention from researchers, and a large number of PSO variants have been developed and applied successfully to a wide range of real problems, as summarized in Clerc (2006), Wang and Liu (2008). However, most of those methods require users to tune control parameters such as the inertia weight, the acceleration coefficients and the velocity clamping in order to obtain desirable solutions. Empirical and theoretical studies have shown that the convergence behavior of PSO depends strongly on the values of those control parameters (Clerc and Kennedy 2002; Jiang et al. 2007; Van den Bergh and Engelbrecht 2010). On the one hand, although recommendations for those parameter values have been made in literature such as Clerc and Kennedy (2002), Cristian (2003), those values are not universally applicable. On the other hand, adaptive or dynamic adjusting strategies for the inertia weight and/or the acceleration coefficients have been considered in literature such as Shi and Eberhart (1999), Tripathi et al. (2007), Cooren et al. (2011). However, since specific applications must be considered, setting appropriate parameter values remains a challenging research subject.

Bare-bones particle swarm optimization (BBPSO) was first proposed by Kennedy (2003), where the traditional velocity equation of PSO is removed and a Gaussian distribution built from the global and local best positions is used to update the particles' positions. Compared to the traditional PSO, BBPSO is probably the simplest version of PSO since it does not involve the inertia weight, the acceleration coefficients or the velocity. Due to its simplicity and effectiveness, it is natural to extend or apply BBPSO to real problems. Successful examples include integer programming (Omran et al. 2007), unsupervised image classification (Omran and Al-Sharhan 2007), the multi-objective economic dispatch problem (Zhang et al. 2012), parameter estimation of the mixed Weibull distribution (Krohling Renato et al. 2010), and so on.

Although BBPSO has shown potential to solve different real problems, it still suffers from premature convergence. To improve its search efficiency, several modifications to the original BBPSO algorithm have been proposed. Krohling and Mendel (2009) incorporated a jump strategy into BBPSO to discourage premature convergence of the swarm, and later Blackwell (2012) and Majid al-Rifaie and Blackwell (2012) refined the use of the jump strategy. Omran et al. and Haibo et al. used mutation and/or crossover operators from differential evolution (DE) to improve BBPSO (Mahamed et al. 2009; Haibo et al. 2011). Recently, Zhang et al. (2011) proposed an adaptive BBPSO based on the cloud model.

In order to keep a balance between the diversity and the convergence speed of the swarm, this paper proposes an improved BBPSO algorithm with an adaptive disturbance, called ABPSO. In the proposed approach, each particle has its own disturbance value, which depends on its convergence degree and the diversity of the swarm; even within the same iteration, this value may differ between particles. The disturbance value is also designed to converge to zero in order to guarantee the convergence of the proposed algorithm. Since only a disturbance factor is added to the variance of the Gaussian distribution, the proposed algorithm retains the simplicity and ease of implementation of BBPSO. Moreover, the performance of ABPSO with an adaptive mutation is investigated. Inspired by the analysis approach introduced by Jiang et al. (2007), the convergence of the proposed algorithm is analyzed using stochastic process theory.

The remainder of this paper is organized as follows. The traditional PSO and BBPSO algorithms are described in Sect. 2. The proposed ABPSO algorithm with adaptive mutation is presented in Sect. 3. Section 4 focuses on the convergence analysis of ABPSO. The performance of ABPSO and the effectiveness of the adaptive mutation and the disturbance factor are discussed in Sect. 5. Finally, conclusions from this study are given in Sect. 6.

2 Particle swarm optimization

2.1 Traditional PSO

PSO was first proposed by Kennedy and Eberhart (1995) and is inspired by the social behavior of some biological organisms. In PSO, a swarm consists of a set of particles, and each particle represents a potential solution to the optimized problem. Taking the \(i\)-th particle in the swarm as an example, and supposing its position and velocity at iteration \(t\) are \({\varvec{X}}_{i}(t)=(x_{i,1} (t),x_{i,2} (t),\ldots ,x_{i,D} (t))\) and \(V_i (t)=(v_{i,1} (t),v_{i,2} (t),\ldots ,v_{i,D} (t))\), this particle is updated using the following equations (Shi and Eberhart 1998):

$$\begin{aligned} v_{i,j} (t+1)&= w*v_{i,j}(t)+r_1 c_1 *(pb_{i,j} (t)-x_{i,j} (t))\nonumber \\&+\, r_2 c_2 *(gb_j (t)-x_{i,j} (t))\end{aligned}$$
(1)
$$\begin{aligned} x_{i,j} (t+1)&= x_{i,j} (t)+v_{i,j} (t+1). \end{aligned}$$
(2)

where \({{\varvec{Pb}}}_i (t)=(pb_{i,1} (t),pb_{i,2} (t),\ldots , pb_{i,D} (t))\) represents the best position found so far by the \(i\)-th particle, usually called Pbest; \({{\varvec{Gb}}}(t)=(gb_{1} (t),gb_{2} (t),\ldots ,gb_{D} (t))\) represents the global best position found so far by the neighbors of this particle, usually called Gbest. \(c_{1}\) and \(c_{2}\) are nonnegative constants called acceleration coefficients, \(w\) is the inertia weight used to control the particle's exploration of the search space, and \(r_{1}\) and \(r_{2}\) are two uniform random numbers distributed in [0, 1].
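As a concrete illustration of Eqs. (1)–(2), the following is a minimal sketch of one velocity/position update for a single particle; the default values of \(w\), \(c_1\) and \(c_2\) are common choices from the literature, used here only as placeholders rather than values prescribed by this paper.

```python
import numpy as np

def pso_step(x, v, pb, gb, w=0.729, c1=1.49445, c2=1.49445, rng=None):
    """One velocity/position update per Eqs. (1)-(2); all arguments are 1-D arrays."""
    rng = np.random.default_rng() if rng is None else rng
    r1, r2 = rng.random(x.size), rng.random(x.size)
    v_new = w * v + r1 * c1 * (pb - x) + r2 * c2 * (gb - x)   # Eq. (1)
    return x + v_new, v_new                                    # Eq. (2)
```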

2.2 Bare-bones PSO

BBPSO eliminates the velocity equation of PSO and uses a Gaussian distribution based on Pbest and Gbest to sample the search space. The standard BBPSO algorithm has the following update equation:

$$\begin{aligned}&x_{i,j} (t+1)=g_{i,j} (t)+\sigma _{i,j} (t)N(0,1) \nonumber \\&g_{i,j} (t)=0.5(pb_{i,j} (t)+gb_j (t)) \nonumber \\&\sigma _{i,j} (t)=| {pb_{i,j} (t)-gb_j (t)}| \end{aligned}$$
(3)

where \(N(0, 1)\) is the Gaussian distribution with mean 0 and variance 1. Pan et al. (2008) demonstrated that BBPSO can be mathematically deduced from the canonical PSO.
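A minimal sketch of the sampling step in Eq. (3); \(\sigma\) is used as the scale of the Gaussian, exactly as the equation is written.

```python
import numpy as np

def bbpso_step(pb_i, gb, rng=None):
    """One position update per Eq. (3): each dimension is sampled around the
    midpoint of Pbest and Gbest with spread |Pbest - Gbest|."""
    rng = np.random.default_rng() if rng is None else rng
    g = 0.5 * (pb_i + gb)
    sigma = np.abs(pb_i - gb)
    return g + sigma * rng.standard_normal(pb_i.size)
```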

Kennedy also proposed an alternative version of BBPSO, called the exploiting bare-bones PSO (BBExp), where Eq. (3) is replaced with

$$\begin{aligned}&x_{i,j} (t+1)= \left\{ \begin{array}{ll} g_{i,j} (t)+\sigma _{i,j} (t)N(0,1),&{}\quad rand<0.5\\ pb_{i,j} (t), &{}\quad \hbox {otherwise}\\ \end{array}\right. \nonumber \\&g_{i,j} (t)=0.5(pb_{i,j} (t)+gb_j (t)) \nonumber \\&\sigma _{i,j} (t)=| {pb_{i,j} (t)-gb_j (t)}| \end{aligned}$$
(4)

As pointed out in Mahamed et al. (2009), since there is a 50 % chance that the \(j\)-th dimension of a particle is set to the corresponding Pbest component, this version of BBPSO is biased towards exploiting Pbest.

Recently, Blackwell (2012) discussed the collapse condition of BBPSO and gave a generalized form of BBPSO, in which the search focus \({{\varvec{g}}}_{i}\) and the search spread \(\sigma _{i}\) can be chosen from a local or global neighbourhood of the current particle \({{\varvec{X}}}_{i}\). This idea is embodied in the following update equation:

$$\begin{aligned}&x_{i,j} (t+1)=g_{i,j} (t)+\alpha \times \sigma _{i,j} (t)N(0,1)\nonumber \\&{\varvec{g}}_i (t)=BEST({\varvec{Pb}}_k \in N_i)\\&\sigma _{i,j} (t)=| {pb_{i,j} (t)-gb_j (t)} |\quad \hbox {(global neighbourhood)}\nonumber \\&\sigma _{i,j} (t)=| {pb_{(i-1)\bmod n,j} (t)-pb_{(i+1)\bmod n,j} (t)} |\quad \hbox {(local neighbourhood)}\nonumber \end{aligned}$$
(5)

where \(\alpha \) is a scale parameter and \(N_{i}\) denotes the search neighbourhood of the particle \({\varvec{X}}_{i}\). \(N_{i}\), also called the \(\mu \)-neighbourhood, can be the global swarm or any local structure. At each iteration, \({{\varvec{g}}}_{i}\) takes the best of the Pbest positions in \(N_{i}\). The separation factor \(\sigma _{i}\), which controls the search concentration, can be taken from a local or global informer neighbourhood (for short, the \(\delta \)-neighbourhood).
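A small sketch of the two choices of search spread \(\sigma\) in Eq. (5); interpreting "(mod n)" as wrapping the particle indices of a ring neighbourhood around the swarm size is an assumption.

```python
import numpy as np

def bbj_sigma(pb, i, gb, local=False):
    """Search spread of Eq. (5) for particle i: the global form |Pbest_i - Gbest|,
    or the ring form |Pbest_{i-1} - Pbest_{i+1}| with indices wrapped modulo n."""
    n = pb.shape[0]
    if local:
        return np.abs(pb[(i - 1) % n] - pb[(i + 1) % n])
    return np.abs(pb[i] - gb)
```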

Blackwell also proposed an improved bare-bones algorithm based on the jump strategy (BBJ1) to further improve performance: a particle jumps uniformly in any dimension with a small probability (Blackwell 2012). Recently, Majid al-Rifaie and Blackwell (2012) gave a second bare-bones PSO with the jump strategy (BBJ2) by modifying the separation factor of the random distribution. Earlier, Krohling and Mendel (2009) had introduced the jump strategy into BBPSO with a different approach: a particle jumps if it has not improved for a given number of iterations. The jump is applied to every position component, and the jump values are drawn from a scaled Normal or Cauchy probability distribution.

Moreover, Omran et al. proposed a new optimization algorithm, called bare-bones differential evolution (BBDE), by combining BBPSO with differential evolution (DE). In this algorithm, DE is used to mutate, for each particle, the attractor associated with that particle, defined as a weighted average of its personal and neighborhood best positions (Mahamed et al. 2009). Similarly, Haibo et al. (2011) modified the original BBPSO by using mutation and crossover to update certain particles. Zhang et al. (2011) proposed an adaptive BBPSO inspired by the cloud model (ACMBBPS). The cloud model adaptively produces different standard deviations of the Gaussian sampling for different particles based on the evolutionary state.

3 Adaptive bare-bones particle swarm optimization

Although BBPSO has been found to be competitive, it still suffers from premature convergence. It can be observed from Eqs. (3) and (4) that, if the Pbest of a particle happens to be close or equal to Gbest during the evolution process, this particle stops updating at once because the variance of the Gaussian distribution becomes 0. Furthermore, if most particles in the swarm stop updating prematurely, the BBPSO algorithm will converge to a false global optimum with high probability.

To get around the above disadvantage, a natural approach is to apply a disturbance to the variance of the Gaussian distribution. Some studies have shown that a minor change to the sampling distribution can greatly help BBPSO explore continuous space, and can even turn BBPSO into a global optimizer (Poli and Langdon 2007). However, setting a suitable disturbance value for every particle is still a challenge. A large disturbance value can improve the diversity of the swarm but slows down the convergence speed of the algorithm; in contrast, a small disturbance value improves the convergence speed but easily leads to premature convergence.

3.1 Particles’ update based on the adaptive disturbance

In order to keep a balance between the diversity and convergence of the swarm, this section presents an improved BBPSO algorithm based on an adaptive disturbance, called ABPSO. With the adaptive disturbance, each particle has its own disturbance value, which changes adaptively based on its convergence degree and the diversity of the swarm; even within the same iteration, this value may differ between particles. Towards the end of the algorithm, the disturbance value converges to zero in order to guarantee the convergence of the swarm.

Taking the \(i\)-th particle \({\varvec{X}}_i (t)=(x_{i,1} ,x_{i,2},\ldots ,x_{i,D})\) as an example, the new update strategy is described as follows:

$$\begin{aligned}&x_{i,j} (t+1)=\left\{ \begin{array}{ll} g_{i,j} (t)+\sigma _{i,j} (t)N(0,1),&{}\quad rand<prob\\ x_{i,j} (t), &{}\quad \hbox {otherwise}\\ \end{array}\right. \nonumber \\&g_{i,j} (t)=0.5(pb_{i,j} (t)+gb_j (t)) \nonumber \\&\sigma _{i,j} (t)=|{pb_{i,j} (t)-gb_j (t)}|+\Delta \nonumber \\&\Delta = r_3 \times |{pb_{k1,j} (t)-pb_{k2,j} (t)} |\times \exp (f({{\varvec{Gb}}}(t))\nonumber \\&\qquad -f({\varvec{X}}_i (t))) \end{aligned}$$
(6)

where \(r_{3}\) is a uniform random number within [0, 1]; \({{\varvec{Pb}}}_{k1}\) and \({{\varvec{Pb}}}_{k2}\) are two Pbests selected at random from other particles; \(f(\cdot )\) is the objective function to be optimized; and the selection probability satisfies \(0 \le prob \le 1\).

Compared with the standard BBPSO, the above equation has the following features. Instead of taking a constant value, \(\Delta \) is adjusted adaptively based on the fitness difference between the current position \({\varvec{X}}_{i}\) and Gbest, as well as the difference of the \(j\)-th dimension of two randomly selected Pbests. It can be observed from Eq. (6) that, assuming \(|{{\varvec{Pb}}}_{k1}-{{{\varvec{Pb}}}_{k2}}|\) remains constant, the smaller the fitness difference between \({\varvec{X}}_{i}\) and Gbest, the larger the disturbance factor \(\Delta \). When the particle has the same fitness as Gbest, it is affected by a disturbance of maximal magnitude; in this case the disturbance acts like a re-initialization operator and may prevent the algorithm from being trapped in a local optimum. As the fitness difference increases, the effect of the disturbance decreases, which is reflected by a shrinking disturbance value. Towards the end of ABPSO, since all the Pbest values converge to one position, i.e., Gbest, the value of \(|{{\varvec{Pb}}}_{k1} -{{\varvec{Pb}}}_{k2}|\) converges to zero, and hence the disturbance \(\Delta \) converges to zero. As one of the necessary conditions, this ensures the convergence of the swarm (see Sect. 4 for the proof).
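To illustrate this behaviour of \(\Delta\) numerically, a tiny sketch follows (the Pbest difference of 2.0 and the fitness gaps are arbitrary illustrative values): the disturbance is largest when the particle's fitness equals that of Gbest and shrinks exponentially as the gap grows.

```python
import numpy as np

rng = np.random.default_rng(0)
pb_diff = 2.0                                   # assumed |pb_{k1,j} - pb_{k2,j}|
for gap in (0.0, 0.5, 1.0, 2.0, 5.0):           # gap = f(X_i) - f(Gb) >= 0 (minimization)
    delta = rng.random() * pb_diff * np.exp(-gap)   # Eq. (6) with f(Gb) - f(X_i) = -gap
    print("fitness gap %.1f -> disturbance %.3f" % (gap, delta))
```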

3.2 The adaptive mutation

Since the value of Gbest may still be a local optimum before the particles lose diversity, the above update strategy alone can hardly guarantee the global convergence of ABPSO. In this subsection, an adaptive mutation strategy is proposed to further enhance the convergence of ABPSO. The reasons for choosing mutation are as follows: (1) some techniques such as re-initialization (Wang et al. 2012) and the jump strategy (Krohling and Mendel 2009; Blackwell 2012; Majid al-Rifaie and Blackwell 2012) are essentially variants of mutation; (2) many studies have shown that a suitable mutation operator can help a swarm/population find the global optimum with probability 1 (Gao and Xu 2011; Hu et al. 2012).

Figure 1 shows the pseudo code of the proposed mutation, where a mutation probability \(p_m\) is used to control the frequency of mutation, and \(Ux_{i}\) and \(Lx_{i}\) are the upper and lower limits of the \(i\)-th decision variable. At each iteration, each particle is checked in turn; if the mutation probability \(p_m\) is larger than a random number rand, the current particle is mutated (a code sketch consistent with this description is given after Fig. 1). However, as with most existing mutation strategies, setting a suitable mutation probability \(p_m\) is still a key problem. A large value of \(p_m\) allows particles to mutate more often, but easily destroys the experiential knowledge stored in Pbest and Gbest. Conversely, with a small value of \(p_m\) the particles often need a long time to escape from local optima. In view of the above, keeping \(p_m\) fixed throughout the search process may not be the best approach.

Fig. 1 Pseudo code of the adaptive mutation
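Since only the caption of Fig. 1 is reproduced here, the following is a minimal sketch consistent with the textual description above; re-sampling a mutated particle uniformly within \([Lx, Ux]\) is an assumption, as the figure itself is not shown.

```python
import numpy as np

def adaptive_mutation(x, lx, ux, p_m, rng=None):
    """Check each particle in turn; with probability p_m, mutate it.
    Uniform re-sampling within the variable bounds is assumed here."""
    rng = np.random.default_rng() if rng is None else rng
    x = x.copy()
    for i in range(x.shape[0]):          # one check per particle
        if rng.random() < p_m:
            x[i] = rng.uniform(lx, ux)
    return x
```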

In this paper we propose a new method to adjust \(p_m\) dynamically. At each iteration, the value of \(p_m\) is decided by using the following equation:

$$\begin{aligned} p_m (t)=\frac{0.2}{1+1.2^{20-F_m (t)}}\end{aligned}$$
(7)
$$\begin{aligned} F_m (t)=\left\{ \begin{array}{ll} F_m (t-1)+1, &{} |f({{\varvec{Gb}}}(t))-f({{\varvec{Gb}}}(t-1))|\le 0\\ 0, &{} \hbox {otherwise} \end{array} \right. \end{aligned}$$
(8)

where the parameter \(F_{m}\) counts the number of consecutive iterations for which the fitness of Gbest has not improved; its initial value is set to 0 in this paper. It can be seen that the value of \(p_m\) increases from near 0 towards 0.2 as \(F_{m}\) increases. Overall, the longer ABPSO stagnates, the stronger the influence of mutation on the particles.
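A direct transcription of Eqs. (7)–(8), where the stagnation counter is updated by comparing successive Gbest fitness values:

```python
def mutation_probability(f_m):
    """Eq. (7): p_m grows from near 0 towards 0.2 as the stagnation counter f_m grows."""
    return 0.2 / (1.0 + 1.2 ** (20 - f_m))

def update_stagnation(f_m, f_gb_now, f_gb_prev):
    """Eq. (8): increment the counter while Gbest's fitness has not changed, else reset."""
    return f_m + 1 if abs(f_gb_now - f_gb_prev) <= 0 else 0

# Example: p_m at a few stagnation levels (0 -> ~0.005, 20 -> 0.1, 40 -> ~0.195).
for f_m in (0, 10, 20, 40):
    print(f_m, round(mutation_probability(f_m), 4))
```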

3.3 Steps of the proposed algorithm

Based on the above work, Fig. 2 shows the flowchart of the proposed algorithm. First, the particles are initialized in the search space at random, and the Pbest position of each particle is set to the particle itself. After that, the same iteration steps are repeated to find the optimal solution of the optimized problem until the maximum iteration number \(T_{max}\) is reached. Within each iteration, each particle updates its Pbest and Gbest positions using the common method (Pan et al. 2008); note that in this paper the global version of PSO is adopted to update Gbest. Based on the two best positions obtained, the position of each particle is then updated using the strategy proposed in Sect. 3.1. Next, the adaptive mutation proposed in Sect. 3.2 is activated to improve the diversity of the particles. After evaluating the swarm fitness, a new cycle begins. A code sketch of the complete loop is given after Fig. 2.

Fig. 2 The flowchart of the ABPSO algorithm
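To make the flowchart concrete, the following is a minimal, self-contained sketch of one way the loop could be implemented for a minimization problem. The function name `abpso`, the per-dimension draws of \(r_3\) and rand in Eq. (6), the uniform re-initialization used as the mutation body, and the Sphere objective in the usage example are illustrative assumptions; the authors' own implementation is in MATLAB (Sect. 5.4).

```python
import numpy as np

def abpso(f, lb, ub, n_particles=30, t_max=1000, prob=0.7, seed=0):
    """Minimal ABPSO sketch for minimizing f over the box [lb, ub]^D."""
    rng = np.random.default_rng(seed)
    D = lb.size
    x = rng.uniform(lb, ub, size=(n_particles, D))         # positions
    pb, pb_fit = x.copy(), np.array([f(xi) for xi in x])    # Pbest and its fitness
    g = np.argmin(pb_fit)
    gb, gb_fit = pb[g].copy(), pb_fit[g]                    # Gbest (global version)
    fm = 0                                                  # stagnation counter F_m

    for t in range(t_max):
        prev_gb_fit = gb_fit
        for i in range(n_particles):
            k1, k2 = rng.choice(n_particles, size=2, replace=False)
            # Adaptive disturbance of Eq. (6); f(Gb) - f(X_i) <= 0 for minimization,
            # and r3 / rand are drawn per dimension here.
            delta = rng.random(D) * np.abs(pb[k1] - pb[k2]) * np.exp(gb_fit - f(x[i]))
            sigma = np.abs(pb[i] - gb) + delta
            mean = 0.5 * (pb[i] + gb)
            mask = rng.random(D) < prob
            x[i] = np.where(mask, mean + sigma * rng.standard_normal(D), x[i])

        # Adaptive mutation (Sect. 3.2); uniform re-sampling is an assumed mutation body.
        p_m = 0.2 / (1.0 + 1.2 ** (20 - fm))
        mutate = rng.random(n_particles) < p_m
        x[mutate] = rng.uniform(lb, ub, size=(int(mutate.sum()), D))

        # Standard Pbest / Gbest update, then the stagnation counter of Eq. (8).
        fit = np.array([f(xi) for xi in x])
        improved = fit < pb_fit
        pb[improved], pb_fit[improved] = x[improved], fit[improved]
        g = np.argmin(pb_fit)
        if pb_fit[g] < gb_fit:
            gb, gb_fit = pb[g].copy(), pb_fit[g]
        fm = fm + 1 if abs(gb_fit - prev_gb_fit) <= 0 else 0
    return gb, gb_fit

if __name__ == "__main__":
    D = 30
    best, best_fit = abpso(lambda z: float(np.sum(z ** 2)),   # Sphere function
                           lb=np.full(D, -100.0), ub=np.full(D, 100.0))
    print(best_fit)
```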

4 Convergence analysis of ABPSO

In this section, the convergence of ABPSO is analyzed by using stochastic process theory first, and then a set of experiments is designed to validate the conclusion obtained. For convenience of analysis, the mutation probability of ABPSO is fixed to zero in the following discussion.

4.1 Analysis of stochastic convergence

This section analyzes the stochastic convergence of ABPSO by regarding each particle's position as a stochastic vector. It is worth pointing out that stochastic process theory has also been used in the literature (Jiang et al. 2007) to analyze the stochastic convergence of the standard PSO.

First, assuming that the values of Pbest and Gbest remain constant during a period of time, ABPSO is reduced to a one-particle one-dimensional ABPSO system with fixed Pbest and Gbest. Second, the expectation and variance of the particle's position in this simplified system are calculated, and their convergence is analyzed. After that, the assumption of fixed Pbest is removed, and the convergence property of the particle's cognition is analyzed. Finally, the \(N\)-particle \(D\)-dimensional ABPSO system is recalled, and the results obtained from the one-particle one-dimensional system are applied to analyze the convergence of ABPSO.

4.1.1 Reduction of ABPSO

When the swarm operates on an optimization problem, the values of Pbest and Gbest are continually updated as the system evolves toward an optimum. For convenience, we consider the case where the values of Pbest and Gbest both remain constant during a period of time; then all particles evolve independently, so it is enough to discuss only one particle. Moreover, it follows from Eq. (6) that each dimension of a particle is updated independently of the others. Thus, without loss of generality, the algorithm description can be reduced to the one-dimensional case. Omitting the particle and dimension subscripts, the update equation becomes:

$$\begin{aligned} {{\varvec{X}}}_{t+1} =\left\{ \begin{array}{ll} {{\varvec{g}}}_t +{\varvec{\sigma }}_t \times N(0,1),&{}\quad rand<prob\\ {\varvec{X}}_t, &{}\quad \hbox {otherwise} \end{array} \right. \end{aligned}$$
(9)

It should be noted that the above simplification is only for analysis purposes; the original algorithm is recalled after the analysis is finished.

4.1.2 Convergence analysis of one-particle one-dimensional ABPSO system

In this subsection, the iterative equation of \({\varvec{EX}}_{t}\) is obtained, where \({\varvec{EX}}_{t}\) is the expectation of random variable \({\varvec{X}}_{t}\). Based on the iterative equation, next the convergence of sequence \(\{{\varvec{EX}}_{t}\}\) is analyzed.

Theorem 1

If and only if   \(0< prob\le 1\), iterative process \(\{{\varvec{EX}}_{t}\}\) is guaranteed to converge to \(0.5({{\varvec{Pb}}}_{i}+{{\varvec{Gb}}}).\)

Proof

First we discuss convergence condition of the iterative process \(\{{\varvec{EX}}_{t}\}\). According to Eq. (9), iterative equation of \(\{{\varvec{EX}}_{t}\}\) can be obtained as follows:

$$\begin{aligned} {\varvec{EX}}_{t+1} =0.5\times prob \times ({\varvec{Pb}}_i +{\varvec{Gb}})+(1-prob)\times {\varvec{EX}}_t. \end{aligned}$$
(10)

And, the characteristic equation of the above iterative process is

$$\begin{aligned} \lambda + prob-1=0 \end{aligned}$$
(11)

As we know, the convergence condition of the iterative process \(\{{\varvec{EX}}_{t}\}\) is that the absolute value of \(\lambda \) is less than 1, i.e., \(|\lambda |=|1-prob|<1\). Further, because \(0\le prob \le 1,\) the probability prob must satisfy \(0<prob\le 1.\)

After that, we prove that the iterative process \(\{{\varvec{EX}}_{t}\}\) converges to \(0.5({\varvec{Pb}}_{i}+{\varvec{Gb}}\)). Assuming that \(\{{\varvec{EX}}_{t}\}\) converges to \({\varvec{EX}}\), according to Eq. (10), we have

$$\begin{aligned} {\varvec{EX}}=0.5\times prob\times ({\varvec{Pb}}_i +{\varvec{Gb}})+(1-prob)\times {\varvec{EX}}\nonumber \\ \end{aligned}$$
(12)

Simplifying Eq. (12), obviously, \({\varvec{EX}}=0.5({\varvec{Pb}}_i +{\varvec{Gb}})\). \(\square \)

Theorem 2

For a given disturbance \(\Delta \ge 0\), if and only if \(0< prob\le 1,\) iterative process \(\{{{\varvec{DX}}}_t\}\) is guaranteed to converge to \(|{{\varvec{Pb}}_i-{\varvec{Gb}}}|+\Delta .\)

Proof

Supposing \(A=0.5({{\varvec{Pb}}_i +{\varvec{Gb}}})\) and \(B=|{{\varvec{Pb}}_i -{\varvec{Gb}}}|+\Delta \), we first discuss the convergence condition of the iterative process \(\{{\varvec{DX}}_{t}\}\). According to the definition of variance, \({\varvec{DX}}_{t}\) is calculated as follows:

$$\begin{aligned} {\varvec{DX}}_{t+1} ={\varvec{EX}}_{t+1}^{2} -({\varvec{EX}}_{t+1})^{2} \end{aligned}$$
(13)

Based on fundamental properties of the Gaussian distribution, the distribution function of \(\{{\varvec{X}}_{t}^{2}\}\) can be calculated as follows:

$$\begin{aligned}&{\varvec{X}}_{t+1}^2 =\left\{ \begin{array}{ll} H({\varvec{g}}_t, \varvec{\sigma }_{t}^{\prime } ), &{}\quad rand<prob\\ {\varvec{X}}_t^2, &{}\quad \hbox {otherwise}\\ \end{array} \right. \nonumber \\&{\varvec{g}}_t ={\varvec{A}}^{2}+{\varvec{B}}, \varvec{\sigma }_{t}^{\prime } =2{\varvec{B}}({2{\varvec{A}}^2+{\varvec{B}}}) \end{aligned}$$
(14)

where \({\varvec{g}}_t\) and \(\varvec{\sigma }_{t}^{\prime }\) are the mean and variance of the distribution function \(H\), respectively.

Further, iterative equation of \(\{{\varvec{EX}}_t^2\}\) is described as follows:

$$\begin{aligned} {\varvec{EX}}_{t+1}^{2} =(1-prob)\times {\varvec{EX}}_t^2 +prob \times ({\varvec{A}}^2+{\varvec{B}}) \end{aligned}$$
(15)

Substituting Eqs. (10) and (15) into Eq. (13), then we have

$$\begin{aligned} {\varvec{DX}}_{t+1}&= {\varvec{EX}}_{t+1}^2 -({\varvec{EX}}_{t+1} )^2 \nonumber \\&= (1-prob)\times {\varvec{EX}}_t^2 +prob\times ({\varvec{A}}^2+{\varvec{B}}) \nonumber \\&-\,(prob\times {\varvec{A}}+(1-prob)\times {\varvec{EX}}_t)^{2}\nonumber \\&= (1-prob)\times {\varvec{DX}}_t +prob\times (1-prob)\nonumber \\&\times \, ({\varvec{EX}}_t -{\varvec{A}})^2\;+prob\times {\varvec{B}} \end{aligned}$$
(16)

Since the characteristic root of the iterative process \(\{{\varvec{DX}}_t\}\) in Eq. (16) is again \(\lambda =1-prob\), the convergence condition \(|\lambda |<1\) together with \(0\le prob\le 1\) requires that the probability prob satisfy \(0<prob\le 1\).

After that, we prove that iterative process \(\{{\varvec{DX}}_t\}\) converges to \({\varvec{B}}\). According to Theorem 1, \(\{{\varvec{EX}}_t\}\) will converge to \({\varvec{A}}\) when \(0<prob\le 1\). Assuming that iterative process \(\{{\varvec{DX}}_t\}\) converges to \({\varvec{DX}}\), according to Eq. (16), we have

$$\begin{aligned} {\varvec{DX}}&= (1-prob)\times {\varvec{DX}}+prob\times (1-prob)\nonumber \\&\times \, ({\varvec{EX}}-{\varvec{A}})^2+prob\times {\varvec{B}} \end{aligned}$$
(17)

Simplifying Eq. (17), obviously, \({\varvec{DX}} = {\varvec{B}}\). \(\square \)
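The fixed points of Theorems 1 and 2 can be checked numerically. Below is a small Monte Carlo sketch of the simplified one-particle, one-dimensional system of Eq. (9) with Pbest, Gbest and \(\Delta\) held fixed; following the treatment in Eq. (14), a fresh sample is drawn with mean \(A\) and variance \(B\), and the numeric values of Pbest, Gbest, \(\Delta\) and prob are arbitrary illustrations.

```python
import numpy as np

rng = np.random.default_rng(0)
pb, gb, delta, prob = 3.0, 1.0, 0.5, 0.7
A = 0.5 * (pb + gb)            # predicted limit of E[X_t] (Theorem 1)
B = abs(pb - gb) + delta       # predicted limit of D[X_t] (Theorem 2)

runs, iters = 20000, 200
x = rng.uniform(-10.0, 10.0, size=runs)        # independent copies of X_0
for _ in range(iters):
    refresh = rng.random(runs) < prob          # copies that resample this step
    x = np.where(refresh, A + np.sqrt(B) * rng.standard_normal(runs), x)

print("empirical mean %.3f  vs  A = %.3f" % (x.mean(), A))
print("empirical var  %.3f  vs  B = %.3f" % (x.var(), B))
```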

4.1.3 Convergence analysis of Pbest

In the above analysis, it is supposed that the values of Pbest and Gbest remain constant, which is an ideal case. This subsection considers a more realistic situation: Pbest is constantly updated during the evolution process, but the value of Gbest is still assumed to remain constant and to be the best position found so far. As Jiang et al. (2007) showed, this assumption is reasonable, because the value of Gbest only influences the final convergent position; it does not influence the convergence property at all. This subsection still focuses on the one-particle one-dimensional ABPSO system.

Theorem 3

For a given disturbance \(\Delta \ge 0\), if and only if \(0<prob\le 1,\) iterative process \(\{{\varvec{Pb}}_i (t)\}\) will converge to \({\varvec{Gb}}\) with probability \(1.\)

Proof

Since \(0<prob\le 1,\) according to Theorems 1 and 2, the iterative process \(\{{\varvec{DX}}_{t}\}\) converges to \({\varvec{B}}\) and the iterative process \(\{{\varvec{EX}}_{t}\}\) converges to \({\varvec{A}}\). Hence \({\varvec{X}}_{t}\) converges to a random distribution with mean \({\varvec{A}}\) and variance \({\varvec{B}}\). No matter what the value of Pbest is, if \(0<prob \le 1,\) then

$$\begin{aligned} |{\varvec{Gb}}-{\varvec{EX}}| =0.5|{\varvec{Gb}}-{\varvec{Pb}}_i| <|{\varvec{Gb}}-{\varvec{Pb}}_{i}|+\Delta \,{=}\,{\varvec{DX}}\nonumber \\ \end{aligned}$$
(18)

Hence, the probability that \({\varvec{X}}_{t}\) falls within any neighbourhood of \({\varvec{Gb}}\) is greater than zero. Moreover, since we still adopt the standard strategy to update Pbest in this paper, shown as follows:

$$\begin{aligned} {\varvec{Pb}}_i (t+1)=\left\{ \begin{array}{ll} {\varvec{Pb}}_{i} (t), &{} \hbox {if } f({\varvec{X}}_i (t+1))\\ &{}\quad \ge f({\varvec{Pb}}_{i} (t))\\ {\varvec{X}}_i (t+1), &{} \hbox {otherwise}\\ \end{array} \right. \end{aligned}$$
(19)

Hence, we have

$$\begin{aligned} pro\left( \mathop {\lim }\limits _{t \rightarrow \infty } {\varvec{Pb}}_i (t)={\varvec{Gb}}\right) =1 \end{aligned}$$
(20)

It is evident that the iterative process \(\{{\varvec{Pb}}_{i}(t)\}\) will converge to Gbest with probability 1. \(\square \)

4.1.4 Convergence analysis of ABPSO

In this subsection, the \(N\)-particle \(D\)-dimensional ABPSO system is recalled, and the results obtained from one-particle one-dimensional ABPSO system are used to analyze its convergence.

Theorem 4

If \(0<prob\le 1,\) the ABPSO system will converge in mean square to Gbest.

Proof

According to the results of Theorems 1 and 2, if Pbest and Gbest keep constant during a period of time, then each dimension \(d\) of the particle \({\varvec{X}}_{i}\) will satisfy the following conclusions:

$$\begin{aligned}&\mathop {\lim }\limits _{t\rightarrow \infty } {\varvec{EX}}_i^d (t)=\frac{{\varvec{Pb}}_i^d +{\varvec{Gb}}^d}{2}\end{aligned}$$
(21)
$$\begin{aligned}&\mathop {\lim }\limits _{t\rightarrow \infty } {\varvec{DX}}_i^d (t)=|{\varvec{Pb}}_i^d -{\varvec{Gb}}^d|+\Delta ^d \end{aligned}$$
(22)

It appears from Eq. (6) that each dimension of the particle \({\varvec{X}}_{i}\) is updated independently from the others. Hence for the particle \({\varvec{X}}_{i}\), we have

$$\begin{aligned}&\mathop {\lim }\limits _{t\rightarrow \infty } {\varvec{EX}}_{i} (t)=0.5({\varvec{Pb}}_i +{\varvec{Gb}})\end{aligned}$$
(23)
$$\begin{aligned}&\mathop {\lim }\limits _{t\rightarrow \infty } {\varvec{DX}}_i (t)=| {{\varvec{Pb}}_i -{\varvec{Gb}}}|+\Delta . \end{aligned}$$
(24)

And from the result of Theorem 3, Pbest will converge to Gbest with probability 1, so we have

$$\begin{aligned} \mathop {\lim }\limits _{t\rightarrow \infty }\Delta&= \mathop {\lim }\limits _{t\rightarrow \infty } r_3 \times |{{\varvec{Pb}}_{k1} (t){-}{\varvec{Pb}}_{k2} (t)} |{\times } e^{f(Gb(t))-f(X_i (t))} \nonumber \\&\le \mathop {\lim }\limits _{t\rightarrow \infty } | {{\varvec{Pb}}_{k1} (t)-{\varvec{Pb}}_{k2} (t)}|=0 \end{aligned}$$
(25)

Considering \(\Delta \ge 0\), \(r_3 \in [0,1]\) and \(f(Gb(t))-f(X_i (t))\le 0\), we have \(\mathop {\lim }\nolimits _{t\rightarrow \infty } \Delta =0\). Further,

$$\begin{aligned} \mathop {\lim }\limits _{t\rightarrow \infty } {\varvec{EX}}_i (t)&= \mathop {\lim }\limits _{t\rightarrow \infty } 0.5({\varvec{Pb}}_i (t)+{\varvec{Gb}})={\varvec{Gb}},\nonumber \\ \mathop {\lim }\limits _{t\rightarrow \infty } {\varvec{DX}}_i (t)&= \mathop {\lim }\limits _{t\rightarrow \infty }| {{\varvec{Pb}}_i (t)-{\varvec{Gb}}}|+\mathop {\lim }\limits _{t\rightarrow \infty } \Delta =0. \end{aligned}$$
(26)

Therefore, \(\{{\varvec{EX}}_{i}(t)\}\) finally converges to Gbest and \(\{{\varvec{DX}}_{i}(t)\}\) finally converges to 0. This indicates that each sequence \(\{{\varvec{X}}_{i}(t)\}\) in the swarm stochastically evolves toward Gbest until it converges in mean square to Gbest. Since this conclusion applies to all particles, the whole ABPSO system converges to Gbest. \(\square \)

4.2 Experiment analysis of convergence

To validate the above theorems, this subsection tests the proposed ABPSO on the Sphere function with \(D=30\) for various values of prob and a swarm of 30 particles. The feasible search space is \(\varvec{I}=[-100,100]^{30}\), and the swarm is randomized in the corners of \(I\) at the start of each run. In this paper, a simple method is used: the standard deviation of the particles' fitness values measures the diversity of the particles, and the maximal distance between the particles and Gbest measures the convergence degree of the particles, although both are only approximate indicators.
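A small sketch of the two measures as just described; the use of the Euclidean norm for the particle-to-Gbest distance is an assumption.

```python
import numpy as np

def swarm_diversity(fitness):
    """Diversity: standard deviation of the particles' fitness values."""
    return float(np.std(fitness))

def convergence_degree(positions, gb):
    """Convergence degree: maximal (Euclidean, assumed) distance between particles and Gbest."""
    return float(np.max(np.linalg.norm(positions - gb, axis=1)))
```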

Figures 3 and 4 plot the convergence and diversity curves of the particles for different prob values, respectively. It can be seen from Fig. 3 that a fast decrease in the maximal distance occurs for all \(prob>0\) as the swarm gradually converges. The larger the value of prob, the faster the convergence of the swarm, and ABPSO converges fastest at prob = 1. However, as Fig. 4 shows, a large value of prob results in a fast decrease in the diversity of the particles, and the diversity curves tend towards the same shape when \(prob>0.5\). Through a large number of experiments, values in the range [0.5, 0.8] were found to work best. In the following experiments, prob was set to 0.7.

Fig. 3 Convergence plots for ABPSO on the Sphere function: the maximal distance between particles and Gbest is plotted against the number of iterations

Fig. 4 Diversity plots for ABPSO on the Sphere function: the diversity of the particles' fitness is plotted against the number of iterations

5 Experimental study on function optimization

5.1 Algorithms for comparison

In order to evaluate the effectiveness and efficiency of ABPSO, we compare its performance with five BBPSO-based algorithms. These algorithms for comparison are listed as follows:

  • BBPSO: the standard BBPSO first proposed by Kennedy (2003);

  • BBEXP: another version (BBExp) proposed by Kennedy (2003);

  • BBDE: the bare-bones differential evolution (Mahamed et al. 2009);

  • BBPSO-MC: BBPSO with mutation and crossover (Haibo et al. 2011);

  • BBJ: the bare-bones algorithm based on the jump strategy (Blackwell 2012);

  • ABPSO: the proposed algorithm with adaptive mutation.

Note that, according to the comparison results in Majid al-Rifaie and Blackwell (2012), the BBJ algorithm adopts the global neighbourhood, and its scale parameter is set to \(\alpha =0.75\) in this paper. For BBPSO-MC, the neighbourhood size is 2. The remaining parameter settings for these algorithms are inherited from the referenced papers.

5.2 Test functions

To provide a comprehensive comparison, 24 numerical optimization problems are used in this experiment (see Table 1). Based on their characteristics, these benchmarks are divided into three groups. The first group includes six classical benchmarks (F1–F6), which have been used by Blackwell (2012), Majid al-Rifaie and Blackwell (2012), Mahamed et al. (2009) and Haibo et al. (2011). These functions are all high dimensional, but the distribution of local optima is regular and the variables are separable. The second group includes four low-dimensional functions (F7–F10); these functions have only a few local optima, but their variables are all non-separable. In order to better approximate real-world problem behavior and to make the benchmark functions more challenging, the last group includes 14 functions from the CEC 2005 test suite (Suganthan et al. 2005). Their global optima are shifted by an arbitrary amount within the search space, and some of them are rotated so that the global optimum lies on or beyond the edge of the search space.

Table 1 Benchmark functions

5.3 Performance metrics

Three different performance measures (Majid al-Rifaie and Blackwell 2012; Engelbrecht 2006) are used in the following experiments. The first measure is the accuracy of the swarm, which is defined by the quality of the best position in terms of its closeness to the optimal position. Assuming that the optimal position of a problem is \(x_{opt}\) and the best position of the swarm at time \(t\) is \(Gb_{t}\), the accuracy (AC) is calculated as follows:

$$\begin{aligned} AC=|{f(Gb_t)-f(x_{opt})}| \end{aligned}$$
(27)

Another measure is the successful ratio (SR), i.e., the percentage of trials in which the swarm converges to within a specified accuracy (set to \(10^{-8}\) in this paper); it is defined by

$$\begin{aligned} SR=\frac{n'}{n}\times 100~\% \end{aligned}$$
(28)

When all the trials are successful, i.e., SR = 100 %, the average number of function evaluations (NFE) required to find the global optimum is considered. As the third measure, NFE describes the efficiency of an algorithm and is defined by

$$\begin{aligned} NFE=\frac{1}{n}\sum _{i=1}^n {FE} \end{aligned}$$
(29)

where \(n\) is the total number of trials, \(n'\) is the number of successful trials, and FE is the number of function evaluations before convergence.
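A small sketch of how AC, SR and NFE in Eqs. (27)–(29) can be computed from a set of independent trials; the \(10^{-8}\) threshold follows the text, and NFE is reported only when all trials succeed, as described above.

```python
import numpy as np

def summarize_trials(best_fits, f_opt, evals_to_converge, tol=1e-8):
    """AC per Eq. (27), SR per Eq. (28), and NFE per Eq. (29) over n trials.
    evals_to_converge[i] is the number of function evaluations used in trial i."""
    ac = np.abs(np.asarray(best_fits) - f_opt)     # per-trial accuracy
    success = ac <= tol
    sr = 100.0 * success.mean()                    # successful ratio in %
    nfe = float(np.mean(evals_to_converge)) if success.all() else None
    return float(ac.mean()), sr, nfe
```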

5.4 Comparison experiments

In this subsection, the maximal number of function evaluations is set to 300,000 and the number of particles is set to 30 for all the compared algorithms. For every test function, each algorithm carries out 30 independent runs. All the algorithms are developed in the MATLAB environment and run on a PC with a 2.6 GHz CPU and 2.0 GB of RAM. Tables 2, 3 and 4 record the statistical results of the accuracy, SR and NFE over the independent runs.

Table 2 Optimization results for group 1
Table 3 Optimization results for group 2

5.4.1 Experimental results on group 1

Since the test functions in group 1 are separable, they can be solved by divide-and-conquer methods such as that of Yang et al. (2008). It can be observed from Table 2 that only ABPSO found the global optima with 100 % successful ratio for functions F1 and F3–F6. For function F2, ABPSO also obtained the best AC value, followed by BBJ. Therefore, ABPSO clearly has the best universality on these problems. Furthermore, paired \(t\)-tests at the 0.05 level of significance (\(\alpha =0.05\)) are used to test the statistical significance of the results in terms of AC values. Here 'Y+' indicates that ABPSO is significantly better than the selected algorithm at the 0.05 level of significance by a two-tailed test, 'Y\(-\)' indicates that the selected algorithm is significantly better than ABPSO at the 0.05 level of significance by a two-tailed test, 'N' means that the difference among the compared algorithms is not statistically significant, and '–' stands for Not Applicable.

Table 4 Optimization results for group 3

The functions F1 and F2 are uni-modal problems. Of the two, F1 is relatively easy, because BBDE, BBJ and ABPSO all solved it with 100 % successful ratio, and BBPSO also found good solutions. However, ABPSO inevitably required more NFEs than BBDE and BBJ. The likely reason is that the adaptive disturbance operator sacrifices some convergence speed to improve the diversity of the particles. For function F2, a well-known hard test problem, ABPSO has the best AC value among the six algorithms and significantly outperforms BBEXP, BBDE, BBJ and BBPSO-MC, as the t-test results show.

The functions F3–F6 are separable multi-modal problems, where the number of local optima increases exponentially with the dimension of the decision variables. The functions F3 and F4 are the hardest, with the deepest local optima among these problems. Table 2 shows that for F3 and F4, only ABPSO succeeded in converging to the global optima with 100 % successful ratio, followed by BBJ. Furthermore, the t-test values show that the performance of ABPSO is significantly better than that of the other five algorithms (excluding BBJ for F3) in terms of AC. For the functions F5 and F6, BBEXP, BBPSO-MC and ABPSO all converged to the global optima with 100 % successful ratio, with BBEXP showing faster convergence than BBPSO-MC and ABPSO, as the NFE values in Table 2 show.

Furthermore, Fig. 5 shows the convergence curves of the six algorithms for selected functions. The initial convergence speed of ABPSO on these functions is not very high, but after a learning period its convergence speed increases. This trend is much clearer for Rosenbrock and Rastrigin, on which ABPSO provides the best performance. Moreover, once the basin containing the global optimum is found, ABPSO accelerates abruptly, as seen for the Rastrigin and Schwefel 2.26 functions.

Fig. 5 Convergence curves for selected functions of group 1

5.4.2 Experimental results on group 2

As Table 1 shows, the test functions in group 2 are non-separable. Although they contain only a few variables, it is difficult to find their optimal solutions by adjusting a single variable at a time. Table 3 shows the optimization results for group 2 with the six algorithms. Furthermore, paired \(t\)-tests at the 0.05 level of significance (\(\alpha =0.05\)) are used to test the statistical significance of the results in terms of AC values; the \(t\)-test results are indicated in the 6th and 11th columns of the table.

It can be seen from Table 3 that ABPSO and BBPSO-MC show no significant difference: both find the global optima with 100 % successful ratio for F7, F8 and F9, although ABPSO converges faster than BBPSO-MC, as their NFE values reflect. For function F10, ABPSO has the second best values in terms of AC and SR, and BBPSO-MC has the best ones; however, the five algorithms (excluding BBDE) show no significant difference, as their \(t\)-test values reflect. BBPSO successfully found the global optima of F7 and F8 using the fewest NFEs, but its performance is worse than both ABPSO and BBPSO-MC for F9 and F10 in terms of AC and SR. With slightly more NFEs, BBEXP and BBDE also solved F7 and F8 successfully. It is worth noting that BBJ obtained the worst AC and SR values for F8 and F9, and the worst AC value for F10; this is partly because the jump strategy does not consider the relationships among decision variables. Therefore, ABPSO is a highly competitive algorithm for solving non-separable optimization problems.

5.4.3 Experimental results on group 3

The functions in group 3 are more challenging versions of common functions from the first group, whose global optima are shifted and/or rotated; their concrete expressions and other details are given in Suganthan et al. (2005). Table 4 shows the optimization results for group 3 with the six algorithms. Furthermore, paired \(t\)-tests at the 0.05 level of significance (\(\alpha =0.05\)) are used to test the statistical significance of the results in terms of AC values; the \(t\)-test results are indicated in the 6th and 11th columns of the table.

For the relatively easy functions F11 and F12, ABPSO, BBPSO and BBDE are all able to find the global optima with 100 % successful ratio. However, ABPSO spent more NFEs than BBPSO and BBDE. As is well known, the convergence and diversity of an algorithm generally conflict with each other; in ABPSO, the adaptive disturbance operator sacrifices part of the convergence speed of the swarm in order to improve the diversity of the particles.

For function F13, none of the six algorithms found the global optimum over 30 independent runs, as their SR values are all 0. Analyzing the t-test values, we can see that the solutions of BBPSO are significantly better than those of the other algorithms, while BBPSO-MC has the worst solutions; the remaining four algorithms do not significantly outperform each other in terms of AC. In other words, ABPSO, BBEXP, BBDE and BBJ all belong to the second best rank for function F13. Similarly, for function F23, BBJ has the best AC value, which is significantly better than those of the other algorithms; the remaining five algorithms, including ABPSO, do not significantly outperform each other in terms of AC.

For functions F14, F15, F18, F19, F20, F22 and F24, ABPSO has the best AC values among all the compared algorithms. In particular, for function F19, ABPSO is the only one of the six algorithms able to solve it with 100 % successful ratio, followed by BBJ with 50 %. For F14 and F20, the AC values of ABPSO are significantly better than those of the other algorithms. For function F15, both BBDE and BBPSO-MC have good AC values that are not significantly inferior to ABPSO, although their mean AC values are both worse than that of ABPSO. Similarly, for function F22, both BBJ and BBDE have good solutions that are not significantly inferior to ABPSO. For functions F18 and F24, BBJ and BBDE respectively also find solutions competitive with ABPSO, as their t-test values reflect.

For function F16, BBPSO has the best AC value, and ABPSO has the second best, which is significantly better than those of the remaining four algorithms. Similarly, for function F21, BBDE has the best AC value and ABPSO the second best, which is significantly better than those of BBPSO, BBEXP, BBJ and BBPSO-MC. For function F17, BBPSO-MC has the best results in terms of AC and SR, followed by ABPSO and BBDE. Although ABPSO has a worse AC value than BBDE, it is able to find the global optimum with a higher successful ratio, the SR values of BBDE and ABPSO being 0 and 23.33 %, respectively. Overall, ABPSO is also a highly competitive optimization algorithm for these shifted and/or rotated functions.

5.4.4 Summary

We observe that the proposed ABPSO in general outperforms the other algorithms on most of the test functions. To assess the overall performance, the dominance index from Hu et al. (2012) is adopted to measure the ABPSO algorithm quantitatively. Considering any two algorithms A and B, we say that algorithm A dominates algorithm B on a function if, within the maximal number of iterations, the AC value obtained by algorithm A is significantly better than that obtained by algorithm B. For each algorithm, we count the number of dominated algorithms on each function and then compute the dominance rate from the cumulative number of dominated algorithms over the 24 functions. The dominance rates of BBPSO, BBEXP, BBDE, BBJ, BBPSO-MC and ABPSO are then 24.2, 24.2, 22.5, 28.3, 26.7 and 53.3 %, respectively. ABPSO has the largest dominance rate, which means that ABPSO is the most generalized algorithm for the selected test functions with their different properties.
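A small sketch of the dominance-rate computation as described; normalizing the cumulative count by the maximum possible count (24 functions × 5 competitors), which yields percentages of the reported magnitude, is an assumption.

```python
import numpy as np

def dominance_rates(dominates):
    """dominates[f, a, b] is True when algorithm a is significantly better than
    algorithm b (in AC) on function f; returns one dominance rate per algorithm in %."""
    n_funcs, n_algs, _ = dominates.shape
    counts = dominates.sum(axis=(0, 2))              # dominated opponents, summed over functions
    return 100.0 * counts / (n_funcs * (n_algs - 1))  # assumed normalization
```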

5.5 Analysis of the self-adaptive mutation

The performance of ABPSO has been discussed in Sect. 5.4 by comparison with five BBPSO-based algorithms, but the effect of the self-adaptive mutation in ABPSO is not yet clear. Therefore, this experiment analyzes its effect on the performance of our algorithm, where the mutation probability \(p_{m}\) is selected in turn from \(\{\)0, 0.02, 0.04, 0.1, 0.2, 0.4, 0.6\(\}\); obviously, \(p_{m}\) = 0 indicates the case without mutation. The remaining parameters in this experiment take the same values as in Sect. 5.4.

Taking group 1 as an example, Table 5 shows the results obtained by the proposed algorithm with different values of \(p_{m}\). We can easily see that, for F1 and F2, the proposed algorithm provided the best AC values when \(p_{m}=0,\) and its AC and/or SR values gradually worsened as the mutation probability \(p_{m}\) increased. Similarly, the NFEs used by the proposed algorithm for F6 gradually increased as \(p_{m}\) increased. This means that the adaptive disturbance alone is sufficient for the proposed algorithm to solve these three functions. When \(p_{m}=0.2\), the proposed algorithm provided the best AC and SR values for F3 and successfully solved function F4 with the smallest NFE. For F5, the proposed algorithm found the global optimum with the smallest NFE when \(p_{m}=0.1\). Taking 0.2 (for F3 and F4) or 0.1 (for F5) as a reference point, the performance of ABPSO worsens whether the mutation probability is increased or decreased.

Table 5 Optimization results produced by the proposed algorithm with different \(p_{m}\) values for group 1

Furthermore, the Friedman test is adopted to assess the influence of \(p_m\) on our algorithm in a statistical sense. The Friedman test is a non-parametric statistical test; the hypothesis tested here is that the algorithms with different \(p_m\) values show equal optimization performance. To test this hypothesis, a rank matrix \((R_{ij})_{n\times k} \) is first constructed, where \(R_{ij} \) is the rank (from 1 to k) assigned to the \(j\)-th \(p_{m}\) value on the \(i\)-th test function. Here, for a fixed \(p_{m}\) value, the algorithm obtaining the lowest AC is assigned rank 1 (if two \(p_{m}\) values give the same AC, their NFE values are compared). Based on the results in Table 5, we calculate \(T_{f}=0.8879\), while the 0.95 quantile of the F-distribution with \(k-1\) and \((k-1)(n-1)\) degrees of freedom is 2.87, where k = 7 and n = 6. Since \(T_{f}=0.8879\) is smaller than 2.87, the null hypothesis that the algorithms with different \(p_{m}\) values have the same performance cannot be rejected at the 0.05 significance level. This implies that no single fixed \(p_m\) value is best for all functions. In addition, the rank sums of ABPSO-0, \(-\)0.02, \(-\)0.04, \(-\)0.1, \(-\)0.2, \(-\)0.4 and \(-\)0.6 are 24, 24, 23, 18, 19, 27 and 33, respectively. This partially indicates that ABPSO obtains good solutions for all six functions when \(p_{m}=0.1\) and 0.2. Overall, we propose the self-adaptive mutation to trade off different functions, and set the upper limit of \(p_{m}\) to 0.2.
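A small sketch of a rank-based test of this kind using SciPy. The Iman–Davenport transform of the Friedman chi-square into an F-statistic is one common choice and is only assumed here to correspond to the reported \(T_f\); the exact statistic and the critical value 2.87 quoted in the text may come from a slightly different formulation or table, and the AC matrix below is a random placeholder standing in for Table 5.

```python
import numpy as np
from scipy import stats

# Placeholder AC matrix: rows = n = 6 test functions, columns = k = 7 p_m settings.
ac = np.random.default_rng(1).random((6, 7))

# Friedman chi-square over the k treatments (columns).
chi2, p_value = stats.friedmanchisquare(*[ac[:, j] for j in range(ac.shape[1])])

# Iman-Davenport F transform of the chi-square statistic (assumed form of T_f).
n, k = ac.shape
t_f = (n - 1) * chi2 / (n * (k - 1) - chi2)
f_crit = stats.f.ppf(0.95, k - 1, (k - 1) * (n - 1))

print("T_f = %.4f, critical F(%d, %d) = %.2f" % (t_f, k - 1, (k - 1) * (n - 1), f_crit))
print("reject H0" if t_f > f_crit else "cannot reject H0 at the 0.05 level")
```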

Moreover, the self-adaptive mutation is inserted into BBEXP and BBDE to validate its effectiveness. Here, BBEXP with the self-adaptive mutation is denoted by BBEXP1, and BBDE with the self-adaptive mutation is denoted by BBDE1. Table 6 shows the results obtained by BBEXP, BBEXP1, BBDE and BBDE1 on the group 1 functions.

Table 6 Results produced by the BBEXP and BBDE with the self-adaptive mutation for group 1

It can be seen that the performance of BBDE is significantly improved by the self-adaptive mutation for functions F2, F3, F4 and F5, as their AC and SR values reflect; for the remaining functions F1 and F6, BBDE1 achieves results similar to BBDE in terms of AC and SR. For the BBEXP algorithm, the self-adaptive mutation also improves its ability to tackle functions F3 and F4, but its ability to tackle the uni-modal functions F1 and F2 decreases. This degradation can be partly attributed to the reduced local exploitation of the swarm. Further, by increasing the maximal number of evaluations from 300,000 to 450,000, the experimental results on functions F1 and F2 supported this explanation, with AC(F1) = 7.4534E\(-\)8 and AC(F2) = 0.2157.

5.6 Analysis of the disturbance factor

Like the self-adaptive mutation, the disturbance factor plays an important role in improving the performance of ABPSO; therefore, this experiment analyzes it in detail. The remaining parameters in this experiment take the same values as in Sect. 5.4. For simplicity, ABPSO without the disturbance is denoted by ABPSO1. Table 7 shows the results obtained by ABPSO and ABPSO1 on different kinds of functions.

Table 7 Results produced by the ABPSO with/without the adaptive disturbance for six functions

It can be seen that the performance of ABPSO is improved by the adaptive disturbance for the uni-modal functions F1 and F2, as well as for two functions with non-separable variables, F9 and F10. For all these functions, the mean values of ABPSO are clearly better than those of ABPSO1 in terms of both the AC and SR measures. For the multi-modal functions F3 and F4, ABPSO also obtains good results in terms of both AC and SR, which are slightly better than those of ABPSO1.

6 Conclusion

In this paper, an improved BBPSO algorithm with an adaptive disturbance is proposed. The proposed algorithm employs the adaptive disturbance to balance the diversity and convergence speed of the swarm, and introduces a self-adaptive mutation to improve global convergence. Moreover, stochastic process theory is applied to analyze the convergence of ABPSO; the analysis leads to a convergence condition for ABPSO, and the corresponding parameter range is given. Finally, the effectiveness of ABPSO is verified experimentally by comparing it with several BBPSO-based algorithms on 24 test functions with different characteristics such as uni-modality, multi-modality, rotation and ill-conditioning. The results show that ABPSO is highly competitive with the other BBPSO-based algorithms.