A Cluster-Based Competitive Particle Swarm Optimizer with a Sparse Truncation Operator for Multi-Objective Optimization

https://doi.org/10.1016/j.swevo.2022.101083

Abstract

Many different types of multi-objective optimization problems, e.g., multi-modal problems and large-scale problems, have been solved effectively by a number of tailored multi-objective evolutionary algorithms. However, little attention has been paid to sparse optimization problems, in which most decision variables of the Pareto optimal solutions are zero. Recently, algorithms for sparse problems have developed rapidly, and many sparse optimization problems in machine learning, such as the search for lightweight neural networks, can now be solved with the help of multi-objective evolutionary algorithms. In this paper, we introduce a sparse truncation operator, which uses an accumulative gradient value as the criterion for setting a decision variable to zero. In addition, to balance exploration and exploitation, a cluster-based competitive particle swarm optimizer is proposed, which combines the advantages of particle swarm optimization and the competitive swarm optimizer to search efficiently and escape from local optima. Consequently, aiming at solving sparse multi-objective optimization problems, a novel cluster-based competitive particle swarm optimizer with a sparse truncation operator is proposed, and experimental results show that the proposed algorithm outperforms its peers on sparse test instances and neural network training tasks.

Introduction

Convex optimization is an important and extensively studied field in the optimization community [1]. There are many traditional mathematical methods for solving single-objective convex optimization problems, e.g., the gradient descent method [2], Newton's method [3], and quasi-Newton methods [4]. These methods can easily find the minimum of a convex function over a convex set by using first-order or second-order gradient information. For non-differentiable or multi-modal single-objective optimization problems, or for multi-objective optimization problems, evolutionary algorithms have been widely used [5,6]. Recently, some researchers have applied the idea of gradient descent to evolutionary algorithms to improve search efficiency and maintain a good balance between exploration and exploitation. For example, Han et al. proposed an adaptive gradient multi-objective particle swarm optimization (AGMOPSO) [7], which uses a multi-objective gradient method to update the archive. Tian et al. introduced gradient information into the simulated binary crossover (SBX) to determine the search direction [8], and this strategy searches efficiently when training deep neural networks.

Since the first evolutionary algorithm (EA) was proposed by Schaffer for solving multi-objective optimization problems (MOPs) [9], a great number of population-based meta-heuristic multi-objective evolutionary algorithms (MOEAs) have been developed to solve MOPs [10], [11], [12], [13], owing to their global search ability and insensitivity to the properties of the objective functions. These MOEAs draw on various biological mechanisms [14,15]. Among them, a family of swarm intelligence algorithms, i.e., particle swarm optimization (PSO) algorithms [16], [17], [18], shows promising performance due to its high search efficiency and fast convergence.

In PSO, each particle represents a solution, and particles exchange information by learning from the global/local best and their own personal best (pBest). Blackwell and Kennedy [19] noted that most PSO variants use the global best (gBest) rather than the local best (lBest) when searching for optimal solutions; however, learning from gBest easily leads to premature convergence. Accordingly, many methods have been proposed to prevent particles from being trapped in local optima. One typical strategy is to gather information from local neighbors [20]. Another is to modify the topology of PSO [21] to improve the diversity of the swarm, such as subpopulation methods [22], pyramid topology methods [23], and multi-swarm methods [24,25]. However, these methods are time-consuming and may slow down convergence [19].
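As background for the particle updating strategies discussed later, the following minimal sketch shows the canonical gBest/pBest velocity and position update described above. The function name `pso_update` and the coefficient values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def pso_update(x, v, p_best, g_best, w=0.4, c1=2.0, c2=2.0, rng=None):
    """Canonical PSO update for one particle (illustrative coefficients)."""
    rng = np.random.default_rng() if rng is None else rng
    r1, r2 = rng.random(x.size), rng.random(x.size)
    # velocity: inertia + cognitive pull toward pBest + social pull toward gBest
    v_new = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)
    x_new = x + v_new
    return x_new, v_new
```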

To overcome the weakness of premature convergence in PSO, Cheng and Jin proposed a competitive swarm optimizer (CSO) [26], where a competitive mechanism is introduced to improve the diversity of the swarm. In CSO, two particles are randomly selected from the swarm, and their fitness values are compared. The winner with the smaller fitness value passes directly to the next generation, and the loser learns from the winner. Unlike PSO, neither gBest nor pBest is used in CSO, so particles can learn from various positions and have a better chance to escape from local optima [27,28]. However, CSO is less competitive in terms of search efficiency [29], because it may exploit potentially misleading features of a good particle while ignoring useful information from a bad one [30]. In the recently proposed LMOCSO [29], an "acceleration" term is added to the position updating formula of classical CSO to help the algorithm converge faster. On the other hand, CMOPSO [31] introduces the competitive mechanism into classical PSO, which significantly improves search effectiveness.
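For contrast with the PSO rule above, the sketch below illustrates one pairwise competition step of canonical CSO: the winner is kept unchanged, while the loser learns from the winner and, scaled by a small coefficient, is also pulled toward the swarm's mean position. The function name `cso_update` and the coefficient value are illustrative.

```python
import numpy as np

def cso_update(x_w, x_l, v_l, swarm_mean, phi=0.1, rng=None):
    """One CSO competition: winner x_w kept; loser x_l moves toward the
    winner and (weighted by phi) toward the swarm mean position."""
    rng = np.random.default_rng() if rng is None else rng
    r1, r2, r3 = (rng.random(x_l.size) for _ in range(3))
    v_l_new = r1 * v_l + r2 * (x_w - x_l) + phi * r3 * (swarm_mean - x_l)
    x_l_new = x_l + v_l_new
    return x_l_new, v_l_new
```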

Most multi-objective applications [32], [33], [34] can be solved efficiently by tailored MOEAs. However, few algorithms focus on sparse MOPs, in which most decision variables of the Pareto optimal solutions are zero [35]. Many machine learning tasks, such as feature selection [36,37], pattern recognition [38], critical node detection [39], and neural network training, regard sparse optimization as a means of dimensionality reduction and generalization improvement.

Taking neural network training as an example, many weights become very small during training, and it is desirable to push them to zero to improve generalization [40]. There are various ways to prune hidden nodes in the neural network community, such as adding a group lasso penalty to the loss function [41], adding gate functions to the hidden layers [42], and dropout [43]. To obtain a good model, accuracy and complexity must be balanced by selecting appropriate hyper-parameters, which is very time-consuming in traditional approaches. In contrast, MOEAs can obtain a set of trade-off neural networks in a single run, without the hyper-parameters required by standard NN training methods [8]. In addition, evolutionary algorithms can also optimize the hyper-parameters of neural networks (rather than the weights) [44], such as the network structure [45].
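As a concrete illustration of the first pruning approach mentioned above, the sketch below adds a group lasso penalty (the sum of per-group L2 norms, with each group collecting the incoming weights of one hidden node) to a mean-squared-error loss. The function, grouping, and regularization strength are illustrative and not tied to the exact formulation of reference [41].

```python
import numpy as np

def group_lasso_loss(y_true, y_pred, weight_groups, lam=1e-3):
    """MSE loss plus a group lasso penalty; each group holds the incoming
    weights of one hidden node, so whole nodes are pushed toward zero.
    `lam` is an illustrative regularization strength."""
    mse = np.mean((y_true - y_pred) ** 2)
    penalty = sum(np.linalg.norm(g) for g in weight_groups)  # sum of group L2 norms
    return mse + lam * penalty
```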

In [35], Tian et al. proposed a novel evolutionary algorithm for solving large-scale sparse MOPs, i.e., SparseEA, where two different encoding strategies are introduced to initialize the population and generate sparse solutions. Due to its ability to generate sparse solutions, SparseEA is capable of producing lightweight neural networks, which may be of particular importance in some real-world applications [35,46]. So far, only a limited number of MOEAs have been developed to solve large-scale sparse problems, and most existing MOEAs for sparse optimization adopt the bi-level encoding strategy proposed in SparseEA as their base strategy [35,47,48].
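For context, the bi-level encoding adopted from SparseEA represents each solution by a real-valued vector together with a binary mask, whose element-wise product yields the actual decision vector, so zeros in the mask force zeros in the solution. The sketch below shows only this decoding step; the names and the usage example are illustrative, and SparseEA's initialization and variation operators are omitted.

```python
import numpy as np

def decode_bilevel(dec, mask):
    """Element-wise product of a real-valued vector and a binary mask:
    zeros in the mask force zeros in the actual decision vector."""
    return dec * mask

# Illustrative usage: a 10-dimensional solution with only 3 non-zero variables.
rng = np.random.default_rng(0)
dec = rng.uniform(-1.0, 1.0, size=10)
mask = np.zeros(10)
mask[rng.choice(10, size=3, replace=False)] = 1.0
x = decode_bilevel(dec, mask)
```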

To improve the efficiency of MOEAs on sparse and large-scale problems, we simultaneously consider sparsity, exploration, and exploitation during the search process. We first introduce a sparse truncation operator that finds sparse solutions more efficiently by using a gradient-based criterion rather than the binary encoding strategy in [35,47,48]; this newly proposed operator directly sets decision variables to zero. In addition, a new particle updating strategy is proposed to balance exploration and exploitation by combining the advantages of PSO and CSO. This strategy makes the algorithm better at escaping from local optima while maintaining a relatively high convergence speed. The main contributions of this paper are as follows.

  • 1)

    To generate sparse non-dominated solutions efficiently, we propose a sparse truncation operator (SparseTO).

Inspired by the gradient descent method, we use an accumulative gradient value as the truncation criterion for setting decision variables to zero. Specifically, the dimensions with smaller accumulative gradient values are set to zero, because a small gradient value often indicates a low search speed or a solution trapped in a local optimum. Moreover, we adopt a pairwise competition strategy to avoid setting variables to zero too aggressively (see the sketch after this list).

  • 2)

    In order to avoid premature convergence of PSO, we propose a cluster-based competitive particle swarm optimizer, CCPSO.

In this strategy, the swarm is clustered into three sub-swarms by k-means [49] to improve diversity, and the pairwise competition strategy is employed within each sub-swarm separately. The winner, updated with the PSO velocity rule, carries out exploitation, while the loser's velocity is updated with the CSO rule to carry out exploration (also sketched after this list). By taking advantage of both PSO and CSO, this optimizer achieves a good balance between exploitation and exploration.

  • 3)

    An algorithm called ST-CCPSO is proposed to solve sparse multi-objective optimization problems by combining the sparse operator (SparseTO) and the particle updating strategy (CCPSO). To verify the competitive performance of ST-CCPSO, four state-of-the-art MOEAs are compared with the proposed method on 13 sparse test instances and 15 classification datasets for neural network training.
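To make the two operators above more concrete, the following minimal sketch gives one possible interpretation; it is not the authors' implementation. `sparse_truncation` zeroes the dimensions whose accumulated gradient magnitude is smallest and leaves the pairwise comparison between the truncated and original candidates to the caller, so zeros are not applied too aggressively. `ccpso_update` clusters the swarm into three sub-swarms with k-means and, within each sub-swarm, updates pairwise winners with a PSO-style rule and losers with a CSO-style rule. The function names, the truncation ratio, the coefficient values, and the use of a scalar fitness for pairing (the actual algorithm handles multiple objectives) are all illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def sparse_truncation(x, acc_grad, zero_ratio=0.2):
    """SparseTO sketch for one particle: zero the dimensions whose
    accumulated |gradient| is smallest; the caller then compares the
    truncated and original candidates pairwise so zeros are not
    applied too aggressively."""
    k = max(1, int(zero_ratio * x.size))   # number of dimensions to zero (assumed ratio)
    idx = np.argsort(acc_grad)[:k]         # smallest accumulated gradients
    x_trunc = x.copy()
    x_trunc[idx] = 0.0
    return x_trunc

def ccpso_update(X, V, P_best, fitness, n_clusters=3,
                 w=0.4, c1=2.0, c2=2.0, phi=0.1, rng=None):
    """CCPSO sketch: cluster the swarm with k-means and, inside each
    sub-swarm, pair particles at random; the winner follows a PSO-style
    update (here toward its pBest and the sub-swarm centroid, an
    assumption), the loser follows a CSO-style update toward the winner."""
    rng = np.random.default_rng() if rng is None else rng
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(X)
    X, V = X.copy(), V.copy()
    dim = X.shape[1]
    for c in range(n_clusters):
        members = np.flatnonzero(labels == c)
        if members.size < 2:
            continue
        rng.shuffle(members)
        centroid = X[members].mean(axis=0)
        for i, j in zip(members[0::2], members[1::2]):   # random pairs in the sub-swarm
            win, lose = (i, j) if fitness[i] <= fitness[j] else (j, i)
            r1, r2 = rng.random(dim), rng.random(dim)
            # winner: PSO-style exploitation
            V[win] = w * V[win] + c1 * r1 * (P_best[win] - X[win]) + c2 * r2 * (centroid - X[win])
            X[win] = X[win] + V[win]
            # loser: CSO-style exploration, learning from the winner
            r3, r4, r5 = (rng.random(dim) for _ in range(3))
            V[lose] = r3 * V[lose] + r4 * (X[win] - X[lose]) + phi * r5 * (centroid - X[lose])
            X[lose] = X[lose] + V[lose]
    return X, V
```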

The remainder of this paper is organized as follows. The framework and details of ST-CCPSO are presented in Section 2. Section 3 reports the results on both the test instances and the neural network training tasks. Conclusions are drawn in Section 4.

Section snippets

The framework of ST-CCPSO

As presented in Algorithm 1, the positions and velocities of the particles are initialized at the beginning, and the iteration starts. The objective values of each particle are calculated. Inspired by the gradient descent method of convex optimization, a sparse truncation operator (SparseTO) is applied to generate sparse solutions according to the gradient of a randomly selected objective function fm with respect to the original solutions. Subsequently, pBest is updated, and the archive is maintained. Finally, the …
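The accumulated gradient used by SparseTO can be maintained as sketched below: the gradient of the randomly selected objective fm is estimated here by central finite differences (an assumption about how the gradient is obtained; the paper does not necessarily compute it this way), and its magnitude is added to a per-dimension accumulator that is later passed to the `sparse_truncation` sketch given earlier.

```python
import numpy as np

def accumulate_gradient(f_m, x, acc_grad, eps=1e-6):
    """Estimate the gradient of the selected objective f_m at x by
    central finite differences and add its magnitude to the running
    per-dimension accumulator that drives the truncation criterion."""
    grad = np.empty_like(x)
    for d in range(x.size):
        e = np.zeros_like(x)
        e[d] = eps
        grad[d] = (f_m(x + e) - f_m(x - e)) / (2.0 * eps)
    return acc_grad + np.abs(grad), grad
```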

Experimental Study

In this section, the experimental settings are presented first, including the test problems, performance indicators, compared algorithms, and parameter settings. Then, the effectiveness of the particle updating strategy and the sparsity strategy of ST-CCPSO is studied. In addition, to verify the promising performance of the proposed algorithm, ST-CCPSO and its peers are tested on instances of sparse MOPs. Finally, an application, i.e., neural network training, is used to show …

Conclusion

Many real-world multi-objective optimization problems (such as neural network training) are sparse MOPs, yet few algorithms specialize in them. Consequently, a novel algorithm, called ST-CCPSO, is proposed in this paper to solve sparse MOPs. With the help of the sparse operator, SparseTO, the goal of sparsity is achieved. In addition, a cluster-based competitive particle swarm optimizer (CCPSO) is proposed to avoid the premature convergence of PSO, and it updates …

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

This work was supported in part by the National Key R&D Program of China under Grant 2019YFA0708700; in part by the National Natural Science Foundation of China under Grant 62173345; and in part by the China University of Petroleum (East China) Postgraduate Innovation Project under Grant YCX2021146.

References (75)

  • J. Zhou et al., An opposition-based learning competitive particle swarm optimizer, 2016 IEEE Congress on Evolutionary Computation (CEC) (2016).
  • Q. Lin et al., A novel multi-objective particle swarm optimization with multiple search strategies, European Journal of Operational Research (2015).
  • V. Trivedi et al., A simplified multi-objective particle swarm optimization algorithm, Swarm Intell. (2020).
  • J.D. Gibbons, Nonparametric Statistical Inference (1985).
  • S. Boyd et al., Convex optimization, IEEE Trans. Automat. Contr. (2006).
  • A. Rakhlin et al., Making gradient descent optimal for strongly convex stochastic optimization (2011).
  • M.A. Fernández et al., A Newton method using exact Jacobians for solving fluid-structure coupling, Comput. Struct. (2005).
  • J.D. Ser et al., Bio-inspired computation: Where we stand and what's next, Swarm Evol. Comput. (2019).
  • H. Han et al., Adaptive gradient multiobjective particle swarm optimization, IEEE Trans. Cybern. (2018).
  • S. Yang et al., A gradient-guided evolutionary approach to training deep neural networks, IEEE Trans. Neural Netw. Learn. Syst. (2021).
  • J.D. Schaffer, Multiple objective optimization with vector evaluated genetic algorithms, Proc. 1st Int. Conf. Genet. Algorithms (1985).
  • Q. Zhang et al., MOEA/D: A multiobjective evolutionary algorithm based on decomposition, IEEE Trans. Evol. Comput. (2007).
  • K. Deb et al., A fast and elitist multiobjective genetic algorithm: NSGA-II, IEEE Trans. Evol. Comput. (2002).
  • E. Zitzler et al., SPEA2: Improving the strength Pareto evolutionary algorithm, Proc. 5th Conf. Evol. Methods Design Optim. Control Appl. Ind. Probl. (2001).
  • A. Zhou et al., Multiobjective evolutionary algorithms: A survey of the state of the art, Swarm Evol. Comput. (2011).
  • H. Duan et al., Pigeon-inspired optimization: A new swarm intelligence optimizer for air robot path planning, Int. J. Intell. Comput. Cybern. (2014).
  • J. Kennedy et al., Particle swarm optimization, IEEE Int. Conf. Neural Netw. (1995).
  • E.H. Houssein et al., Major advances in particle swarm optimization: Theory, analysis, and application, Swarm Evol. Comput. (2021).
  • T. Blackwell et al., Impact of communication topology in particle swarm optimization, IEEE Trans. Evol. Comput. (2019).
  • N. Lynn et al., Population topologies for particle swarm optimization and differential evolution, Swarm Evol. Comput. (2018).
  • M. Løvbjerg et al., Hybrid particle swarm optimiser with breeding and subpopulations, Proc. Genetic and Evolutionary Computation Conference (GECCO 2001) (2001).
  • J. Kennedy et al., Population structure and particle swarm performance, Proc. Congr. Evol. Comput. (2002).
  • G. Yen et al., Dynamic multiple swarms in multiobjective particle swarm optimization, IEEE Trans. Syst. Man Cybern. Syst. (2009).
  • R. Cheng et al., A competitive swarm optimizer for large scale optimization, IEEE Trans. Cybern. (2015).
  • G. Xiong et al., A simplified competitive swarm optimizer for parameter identification of solid oxide fuel cells, Energy Convers. Manag. (2020).
  • W. Guo et al., A grouping particle swarm optimizer with personal-best-position guidance for large scale optimization, IEEE/ACM Trans. Comput. Biol. Bioinform. (2018).
  • Y. Tian et al., Efficient large-scale multiobjective optimization based on a competitive swarm optimizer, IEEE Trans. Cybern. (2020).