
A new sequence optimization algorithm based on particle swarm for machine learning

  • Original Research
  • Journal of Ambient Intelligence and Humanized Computing

Abstract

With the coming of the 5G era, the efficiency and accuracy of artificial intelligence and machine learning algorithms face stricter requirements and greater challenges. Although the artificial neural network (ANN) and the deep neural network (DNN) perform well in classification and regression, their large amount of computation leads to slow responses. The sequential minimal optimization (SMO) algorithm, by contrast, is very fast but not accurate enough. This paper proposes a sequential optimization algorithm based on particle swarm optimization that balances accuracy against response speed. Compared with ANN, DNN, and SMO, the approach not only achieves high computational efficiency but also performs well in classification and regression.
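The abstract names the ingredients without giving details. As orientation, the following is a minimal sketch of the classical particle swarm update on which the proposed method builds. It is not the paper's hybrid algorithm, and the parameter values (inertia weight w, acceleration coefficients c1 and c2) are illustrative defaults only.

```python
import numpy as np

def pso_minimize(f, dim, n_particles=30, iters=100,
                 w=0.7, c1=1.5, c2=1.5, bounds=(-5.0, 5.0)):
    """Minimize f over a box with the standard PSO velocity/position update."""
    rng = np.random.default_rng(0)
    lo, hi = bounds
    x = rng.uniform(lo, hi, (n_particles, dim))   # particle positions
    v = np.zeros((n_particles, dim))              # particle velocities
    pbest = x.copy()                              # personal best positions
    pbest_val = np.array([f(p) for p in x])
    g = pbest[pbest_val.argmin()].copy()          # global best position
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        # inertia term + cognitive pull toward pbest + social pull toward g
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        vals = np.array([f(p) for p in x])
        better = vals < pbest_val
        pbest[better], pbest_val[better] = x[better], vals[better]
        g = pbest[pbest_val.argmin()].copy()
    return g, float(pbest_val.min())

# Usage: minimize the sphere function in three dimensions.
best_x, best_f = pso_minimize(lambda p: float(np.sum(p * p)), dim=3)
```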



Acknowledgements

This research was partially supported by the Fujian Province Education Hall Youth Project (JAT170679), the Fujian Natural Science Foundation Project (2019J01887), the Fujian Provincial Marine Economic Development Subsidy Fund Project (FJHJF-L-2019-7), the Electronic Information and Engineering Institute of Fujian Normal University, the School of Economics of Fujian Normal University, and the Key Laboratory of Nondestructive Testing, Fuqing Branch of Fujian Normal University.

Funding

This research received no external funding.

Author information


Contributions

Conceptualization, FZ and CX; methodology, CX; software, CX; validation, FZ and CX; formal analysis, CX; resources, FZ; data curation, CX; writing—original draft preparation, CX; writing—review and editing, CX; visualization, CX; supervision, FZ; project administration, FZ. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Fuquan Zhang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest related to this work, and no commercial or associative interest that represents a conflict of interest in connection with the submitted work.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A

The determination of the range of the three variables for classification is as follows; a numerical sketch that enumerates these vertices follows Fig. 14.

Case 1: \(d_{1} = 1,d_{2} = 1,d_{3} = 1\).

1. \(r = 3C\). In this case \(z_1 = C\), \(z_2 = C\), \(z_3 = C\). No further search is possible; we may select another set of variables to continue the algorithm or simply ignore this case.

2. \(2C \le r < 3C\). In this case the feasible search region is shown in Fig. 14a: the feasible region is the convex set \(\Delta P_1 P_2 P_3\), where the coordinates of P1, P2, P3 are (r − 2C, C, C), (C, r − 2C, C), (C, C, r − 2C).

3. \(C < r < 2C\). In this case there are six vertices on the boundary. The feasible convex region is shown in Fig. 14b; P1–P6 are the boundary vertices, with coordinates (0, r − C, C), (r − C, 0, C), (C, 0, r − C), (C, r − C, 0), (r − C, C, 0), (0, C, r − C).

4. \(0 < r \le C\). In this case the feasible search region is shown in Fig. 14c; the coordinates of P1, P2, P3 are (0, 0, r), (r, 0, 0), (0, r, 0).

5. \(r = 0\). In this case \(z_1 = 0\), \(z_2 = 0\), \(z_3 = 0\). No further search is possible; we may select another set of variables to continue the algorithm or simply ignore this case.

Fig. 14

Illustration of classification for Case 1: a \(2C \le r < 3C\); b \(C < r < 2C\); c \(0 < r \le C\)
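Since these case-by-case vertex lists are easy to misread, the following illustrative sketch (not code from the paper; the function name and the constraint form \(\sum_i d_i z_i = r\) over the box \([0, C]^n\) are inferred from this appendix) enumerates the vertices numerically for generic, non-degenerate values of r. Every vertex of the feasible polytope lies on an edge of the box: all coordinates but one sit at 0 or C, and the remaining one is solved from the linear constraint.

```python
from itertools import product

def feasible_vertices(d, r, C=1.0, tol=1e-9):
    """Vertices of the region {z in [0, C]^n : sum_i d_i * z_i = r}."""
    n = len(d)
    verts = set()
    for free in range(n):                        # coordinate solved from the plane
        fixed = [i for i in range(n) if i != free]
        for corner in product((0.0, C), repeat=n - 1):
            rest = sum(d[i] * v for i, v in zip(fixed, corner))
            z_free = (r - rest) / d[free]
            if -tol <= z_free <= C + tol:        # inside the box (up to rounding)
                z = list(corner)
                z.insert(free, min(max(z_free, 0.0), C))
                verts.add(tuple(round(v, 9) for v in z))
    return sorted(verts)

# Case 1 (d1 = d2 = d3 = 1) with 2C <= r < 3C: expect the triangle of
# Fig. 14a, i.e. (r - 2C, C, C), (C, r - 2C, C), (C, C, r - 2C).
print(feasible_vertices((1, 1, 1), r=2.5, C=1.0))
# -> [(0.5, 1.0, 1.0), (1.0, 0.5, 1.0), (1.0, 1.0, 0.5)]
```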

Case 2: \(d_{1} = 1,d_{2} = 1,d_{3} = - 1\).

1. \(r = 2C\). In this case \(z_1 = C\), \(z_2 = C\), \(z_3 = 0\). No further search is possible; we may select another set of variables to continue the algorithm or simply ignore this case.

2. \(C \le r < 2C\). In this case the feasible search region is shown in Fig. 15a: the feasible region is the convex set \(\Delta P_1 P_2 P_3\), where the coordinates of P1, P2, P3 are (C, C, 2C − r), (C, r − C, 0), (r − C, C, 0).

3. \(0 < r < C\). In this case there are six vertices on the boundary. The feasible convex region is shown in Fig. 15b; P1–P6 are the boundary vertices, with coordinates (r, C, C), (C, r, C), (C, 0, C − r), (r, 0, 0), (0, r, 0), (0, C, C − r).

4. \(-C < r \le 0\). In this case the feasible search region is shown in Fig. 15c; the coordinates of P1, P2, P3 are (0, r + C, C), (r + C, 0, C), (0, 0, −r).

5. \(r = -C\). In this case \(z_1 = 0\), \(z_2 = 0\), \(z_3 = C\). No further search is possible; we may select another set of variables to continue the algorithm or simply ignore this case.

Fig. 15

Illustration of classification for Case 2: a \(C \le r < 2C\); b \(0 < r < C\); c \(-C < r \le 0\)

Case 3: \(d_1 = 1, d_2 = -1, d_3 = 1\).

1. \(r = 2C\). In this case \(z_1 = C\), \(z_2 = 0\), \(z_3 = C\). No further search is possible; we may select another set of variables to continue the algorithm or simply ignore this case.

2. \(C \le r < 2C\). In this case the feasible search region is shown in Fig. 16a: the feasible region is the convex set \(\Delta P_1 P_2 P_3\), where the coordinates of P1, P2, P3 are (r − C, 0, C), (C, 0, r − C), (C, 2C − r, C).

3. \(0 < r < C\). In this case there are six vertices on the boundary. The feasible convex region is shown in Fig. 16b; P1–P6 are the boundary vertices, with coordinates (0, C − r, C), (0, 0, r), (r, 0, 0), (C, C − r, 0), (C, C, r), (r, C, C).

4. \(-C < r \le 0\). In this case the feasible search region is shown in Fig. 16c; the coordinates of P1, P2, P3 are (0, C, r + C), (0, −r, 0), (r + C, C, 0).

5. \(r = -C\). In this case \(z_1 = 0\), \(z_2 = C\), \(z_3 = 0\). No further search is possible; we may select another set of variables to continue the algorithm or simply ignore this case.

Fig. 16

Illustration of classification for Case 3: a \(C \le r < 2C\); b \(0 < r < C\); c \(-C < r \le 0\)

Case 4: \(d_{1} = - 1,d_{2} = 1,d_{3} = 1\).

1. \(r = 2C\). In this case \(z_1 = 0\), \(z_2 = C\), \(z_3 = C\). No further search is possible; we may select another set of variables to continue the algorithm or simply ignore this case.

2. \(C \le r < 2C\). In this case the feasible search region is shown in Fig. 17a: the feasible region is the convex set \(\Delta P_1 P_2 P_3\), where the coordinates of P1, P2, P3 are (0, r − C, C), (2C − r, C, C), (0, C, r − C).

3. \(0 < r < C\). In this case there are six vertices on the boundary. The feasible convex region is shown in Fig. 17b; P1–P6 are the boundary vertices, with coordinates (C − r, 0, C), (C, r, C), (C, C, r), (C − r, C, 0), (0, r, 0), (0, 0, r).

4. \(-C < r \le 0\). In this case the feasible search region is shown in Fig. 17c; the coordinates of P1, P2, P3 are (C, 0, r + C), (−r, 0, 0), (C, r + C, 0).

5. \(r = -C\). In this case \(z_1 = C\), \(z_2 = 0\), \(z_3 = 0\). No further search is possible; we may select another set of variables to continue the algorithm or simply ignore this case.

Fig. 17

Illustration of classification for Case 4: a \(C \le r < 2C\); b \(0 < r < C\); c \(-C < r \le 0\)

Case 5: \(d_{1} = 1,d_{2} = - 1,d_{3} = - 1\).

Since \(r = z_1 - z_2 - z_3\), multiplying both sides by −1 gives \(-r = -z_1 + z_2 + z_3\) with \(-C \le -r \le 2C\), so this case transforms into Case 4.

Case 6: \(d_{1} = - 1,d_{2} = 1,d_{3} = - 1\).

Since \(r = -z_1 + z_2 - z_3\), multiplying both sides by −1 gives \(-r = z_1 - z_2 + z_3\) with \(-C \le -r \le 2C\), so this case transforms into Case 3.

Case 7: \(d_{1} = - 1,d_{2} = - 1,d_{3} = 1\).

Since \(r = -z_1 - z_2 + z_3\), multiplying both sides by −1 gives \(-r = z_1 + z_2 - z_3\) with \(-C \le -r \le 2C\), so this case transforms into Case 2.

Case 8: \(d_{1} = - 1,d_{2} = - 1,d_{3} = - 1\).

Since \(r = -z_1 - z_2 - z_3\), multiplying both sides by −1 gives \(-r = z_1 + z_2 + z_3\) with \(0 \le -r \le 3C\), so this case transforms into Case 1.
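These sign-flip reductions can be checked directly with the feasible_vertices sketch above: negating every \(d_i\) together with r leaves the feasible set over z unchanged, so, for example, Case 8 enumerates exactly the Case 1 vertices.

```python
# Case 8 versus Case 1: multiplying -z1 - z2 - z3 = r by -1 gives
# z1 + z2 + z3 = -r, so the two feasible sets coincide vertex for vertex.
assert feasible_vertices((-1, -1, -1), r=-2.5, C=1.0) == \
       feasible_vertices((1, 1, 1), r=2.5, C=1.0)
```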

For example, if four variables are selected for optimization, the cases are as follows.

Case 1: \(d_1 = 1, d_2 = 1, d_3 = 1, d_4 = 1\).

1. \(r = 4C\). In this case \(z_1 = C\), \(z_2 = C\), \(z_3 = C\), \(z_4 = C\).

2. \(3C \le r < 4C\). The vertices of the convex feasible region are (r − 3C, C, C, C), (C, r − 3C, C, C), (C, C, r − 3C, C), (C, C, C, r − 3C).

3. \(2C < r < 3C\). The vertices of the convex feasible region are (0, r − 2C, C, C), (r − 2C, 0, C, C), (0, C, r − 2C, C), (r − 2C, C, 0, C), (0, C, C, r − 2C), (r − 2C, C, C, 0), (C, 0, r − 2C, C), (C, r − 2C, 0, C), (C, 0, C, r − 2C), (C, r − 2C, C, 0), (C, C, 0, r − 2C), (C, C, r − 2C, 0).

4. \(C < r < 2C\). The vertices of the convex feasible region are (0, 0, r − C, C), (0, 0, C, r − C), (0, C, 0, r − C), (0, r − C, 0, C), (0, C, r − C, 0), (0, r − C, C, 0), (C, 0, 0, r − C), (r − C, 0, 0, C), (C, 0, r − C, 0), (r − C, 0, C, 0), (r − C, C, 0, 0), (C, r − C, 0, 0).

5. \(0 < r \le C\). The vertices of the convex feasible region are (0, 0, 0, r), (r, 0, 0, 0), (0, r, 0, 0), (0, 0, r, 0).

6. \(r = 0\). In this case \(z_1 = 0\), \(z_2 = 0\), \(z_3 = 0\), \(z_4 = 0\).

Case 2: \(d_1 = 1, d_2 = 1, d_3 = 1, d_4 = -1\).

1. \(r = 3C\). In this case \(z_1 = C\), \(z_2 = C\), \(z_3 = C\), \(z_4 = 0\).

2. \(2C \le r < 3C\). The vertices of the convex feasible region are (C, C, C, 3C − r), (C, C, r − 2C, 0), (C, r − 2C, C, 0), (r − 2C, C, C, 0).

3. \(C \le r < 2C\). The vertices of the convex feasible region are (0, r − C, C, 0), (r − C, 0, C, 0), (C, 0, r − C, 0), (C, r − C, 0, 0), (r − C, C, 0, 0), (0, C, r − C, 0), (C, r − C, C, C), (r − C, C, C, C), (C, C, r − C, C), (0, C, C, 2C − r), (C, C, 0, 2C − r), (C, 0, C, 2C − r).

4. \(0 \le r < C\). The vertices of the convex feasible region are (0, 0, r, 0), (r, 0, 0, 0), (0, r, 0, 0), (0, r, C, C), (r, 0, C, C), (C, 0, r, C), (C, r, 0, C), (r, C, 0, C), (0, C, r, C), (C, 0, 0, C − r), (0, C, 0, C − r), (0, 0, C, C − r).

5. \(-C < r < 0\). The vertices of the convex feasible region are (r + C, 0, 0, C), (0, r + C, 0, C), (0, 0, r + C, C), (0, 0, 0, −r).

6. \(r = -C\). In this case \(z_1 = 0\), \(z_2 = 0\), \(z_3 = 0\), \(z_4 = C\).

Case 3: \(d_1 = 1, d_2 = 1, d_3 = -1, d_4 = -1\).

1. \(r = 2C\). In this case \(z_1 = C\), \(z_2 = C\), \(z_3 = 0\), \(z_4 = 0\).

2. \(C \le r < 2C\). The vertices of the convex feasible region are (r − C, C, 0, 0), (C, r − C, 0, 0), (C, C, 2C − r, 0), (C, C, 0, 2C − r).

3. \(0 \le r < C\). The vertices of the convex feasible region are (C, C, C, C − r), (0, C, 0, C − r), (C, 0, 0, C − r), (C, C, C − r, C), (0, C, C − r, 0), (C, 0, C − r, 0), (r, 0, 0, 0), (0, r, 0, 0), (r, C, C, 0), (r, C, 0, C), (C, r, C, 0), (C, r, 0, C).

4. \(-C \le r < 0\). The vertices of the convex feasible region are (C + r, C, C, C), (C + r, 0, 0, C), (C + r, 0, C, 0), (C, C + r, C, C), (0, C + r, 0, C), (0, C + r, C, 0), (0, 0, −r, 0), (C, 0, −r, C), (0, C, −r, C), (0, 0, 0, −r), (C, 0, C, −r), (0, C, C, −r).

5. \(-2C < r < -C\). The vertices of the convex feasible region are (2C + r, 0, C, C), (0, 2C + r, C, C), (0, 0, −r − C, C), (0, 0, C, −r − C).

6. \(r = -2C\). In this case \(z_1 = 0\), \(z_2 = 0\), \(z_3 = C\), \(z_4 = C\).

Other cases can be transformed into the cases above, analogously to the three-variable setting. The four-variable lists can also be spot-checked numerically, as in the sketch below.
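The same illustrative enumeration used for three variables extends directly to four; for instance, Case 1 with \(2C < r < 3C\) yields the twelve vertices listed in item 3 above.

```python
# Four variables, Case 1, 2C < r < 3C (here r = 2.5, C = 1): the hyperplane
# z1 + z2 + z3 + z4 = r cuts the box [0, C]^4 in twelve vertices.
verts = feasible_vertices((1, 1, 1, 1), r=2.5, C=1.0)
print(len(verts))  # -> 12
print(verts[0])    # -> (0.0, 0.5, 1.0, 1.0), i.e. (0, r - 2C, C, C)
```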

Appendix B

The determination of the range of the three variables for regression is as follows:

Case 1: \(r = 3C\). In this case \(z_1 = C\), \(z_2 = C\), \(z_3 = C\). No further search is possible; we may select another set of variables to continue the algorithm or simply ignore this case.

Case 2: \(C \le r < 3C\). In this case the feasible search region is shown in Fig. 18: the feasible region is the convex set \(\Delta P_1 P_2 P_3\), where P1, P2, P3 are vertices on the boundary with coordinates (r − 2C, C, C), (C, r − 2C, C), (C, C, r − 2C).

Case 3: \(-C \le r < C\). In this case there are six vertices on the boundary. The feasible convex region is shown in Fig. 19; P1–P6 are the boundary vertices, with coordinates (−C, r, C), (−C, C, r), (r, C, −C), (C, r, −C), (C, −C, r), (r, −C, C).

Case 4: \(-3C < r \le -C\). In this case the feasible convex region is shown in Fig. 20; P1, P2, P3 are vertices on the boundary with coordinates (−C, −C, r + 2C), (−C, r + 2C, −C), (r + 2C, −C, −C).

Fig. 18

Illustration of regression for Case 2: \(C \le r < 3C\)

Fig. 19

Illustration of regression for Case 3: \(-C \le r < C\)

Fig. 20

Illustration of regression for Case 4: \(-3C < r \le -C\)

Case 5: \(r = -3C\). In this case \(z_1 = -C\), \(z_2 = -C\), \(z_3 = -C\). No further search is possible; we may select another set of variables to continue the algorithm or simply ignore this case.
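For regression the variables range over \([-C, C]\) rather than \([0, C]\). A variant of the earlier sketch (again illustrative only, assuming the constraint \(z_1 + z_2 + z_3 = r\) implied by the vertex coordinates above) reproduces the six vertices of Case 3.

```python
from itertools import product

def regression_vertices(r, C=1.0, n=3, tol=1e-9):
    """Vertices of the region {z in [-C, C]^n : z1 + ... + zn = r}."""
    verts = set()
    for free in range(n):                        # coordinate solved from the plane
        for corner in product((-C, C), repeat=n - 1):
            z_free = r - sum(corner)
            if -C - tol <= z_free <= C + tol:    # inside the box (up to rounding)
                z = list(corner)
                z.insert(free, min(max(z_free, -C), C))
                verts.add(tuple(round(v, 9) for v in z))
    return sorted(verts)

# Case 3, -C <= r < C (here r = 0.3, C = 1): expect the six vertices
# (-C, r, C), (-C, C, r), (r, C, -C), (C, r, -C), (C, -C, r), (r, -C, C).
print(regression_vertices(r=0.3, C=1.0))
```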

About this article

Cite this article

Xie, C., Zhang, F. A new sequence optimization algorithm based on particle swarm for machine learning. J Ambient Intell Human Comput 13, 2601–2619 (2022). https://doi.org/10.1007/s12652-021-03004-3

