Reinforcement Learning Based Whale Optimizer

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12957)

Abstract

This work proposes a reinforcement learning based optimizer that integrates SARSA with the Whale Optimization Algorithm. SARSA selects the binarization operator applied during the metaheuristic search. The hybrid is applied to benchmark instances of the Set Covering Problem and compared with a Q-learning version. In terms of fitness, the SARSA version beats the Q-learning version in 44 of the 45 instances evaluated, and the remaining instance is a tie. The graphs in our results analysis further show that the approach not only obtains good results but also achieves an appropriate balance between exploration and exploitation, consistent with the referenced literature.
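
The difference between the two learners compared in the abstract lies in their temporal-difference targets. The following is a minimal sketch of the standard tabular SARSA and Q-learning updates, not the authors' implementation. The state and action labels (search phases as states, binarization schemes as actions), the function names, and the values of alpha, gamma and epsilon are assumptions made for illustration only.

```python
import random

# Hypothetical labels, assumed for illustration: states are search phases,
# actions are candidate binarization schemes.
STATES = ["exploration", "exploitation"]
ACTIONS = ["scheme_0", "scheme_1", "scheme_2"]


def epsilon_greedy(q_row, epsilon, rng):
    """Return an action index: random with probability epsilon, else greedy."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=q_row.__getitem__)


def sarsa_update(q, s, a, reward, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the action actually chosen next."""
    td_target = reward + gamma * q[s_next][a_next]
    q[s][a] += alpha * (td_target - q[s][a])


def q_learning_update(q, s, a, reward, s_next, alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the best action in the next state."""
    td_target = reward + gamma * max(q[s_next])
    q[s][a] += alpha * (td_target - q[s][a])


def new_q_table():
    """Q-table as a list of rows, one row per state, initialised to zero."""
    return [[0.0] * len(ACTIONS) for _ in STATES]
```

In a loop of this kind, the action picked by `epsilon_greedy` would index the binarization scheme used in the current iteration, and the reward would be derived from the change in fitness. How the paper defines states and rewards is not stated in the abstract, so those parts are left open here.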

Acknowledgements

Broderick Crawford is supported by Grant CONICYT/FONDECYT/REGULAR/1210810. Ricardo Soto is supported by Grant CONICYT/FONDECYT/REGULAR/1190129. José Lemus-Romani is supported by National Agency for Research and Development (ANID)/Scholarship Program/DOCTORADO NACIONAL/2019-21191692. Marcelo Becerra-Rozas is supported by National Agency for Research and Development (ANID)/Scholarship Program/DOCTORADO NACIONAL/2021-21210740.

Author information

Corresponding author

Correspondence to Marcelo Becerra-Rozas.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Becerra-Rozas, M. et al. (2021). Reinforcement Learning Based Whale Optimizer. In: Gervasi, O., et al. Computational Science and Its Applications – ICCSA 2021. ICCSA 2021. Lecture Notes in Computer Science, vol 12957. Springer, Cham. https://doi.org/10.1007/978-3-030-87013-3_16

  • DOI: https://doi.org/10.1007/978-3-030-87013-3_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-87012-6

  • Online ISBN: 978-3-030-87013-3

  • eBook Packages: Computer Science, Computer Science (R0)
