skip to main content
10.1145/3449726.3463187acmconferencesArticle/Chapter ViewAbstractPublication PagesgeccoConference Proceedingsconference-collections
research-article

Using deep Q-network for selection hyper-heuristics

Published:08 July 2021Publication History

ABSTRACT

Hyper-Heuristics is an active research field that aims to automatically select (or generate) the best low-level heuristic in each step of the search process. This work investigates a Hyper-Heuristic with a Deep Q-Network (DQN) selection strategy and compares it with two state-of-the-art approaches, namely the Dynamic MAB and the Fitness-Rate-Rank MAB. The experiments conducted on two domains from the HyFlex framework showed that the DQN approach outperformed the others on the Vehicle Routing Problem and was competitive on the Traveling Salesman Problem. This indicates that the DQN is a robust selection strategy that is less sensitive to the domain than the MAB based approaches.

References

  1. Peter Auer, Nicolo Cesa-Bianchi, and Paul Fischer. 2002. Finite-time analysis of the multiarmed bandit problem. Machine learning 47, 2 (2002), 235--256.Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Christian Blum, Jakob Puchinger, Günther R. Raidl, and Andrea Roli. 2011. Hybrid metaheuristics in combinatorial optimization: A survey. Applied Soft Computing 11, 6 (2011), 4135 -- 4151.Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Edmund K. Burke, Matthew Hyde, Graham Kendall, Gabriela Ochoa, Ender Özcan, and John R. Woodward. 2010. A Classification of Hyper-heuristic Approaches. Springer US, Boston, MA, 449--468. Google ScholarGoogle ScholarCross RefCross Ref
  4. Luis DaCosta, Alvaro Fialho, Marc Schoenauer, and Michèle Sebag. 2008. Adaptive Operator Selection with Dynamic Multi-Armed Bandits. In Proceedings of the 10th Annual Conference on Genetic and Evolutionary Computation (GECCO '08). Association for Computing Machinery, New York, NY, USA, 913--920. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Carolina P. de Almeida, Richard A. Gonçalves, Sandra M. Venske, Ricardo Lüders, and Myriam Regattieri Delgado. 2018. Multi-armed Bandit Based Hyper-Heuristics for the Permutation Flow Shop Problem. In 7th Brazilian Conference on Intelligent Systems, BRACIS 2018, São Paulo, Brazil, October 22-25, 2018. IEEE Computer Society, 139--144. Google ScholarGoogle ScholarCross RefCross Ref
  6. John H. Drake, Ahmed Kheiri, Ender Özcan, and Edmund K. Burke. 2020. Recent advances in selection hyper-heuristics. European Journal of Operational Research 285, 2 (2020), 405--428. Google ScholarGoogle ScholarCross RefCross Ref
  7. A. S. Ferreira, R. A. Gonçalves, and A. Pozo. 2017. A Multi-Armed Bandit selection strategy for Hyper-heuristics. In 2017 IEEE Congress on Evolutionary Computation (CEC). 525--532. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Álvaro Fialho. 2010. Adaptive Operator Selection for Optimization. Theses. Université Paris Sud - Paris XI. https://tel.archives-ouvertes.fr/tel-00578431Google ScholarGoogle Scholar
  9. Stephanus Daniel Handoko, Duc Thien Nguyen, Zhi Yuan, and Hoong Chuin Lau. 2014. Reinforcement Learning for Adaptive Operator Selection in Memetic Search Applied to Quadratic Assignment Problem. In Proceedings of the Companion Publication of the 2014 Annual Conference on Genetic and Evolutionary Computation (GECCO Comp '14). Association for Computing Machinery, New York, NY, USA, 193--194.Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. K. Li, Á. Fialho, S. Kwong, and Q. Zhang. 2014. Adaptive Operator Selection With Bandits for a Multiobjective Evolutionary Algorithm Based on Decomposition. IEEE Transactions on Evolutionary Computation 18, 1 (2014), 114--130. Google ScholarGoogle ScholarCross RefCross Ref
  11. G. Ochoa, M. Hyde, T. Curtois, J.A. Vazquez-Rodriguez, J. Walker, M. Gendreau, G. Kendall, B. McCollum, A.J. Parkes, S. Petrovic, and E.K. Burke. 2012. HyFlex: A Benchmark Framework for Cross-domain Heuristic Search. 7245 (2012), 136--147.Google ScholarGoogle Scholar
  12. Martin L. Puterman. 1990. Chapter 8 Markov Decision Processes. In Handbooks in Operations Research and Management Science. Stochastic Models, Vol. 2. Elsevier, 331--434.Google ScholarGoogle ScholarCross RefCross Ref
  13. Jorge A. Soria-Alcaraz, Gabriela Ochoa, Marco A. Sotelo-Figeroa, and Edmund K. Burke. 2017. A methodology for determining an effective subset of heuristics in selection hyper-heuristics. European Journal of Operational Research 260, 3 (2017), 972--983. Google ScholarGoogle ScholarCross RefCross Ref
  14. Richard S. Sutton and Andrew G. Barto. 2018. Reinforcement Learning, Second Edition: An Introduction. MIT Press.Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Teck-Hou Teng, Stephanus Daniel Handoko, and Hoong Chuin Lau. 2016. Self-Organizing Neural Network for Adaptive Operator Selection in Evolutionary Search. In Learning and Intelligent Optimization (Lecture Notes in Computer Science), Paola Festa, Meinolf Sellmann, and Joaquin Vanschoren (Eds.). Springer International Publishing, Cham, 187--202.Google ScholarGoogle Scholar
  16. Christopher J. C. H. Watkins and Peter Dayan. 1992. Q-Learning. Machine Learning 8, 3 (May 1992), 279--292.Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. D. H. Wolpert and W. G. Macready. 1997. No Free Lunch Theorems for Optimization. IEEE Transactions on Evolutionary Computation 1, 1 (April 1997), 67--82.Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Using deep Q-network for selection hyper-heuristics

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in
      • Published in

        cover image ACM Conferences
        GECCO '21: Proceedings of the Genetic and Evolutionary Computation Conference Companion
        July 2021
        2047 pages
        ISBN:9781450383516
        DOI:10.1145/3449726

        Copyright © 2021 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 8 July 2021

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • research-article

        Acceptance Rates

        Overall Acceptance Rate1,669of4,410submissions,38%

        Upcoming Conference

        GECCO '24
        Genetic and Evolutionary Computation Conference
        July 14 - 18, 2024
        Melbourne , VIC , Australia

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader