ABSTRACT
Hyper-heuristics is an active research field that aims to automatically select (or generate) the best low-level heuristic at each step of the search process. This work investigates a hyper-heuristic with a Deep Q-Network (DQN) selection strategy and compares it with two state-of-the-art Multi-Armed Bandit (MAB) approaches, namely the Dynamic MAB and the Fitness-Rate-Rank MAB. Experiments on two domains from the HyFlex framework showed that the DQN approach outperformed the others on the Vehicle Routing Problem and was competitive on the Traveling Salesman Problem. These results suggest that the DQN is a robust selection strategy that is less sensitive to the problem domain than the MAB-based approaches.
Index Terms
- Using deep Q-network for selection hyper-heuristics