Abstract
Reinforcement Learning (RL), and Deep Reinforcement Learning in particular, has driven several recent advances in machine learning. AlphaGo, developed by DeepMind, is a good example of how deep neural networks can train an agent to play Go and outperform professional players. Automatic parameter optimization remains a challenging research area. To tune the parameters of a system automatically, we formalize the problem as a reinforcement learning task: the aim is to train an agent that finds the optimum parameters by observing the state of the system and taking actions that maximize a cumulative reward. In this work, we solve this problem with both the matrix (tabular) version of Q-learning and the neural-network version of Q-learning. We developed a new heuristic for the Q-learning algorithm, called Dynamic Action Space (DAS), to further improve the robustness of the algorithm in finding the optimum state; the DAS approach significantly improved the convergence and stability of the algorithm. We then tested the approach on three deep neural network variants, namely Deep Q-Networks (DQN), Double Deep Q-Networks (DDQN), and Dueling Networks. We show that the heuristic DAS model helps the deep RL networks converge better than the baseline Q-learning model.
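The abstract gives no implementation details, but the tabular formulation it builds on is the standard Q-learning update. The sketch below is a minimal illustration of that "matrix version" applied to parameter tuning, assuming a discretized parameter space; the environment transition, reward shape, and state/action sizes are hypothetical stand-ins, not the authors' setup, and the DAS heuristic itself is omitted because the excerpt does not describe it.

import random
import numpy as np

# Hypothetical problem sizes: states index discretized parameter values,
# actions increment or decrement the parameter by a step.
n_states, n_actions = 100, 4
alpha, gamma, epsilon = 0.1, 0.9, 0.1   # learning rate, discount, exploration rate

Q = np.zeros((n_states, n_actions))      # the Q-matrix of the tabular version

def step(state, action):
    """Hypothetical environment: apply a parameter change, return new state and reward."""
    delta = 1 if action % 2 == 0 else -1
    next_state = max(0, min(n_states - 1, state + delta))
    reward = -abs(next_state - 70)       # toy reward: performance peaks at parameter index 70
    return next_state, reward

state = random.randrange(n_states)
for episode in range(10_000):
    # epsilon-greedy action selection over the (here fixed) action space
    if random.random() < epsilon:
        action = random.randrange(n_actions)
    else:
        action = int(np.argmax(Q[state]))
    next_state, reward = step(state, action)
    # Standard Q-learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    Q[state, action] += alpha * (reward + gamma * np.max(Q[next_state]) - Q[state, action])
    state = next_state

The DAS heuristic would presumably adjust the set of available actions during learning rather than keep it fixed as above; the deep variants (DQN, DDQN, Dueling Networks) replace the Q-matrix with a neural network that approximates Q(s, a).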
Cite this paper
Pham, T.T., Djan, D.M.: Deep Reinforcement Learning for Auto-optimization of I/O Accelerator Parameters. In: Gao, W., Zhan, J., Fox, G., Lu, X., Stanzione, D. (eds.) Benchmarking, Measuring, and Optimizing. Bench 2019. Lecture Notes in Computer Science, vol. 12093. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-49556-5_19