
Deep Reinforcement Learning for Auto-optimization of I/O Accelerator Parameters

  • Conference paper
  • Appears in: Benchmarking, Measuring, and Optimizing (Bench 2019)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12093)

Abstract

Reinforcement Learning (RL), and Deep Reinforcement Learning in particular, has made significant advances in the machine learning domain. AlphaGo, developed by DeepMind, is a good example of how a deep neural network can train an agent to play Go and outperform professional players. Automatic parameter optimization remains a challenging research area. To tune the parameters of a system automatically, we formalize the problem as a reinforcement learning task: the aim is to train an agent that finds the optimum parameters by observing the state of the system and taking actions that maximize a cumulative reward. In this work, we solve this problem with both the tabular (matrix) version and the neural network version of Q-learning. We developed a new heuristic for the Q-learning algorithm, called Dynamic Action Space (DAS), to further improve the robustness of the algorithm in finding the optimum state. The DAS approach significantly improved the convergence and stability of the algorithm. We then tested the approach on three deep neural network variants, namely Deep Q-Networks (DQN), Double Deep Q-Networks (DDQN), and Dueling Networks. We show that the DAS heuristic helps the deep RL networks converge better than the baseline Q-learning model.
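For readers unfamiliar with the tabular formulation used as the baseline, the sketch below illustrates how a parameter-tuning problem can be cast as Q-learning: the system's current parameter setting is the state, actions adjust the parameter, and the reward reflects measured performance. The environment (ToyTuningEnv), its single discretized parameter, two actions, and synthetic reward are hypothetical stand-ins; the paper's actual I/O accelerator parameters, reward signal, and the Dynamic Action Space heuristic are not reproduced here.

```python
import random
from collections import defaultdict

# Hypothetical stand-in environment: the real I/O accelerator parameters,
# state encoding, and reward (e.g. measured throughput) are not given in this excerpt.
class ToyTuningEnv:
    def __init__(self, n_values=10, optimum=7):
        self.n_values = n_values      # discretized parameter values 0 .. n_values-1
        self.optimum = optimum        # value that maximizes the synthetic reward
        self.state = 0

    def reset(self):
        self.state = random.randrange(self.n_values)
        return self.state

    def step(self, action):
        # actions: 0 = decrease the parameter, 1 = increase the parameter
        delta = -1 if action == 0 else 1
        self.state = max(0, min(self.n_values - 1, self.state + delta))
        reward = -abs(self.state - self.optimum)   # synthetic reward, peaks at the optimum
        done = self.state == self.optimum
        return self.state, reward, done

def q_learning(env, episodes=500, alpha=0.1, gamma=0.9, eps=0.2):
    q = defaultdict(lambda: [0.0, 0.0])            # Q[s] = [value of action 0, value of action 1]
    for _ in range(episodes):
        s = env.reset()
        for _ in range(50):                        # cap episode length
            # epsilon-greedy exploration over the two actions
            a = random.randrange(2) if random.random() < eps else max(range(2), key=lambda i: q[s][i])
            s_next, r, done = env.step(a)
            # tabular update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
            q[s][a] += alpha * (r + gamma * max(q[s_next]) - q[s][a])
            s = s_next
            if done:
                break
    return q

if __name__ == "__main__":
    q = q_learning(ToyTuningEnv())
    greedy = {s: max(range(2), key=lambda a: q[s][a]) for s in q}
    print(greedy)   # greedy action per parameter value; should point toward the optimum
```

The deep variants discussed in the paper (DQN, DDQN, Dueling Networks) replace the table q with a neural network that approximates Q(s, a), which is what allows the approach to scale to larger parameter spaces.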



Author information

Corresponding author: Trong-Ton Pham


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Pham, T.T., Djan, D.M. (2020). Deep Reinforcement Learning for Auto-optimization of I/O Accelerator Parameters. In: Gao, W., Zhan, J., Fox, G., Lu, X., Stanzione, D. (eds.) Benchmarking, Measuring, and Optimizing. Bench 2019. Lecture Notes in Computer Science, vol. 12093. Springer, Cham. https://doi.org/10.1007/978-3-030-49556-5_19

  • DOI: https://doi.org/10.1007/978-3-030-49556-5_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-49555-8

  • Online ISBN: 978-3-030-49556-5

  • eBook Packages: Computer Science, Computer Science (R0)
