Abstract
We propose a novel approach to optimizing the performance of a large-scale physical system by mapping the performance optimization problem into a reinforcement learning framework. Manually controlling bandwidth on large storage servers with reasonable efficiency is a difficult task for system administrators, but dynamic bandwidth control can be learned effectively by a reinforcement learning agent. We adopt a combination of a Double Deep Q-Network and a Recurrent Neural Network as our function approximator to identify the extent of bandwidth control (actions) given the state representation of a storage server. Allowing the agent to control the bandwidth allotted to each logical unit within a filer has been shown to enhance throughput as well as reduce the overload duration of storage servers.
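For concreteness, the sketch below shows one way the described agent could be assembled: a recurrent Q-network (an LSTM over the filer's counter time series, followed by a linear head over discrete bandwidth-throttle actions) trained with the Double DQN update, in which the online network selects the next action and the target network evaluates it. This is a minimal sketch, not the authors' implementation; the state dimension, action set, network sizes, and hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Illustrative dimensions (assumptions, not the paper's settings):
# each state is a window of filer performance counters; actions are
# discrete bandwidth-throttle levels for a logical unit.
STATE_DIM = 32      # counters observed per time step
HIDDEN_DIM = 64     # LSTM hidden size
N_ACTIONS = 5       # e.g. throttle bandwidth to 100%/80%/60%/40%/20%
GAMMA = 0.99        # discount factor

class RecurrentQNet(nn.Module):
    """LSTM over the counter time series, linear head over Q-values."""
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(STATE_DIM, HIDDEN_DIM, batch_first=True)
        self.head = nn.Linear(HIDDEN_DIM, N_ACTIONS)

    def forward(self, seq):                 # seq: (batch, time, STATE_DIM)
        out, _ = self.lstm(seq)
        return self.head(out[:, -1, :])     # Q-values at the last time step

online, target = RecurrentQNet(), RecurrentQNet()
target.load_state_dict(online.state_dict())
optim = torch.optim.Adam(online.parameters(), lr=1e-4)

def double_dqn_loss(s, a, r, s_next, done):
    """Double DQN target: online net picks argmax action for s_next,
    target net supplies its Q-value (decoupling selection from evaluation)."""
    q = online(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        a_next = online(s_next).argmax(dim=1, keepdim=True)
        q_next = target(s_next).gather(1, a_next).squeeze(1)
        y = r + GAMMA * (1.0 - done) * q_next
    return nn.functional.smooth_l1_loss(q, y)

# One sketched update on a synthetic batch of transitions.
B, T = 16, 8
s      = torch.randn(B, T, STATE_DIM)
a      = torch.randint(N_ACTIONS, (B,))
r      = torch.randn(B)
s_next = torch.randn(B, T, STATE_DIM)
done   = torch.zeros(B)
loss = double_dqn_loss(s, a, r, s_next, done)
optim.zero_grad(); loss.backward(); optim.step()
target.load_state_dict(online.state_dict())  # periodic target sync
```

The recurrence lets the agent condition on a history of counter observations rather than a single snapshot, which is the standard remedy when the server's true state is only partially observable from instantaneous counters.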
Cite this paper
Dheenadayalan, K., Srinivasaraghavan, G., Muralidhara, V.N. (2018). Dynamic Control of Storage Bandwidth Using Double Deep Recurrent Q-Network. In: Cheng, L., Leung, A., Ozawa, S. (eds.) Neural Information Processing. ICONIP 2018. Lecture Notes in Computer Science, vol. 11307. Springer, Cham. https://doi.org/10.1007/978-3-030-04239-4_20