Multi-tier storage systems are becoming increasingly widespread in industry. They have more tunable parameters and built-in policies than traditional storage systems, and configuring these parameters and policies well is crucial for achieving high performance. An important performance indicator for such systems is the response time of file I/O requests, which can be minimized if the most frequently accessed (“hot”) files are located in the fastest storage tiers. Unfortunately, it is impossible to know a priori which files will be hot, especially because file access patterns change over time. This paper presents a policy-based framework for dynamically deciding which files to upgrade and which to downgrade, based on their recent access patterns and on the system’s current state. The paper also presents a reinforcement learning (RL) algorithm for automatically tuning the file migration policies so as to minimize the average request response time. The RL-tuned migration policies were evaluated in a multi-tier storage system simulator, where they achieved a significant performance improvement over the best hand-crafted policies found for this domain.
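As a toy illustration of why keeping hot files in the fastest tier lowers the average request response time, the sketch below compares two placements in a hypothetical two-tier system. Everything in it is an assumption made for illustration: the `File` record, the fixed per-request latencies, the capacity, and the greedy "hits per unit size" rule. It is not the policy framework or the RL algorithm presented in the paper.

```python
from dataclasses import dataclass


@dataclass
class File:
    name: str
    size: int         # arbitrary capacity units (assumed)
    recent_hits: int  # accesses observed in a recent window (assumed)


def place_by_hotness(files, fast_capacity):
    """Greedy placement: fill the fast tier with the files that have
    the most recent hits per unit of size; the rest go to the slow tier."""
    fast, slow, used = [], [], 0
    ranked = sorted(files, key=lambda f: f.recent_hits / f.size, reverse=True)
    for f in ranked:
        if used + f.size <= fast_capacity:
            fast.append(f)
            used += f.size
        else:
            slow.append(f)
    return fast, slow


def mean_response_time(fast, slow, fast_latency, slow_latency):
    """Request-weighted average latency, assuming each tier serves
    every request with a fixed latency (a strong simplification)."""
    fast_hits = sum(f.recent_hits for f in fast)
    slow_hits = sum(f.recent_hits for f in slow)
    total = fast_hits + slow_hits
    if total == 0:
        return 0.0
    return (fast_latency * fast_hits + slow_latency * slow_hits) / total


# Hypothetical workload: three equally sized files, fast tier holds two.
files = [File("a", 10, 100), File("b", 10, 5), File("c", 10, 50)]

# Placement that ignores access counts: first two files go to the fast tier.
naive_rt = mean_response_time(files[:2], files[2:], 1.0, 10.0)

# Placement that puts the hottest files in the fast tier.
hot_fast, hot_slow = place_by_hotness(files, fast_capacity=20)
hot_rt = mean_response_time(hot_fast, hot_slow, 1.0, 10.0)

print(f"naive: {naive_rt:.3f}, hotness-based: {hot_rt:.3f}")
```

In this made-up workload the hotness-based placement serves far more requests from the fast tier, so its mean response time is lower. A static rule like this cannot track access patterns that change over time, which is the gap that dynamically tuned migration policies are meant to address.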
Additional information
This material is based upon work supported by DARPA under Contract No. NBCH3039002.
Cite this article
Vengerov, D. A reinforcement learning framework for online data migration in hierarchical storage systems. J Supercomput 43, 1–19 (2008). https://doi.org/10.1007/s11227-007-0135-3
DOI: https://doi.org/10.1007/s11227-007-0135-3