Caching Suggestions Using Reinforcement Learning

Tracolli, Mirco; Baioletti, Marco; Poggioni, Valentina; Spiga, Daniele

doi:10.1007/978-3-030-64583-0_57

Mirco Tracolli, Marco Baioletti, Valentina Poggioni & Daniele Spiga
on behalf of the CMS Collaboration

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12565)

Included in the following conference series:

International Conference on Machine Learning, Optimization, and Data Science

Abstract

Big data is usually processed in a decentralized computational environment, with a number of distributed storage systems and processing facilities that enable both online and offline data analysis. In such a context, data access is fundamental to both processing efficiency and the experience of users inspecting the data, and caching is a solution widely adopted across many domains. Optimizing cache management therefore plays a central role in sustaining the growing demand for data. In this article, we propose an autonomous approach based on Reinforcement Learning, in which an agent manages the decisions on which files to store. We test the proposed method in a real context, using information on the data analysis workflows of the CMS experiment at CERN.
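
As an illustration of what an agent managing file storing decisions could look like, the following is a minimal sketch of a generic tabular Q-learning agent. It is not the authors' implementation: the abstract does not specify the algorithm, state features, or reward, so the class name CacheAgent, the helper state_of, the size and popularity buckets, and all hyperparameters are assumptions made for the example.

```python
import random
from collections import defaultdict

# Hypothetical action labels: the agent either skips or stores a requested file.
SKIP, STORE = 0, 1


class CacheAgent:
    """Generic tabular Q-learning agent (illustrative, not the paper's method)."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration probability
        self.rng = random.Random(seed)
        # Maps a state to [Q(state, SKIP), Q(state, STORE)], initialised to zero.
        self.q = defaultdict(lambda: [0.0, 0.0])

    def act(self, state):
        """Epsilon-greedy choice between SKIP and STORE (ties go to SKIP)."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice((SKIP, STORE))
        q_skip, q_store = self.q[state]
        return STORE if q_store > q_skip else SKIP

    def update(self, state, action, reward, next_state):
        """Standard one-step Q-learning update."""
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])


def state_of(size_mb, past_requests):
    """Assumed discretisation of a file into (size bucket, popularity bucket)."""
    return (min(size_mb // 1000, 3), min(past_requests, 3))


if __name__ == "__main__":
    agent = CacheAgent(alpha=0.5, epsilon=0.0)
    s = state_of(size_mb=2500, past_requests=1)
    # Assumed reward signal: +1 when a stored file is later served from cache.
    agent.update(s, STORE, reward=1.0, next_state=s)
    print(agent.act(s) == STORE)
```

In this sketch the reward is supplied by the caller; how to define it from real access logs (for example, from cache hits on the stored files) is left open, since the abstract does not say.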

Author information

Authors and Affiliations

Università degli Studi di Perugia, Perugia, Italy
Mirco Tracolli, Marco Baioletti & Valentina Poggioni
Università degli Studi di Firenze, Florence, Italy
Mirco Tracolli
INFN Sezione di Perugia, Perugia, Italy
Mirco Tracolli & Daniele Spiga

Authors

Mirco Tracolli
Marco Baioletti
Valentina Poggioni
Daniele Spiga

Corresponding authors

Correspondence to Mirco Tracolli, Marco Baioletti, Valentina Poggioni or Daniele Spiga.

Editor information

Editors and Affiliations

University of Catania, Catania, Italy
Giuseppe Nicosia
University of Reading, Reading, UK
Varun Ojha
University of Oxford, Oxford, UK
Emanuele La Malfa
University of Cambridge, Cambridge, UK
Giorgio Jansen
Almawave, Rome, Italy
Vincenzo Sciacca
University of Florida, Gainesville, FL, USA
Panos Pardalos
University of Catania, Catania, Italy
Giovanni Giuffrida
Harvard University, Cambridge, MA, USA
Renato Umeton

About this paper

Cite this paper

Tracolli, M., Baioletti, M., Poggioni, V., Spiga, D., on behalf of the CMS Collaboration. (2020). Caching Suggestions Using Reinforcement Learning. In: Nicosia, G., et al. Machine Learning, Optimization, and Data Science. LOD 2020. Lecture Notes in Computer Science, vol 12565. Springer, Cham. https://doi.org/10.1007/978-3-030-64583-0_57

DOI: https://doi.org/10.1007/978-3-030-64583-0_57
Published: 08 January 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-64582-3
Online ISBN: 978-3-030-64583-0
eBook Packages: Computer Science, Computer Science (R0)