Caching Suggestions Using Reinforcement Learning

  • Conference paper
Machine Learning, Optimization, and Data Science (LOD 2020)

Abstract

Big data is usually processed in a decentralized computational environment, with a number of distributed storage systems and processing facilities that enable both online and offline data analysis. In such a context, fast data access is fundamental to processing efficiency and to the experience of users inspecting the data, and caching is a solution widely adopted across many domains. Optimizing cache management therefore plays a central role in sustaining the growing demand for data. In this article, we propose an autonomous approach based on a Reinforcement Learning technique, in which an agent manages the file storing decisions. We test the proposed method in a real context, using information on data analysis workflows of the CMS experiment at CERN.
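
The abstract does not specify the agent's state, actions or reward, so the following is only a toy illustration of what "an agent managing file storing decisions" could look like, not the method of the paper. It is a minimal tabular Q-learning sketch under assumed choices: the state is how often a file was requested before (0, 1, or 2+), the actions are STORE and SKIP, the cache evicts the least recently used file, a stored file that is requested again earns +1, and a stored file that is evicted before any reuse earns -1. The function name `run_agent` and all parameters are hypothetical.

```python
import random
from collections import OrderedDict, defaultdict

STORE, SKIP = 1, 0
ACTIONS = (STORE, SKIP)  # STORE listed first so that ties favour caching


def run_agent(trace, capacity, episodes=1, alpha=0.1, gamma=0.9,
              epsilon=0.1, seed=0):
    """Replay `trace` and learn Q(state, action) for store/skip decisions.

    Returns (q_table, hit_rate_of_last_episode). All modelling choices
    here (state, reward, LRU eviction) are illustrative assumptions.
    """
    rng = random.Random(seed)
    q = defaultdict(float)  # (state, action) -> estimated value

    def best_value(state):
        return max(q[(state, a)] for a in ACTIONS)

    def update(state, action, target):
        # Standard Q-learning step towards the given target.
        q[(state, action)] += alpha * (target - q[(state, action)])

    hit_rate = 0.0
    for _ in range(episodes):
        cache = OrderedDict()    # file -> None, ordered from LRU to MRU
        seen = defaultdict(int)  # file -> number of earlier requests
        pending = {}             # file -> (state, action) awaiting a reward
        hits = 0
        for f in trace:
            state = min(seen[f], 2)  # toy state: 0, 1, or 2+ earlier requests
            seen[f] += 1
            if f in pending:
                # The file came back: +1 if an earlier STORE paid off, else 0.
                prev_state, prev_action = pending.pop(f)
                reward = 1.0 if f in cache else 0.0
                update(prev_state, prev_action,
                       reward + gamma * best_value(state))
            if f in cache:
                hits += 1
                cache.move_to_end(f)  # mark as most recently used
                continue
            # Miss: epsilon-greedy choice between storing and skipping.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            pending[f] = (state, action)
            if action == STORE:
                cache[f] = None
                if len(cache) > capacity:
                    victim, _ = cache.popitem(last=False)  # evict LRU file
                    if victim in pending:
                        # Evicted before any reuse: terminal penalty.
                        v_state, v_action = pending.pop(victim)
                        update(v_state, v_action, -1.0)
        hit_rate = hits / len(trace) if trace else 0.0
    return dict(q), hit_rate
```

Calling `run_agent(trace, capacity)` with a list of requested file names and a cache capacity (counted in files) returns the learned table and the hit rate of the last replay. A realistic setting would also need file sizes, a richer state, and a reward tied to the actual cost of data access, none of which are given in the abstract.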

Author information

Corresponding authors

Correspondence to Mirco Tracolli, Marco Baioletti, Valentina Poggioni or Daniele Spiga.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Tracolli, M., Baioletti, M., Poggioni, V., Spiga, D., on behalf of the CMS Collaboration. (2020). Caching Suggestions Using Reinforcement Learning. In: Nicosia, G., et al. Machine Learning, Optimization, and Data Science. LOD 2020. Lecture Notes in Computer Science, vol 12565. Springer, Cham. https://doi.org/10.1007/978-3-030-64583-0_57

  • DOI: https://doi.org/10.1007/978-3-030-64583-0_57

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-64582-3

  • Online ISBN: 978-3-030-64583-0

  • eBook Packages: Computer Science, Computer Science (R0)
