Analysis and Workload Characterization of the CERN EOS Storage System

Published: 14 June 2022

Abstract

Modern, large-scale scientific computing runs on complex exascale storage systems that support even more complex data workloads. Understanding data access and movement patterns is vital for informing the design of both future iterations of existing systems and next-generation systems. Yet publicly available traces and tools are scarce, which makes it hard to understand even one system in depth, let alone correlate long-term trends across systems.

Published In

ACM SIGOPS Operating Systems Review  Volume 56, Issue 1
SIGOPS
June 2022
76 pages
ISSN:0163-5980
DOI:10.1145/3544497
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Article
