Title: Impact of Data Placement on Resilience in Large-Scale Object Storage Systems

Conference

Distributed object storage architectures have become the de facto standard for high-performance storage in big data, cloud, and HPC computing. Object storage deployments using commodity hardware to reduce costs often employ object replication as a method to achieve data resilience. Repairing object replicas after failure is a daunting task for systems with thousands of servers and billions of objects, however, and it is increasingly difficult to evaluate such scenarios at scale on real-world systems. Resilience and availability are both compromised if objects are not repaired in a timely manner. In this work we leverage a high-fidelity discrete-event simulation model to investigate replica reconstruction on large-scale object storage systems with thousands of servers, billions of objects, and petabytes of data. We evaluate the behavior of CRUSH, a well-known object placement algorithm, and identify configuration scenarios in which aggregate rebuild performance is constrained by object placement policies. After determining the root cause of this bottleneck, we then propose enhancements to CRUSH and the usage policies atop it to enable scalable replica reconstruction. We use these methods to demonstrate a simulated aggregate rebuild rate of 410 GiB/s (within 5% of projected ideal linear scaling) on a 1,024-node commodity storage system. We also uncover an unexpected phenomenon in rebuild performance based on the characteristics of the data stored on the system.
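
As a rough illustration of how object placement can bound rebuild parallelism, the following Python sketch counts how many surviving servers share at least one replica group with a failed server. It is a toy model under stated assumptions, not the simulation model or the CRUSH algorithm described in the abstract: placement is uniform pseudo-random, the "placement group" abstraction is assumed here for illustration, and the function name `rebuild_peers` and all of its parameters are hypothetical.

```python
import random


def rebuild_peers(num_servers, num_groups, replicas=3, failed=0, seed=0):
    """Count distinct surviving servers that share a replica group with `failed`.

    Assumption: each group places its replicas on `replicas` distinct servers
    chosen uniformly at random. This stands in for a real placement algorithm
    such as CRUSH; it does not reproduce it.
    """
    rng = random.Random(seed)
    peers = set()
    for _ in range(num_groups):
        members = rng.sample(range(num_servers), replicas)
        if failed in members:
            # Every other member holds a surviving replica and can
            # take part in rebuilding the failed server's data.
            peers.update(m for m in members if m != failed)
    return len(peers)


if __name__ == "__main__":
    # Hypothetical configurations on a 1,024-server system.
    for groups in (64, 1024, 16384):
        n = rebuild_peers(num_servers=1024, num_groups=groups)
        print(f"{groups:6d} groups -> {n} peers can help rebuild server 0")
```

In this toy model, a configuration with few groups leaves only a handful of peers holding the failed server's data, so the rebuild cannot draw on the whole system no matter how many servers it contains. Whether this matches the root cause identified in the paper is not stated in the abstract.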

Research Organization:
Argonne National Lab. (ANL), Argonne, IL (United States)
Sponsoring Organization:
USDOE Office of Science - Office of Advanced Scientific Computing Research
DOE Contract Number:
AC02-06CH11357
OSTI ID:
1366305
Resource Relation:
Conference: 32nd International Conference on Massive Storage Systems and Technology, 05/02/16 - 05/02/16, Santa Clara, CA, US
Country of Publication:
United States
Language:
English

Similar Records

MarFS, a Near-POSIX Interface to Cloud Objects
Journal Article · Sun Jan 01 00:00:00 EST 2017 · ;Login · OSTI ID:1366305

MLEC-Sim: A Simulator for Evaluating Multi-Level Erasure Coding
Software · Fri Apr 12 00:00:00 EDT 2024 · OSTI ID:1366305

Design and Implementation of Ceph: A Scalable Distributed File System
Conference · Wed Apr 19 00:00:00 EDT 2006 · OSTI ID:1366305