Abstract
H3 is an embedded object store, backed by a high-performance key-value store. H3 provides a user-friendly object API, similar to Amazon's S3, but is especially tailored for use in "converged" Cloud-HPC environments, where HPC applications expect the underlying storage services to meet strict latency requirements—even for high-level object operations. By embedding the object store in the application, thus avoiding the REST layer, we show that data operations gain significant performance benefits, especially for smaller objects. Additionally, H3's pluggable back-end architecture allows adapting the object store's scale and performance to a variety of deployment requirements. H3 supports several key-value stores, ranging from in-memory services to distributed, RDMA-based implementations. The core of H3 is H3lib, a C library with Python and Java bindings. The H3 ecosystem also includes numerous utilities and compatibility layers: the H3 FUSE filesystem allows object access using file semantics, the CSI H3 implementation uses H3 FUSE for attaching H3-backed persistent volumes in Docker and Kubernetes, and an S3proxy plug-in offers an S3 protocol-compatible endpoint for legacy applications.
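The core idea of the abstract—an S3-like object API embedded in the application process and mapped directly onto a pluggable key-value backend, with no REST layer in between—can be illustrated with a minimal conceptual sketch. Note this is not H3lib's actual API; all class and method names below are hypothetical, and the in-memory dict stands in for the real backends (e.g. Redis or RocksDB) that H3 supports.

```python
# Conceptual sketch (NOT H3lib's real API): an embedded object store
# layered over a pluggable key-value backend. Object operations become
# in-process KV operations, avoiding any network/REST round trip.

class DictKV:
    """Trivial in-memory key-value backend (stands in for Redis, RocksDB, ...)."""
    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data[key]


class EmbeddedObjectStore:
    """S3-like object API running inside the application process."""
    def __init__(self, backend):
        # Pluggable backend: any object with put/get, chosen per deployment.
        self.kv = backend

    def put_object(self, bucket, name, data):
        self.kv.put(f"{bucket}/{name}", data)

    def get_object(self, bucket, name):
        return self.kv.get(f"{bucket}/{name}")


store = EmbeddedObjectStore(DictKV())
store.put_object("my-bucket", "hello.txt", b"hello")
print(store.get_object("my-bucket", "hello.txt"))  # b'hello'
```

Swapping `DictKV` for a distributed backend changes the store's scale and durability without touching the object-level API, which mirrors the pluggable-backend design the abstract describes.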
Notes
1. The 1 MB limit is used here as an example. The maximum part size is configurable.
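The configurable part-size limit mentioned in the note can be sketched as a simple chunking step: an object larger than the maximum part size is stored as a sequence of parts. This is an illustrative sketch of the general multipart idea, not H3's internal part-management code.

```python
# Illustrative sketch: splitting an object into parts bounded by a
# configurable maximum part size (1 MB here, as in the note's example).

MAX_PART_SIZE = 1 * 1024 * 1024  # 1 MB; a configurable limit, not a fixed one


def split_into_parts(data: bytes, max_part_size: int = MAX_PART_SIZE):
    """Return the object's bytes as a list of parts, each <= max_part_size."""
    return [data[i:i + max_part_size]
            for i in range(0, len(data), max_part_size)]


# A 2 MB + 5 byte object becomes two full 1 MB parts plus a 5-byte remainder.
parts = split_into_parts(b"x" * (2 * 1024 * 1024 + 5))
print([len(p) for p in parts])  # [1048576, 1048576, 5]
```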
Acknowledgements
We thankfully acknowledge the support of the European Commission under the Horizon 2020 Framework Programme for Research and Innovation through the project EVOLVE (Grant Agreement No. 825061).
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Chazapis, A., Politis, E., Kalaentzis, G., Kozanitis, C., Bilas, A. (2021). H3: An Application-Level, Low-Overhead Object Store. In: Jagode, H., Anzt, H., Ltaief, H., Luszczek, P. (eds) High Performance Computing. ISC High Performance 2021. Lecture Notes in Computer Science(), vol 12761. Springer, Cham. https://doi.org/10.1007/978-3-030-90539-2_11
DOI: https://doi.org/10.1007/978-3-030-90539-2_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-90538-5
Online ISBN: 978-3-030-90539-2