research-article

Ultrafast Error-bounded Lossy Compression for Scientific Datasets

Authors:

Franck Cappello

HPDC '22: Proceedings of the 31st International Symposium on High-Performance Parallel and Distributed Computing

Pages 159 - 171

https://doi.org/10.1145/3502181.3531473

Published: 27 June 2022

Abstract

Today's scientific high-performance computing applications and advanced instruments produce vast volumes of data across a wide range of domains, which imposes a serious burden on data transfer and storage. Error-bounded lossy compression has been developed and widely adopted in the scientific community because it can significantly reduce data volumes while strictly controlling data distortion according to a user-specified error bound. Existing lossy compressors, however, cannot offer the ultrafast compression speed demanded by numerous applications and use cases, such as in-memory compression and online instrument data compression. In this paper, we propose a novel ultrafast error-bounded lossy compressor that obtains fairly high compression performance on both CPUs and GPUs with reasonably high compression ratios. The key contributions are threefold. (1) We propose a generic error-bounded lossy compression framework, called SZx, that achieves ultrafast performance through a novel design comprising only lightweight operations, such as bitwise operations and additions/subtractions, while still keeping a high compression ratio. (2) We implement SZx on both CPUs and GPUs and optimize its performance for each architecture. (3) We perform a comprehensive evaluation with six real-world, production-level scientific datasets on both CPUs and GPUs. Experiments show that SZx is 2 to 16 times faster than the second-fastest existing error-bounded lossy compressor (either SZ or ZFP) on CPUs and GPUs, in both compression and decompression.
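
To make the notion of a user-specified error bound concrete, here is a minimal sketch of an error-bounded lossy scheme built only from comparisons and additions/subtractions. It is not the SZx algorithm, whose design the abstract does not detail. It assumes a toy scheme in which a block is replaced by the midpoint of its value range whenever half of that range fits within the bound, and is stored unchanged otherwise. The names `compress`, `decompress`, `max_abs_error`, and the `block_size` parameter are hypothetical and chosen for illustration.

```python
from typing import List, Tuple


def compress(data: List[float], error_bound: float, block_size: int = 4) -> List[Tuple]:
    """Toy error-bounded compressor (illustrative assumption, not SZx)."""
    blocks = []
    for start in range(0, len(data), block_size):
        chunk = data[start:start + block_size]
        lo, hi = min(chunk), max(chunk)
        half_range = (hi - lo) / 2
        if half_range <= error_bound:
            # Every value lies within half_range of the midpoint, so one
            # number can stand in for the whole block (up to float rounding).
            blocks.append(("const", lo + half_range, len(chunk)))
        else:
            # The block varies too much: keep the original values.
            blocks.append(("raw", list(chunk), len(chunk)))
    return blocks


def decompress(blocks: List[Tuple]) -> List[float]:
    """Rebuild the approximate data from the block list."""
    out: List[float] = []
    for kind, payload, count in blocks:
        if kind == "const":
            out.extend([payload] * count)
        else:
            out.extend(payload)
    return out


def max_abs_error(original: List[float], restored: List[float]) -> float:
    """Largest pointwise distortion between the two sequences."""
    return max(abs(x - y) for x, y in zip(original, restored))


if __name__ == "__main__":
    sample = [1.0, 1.25, 1.5, 1.0, 5.0, 9.0, 2.0, 7.0]
    packed = compress(sample, error_bound=0.25)
    print(packed)
    print(max_abs_error(sample, decompress(packed)))
```

In this sketch, smooth blocks shrink to a single value while rough blocks pass through untouched, so the pointwise distortion never exceeds the requested bound. A practical compressor would also encode the non-constant blocks compactly, which this example leaves out.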

Index Terms

Ultrafast Error-bounded Lossy Compression for Scientific Datasets
1. Computing methodologies
  1. Parallel computing methodologies
    1. Parallel algorithms
      1. Massively parallel algorithms
2. Theory of computation
  1. Design and analysis of algorithms
    1. Data structures design and analysis
      1. Data compression

Published In

HPDC '22: Proceedings of the 31st International Symposium on High-Performance Parallel and Distributed Computing

June 2022

314 pages

ISBN:9781450391993

DOI:10.1145/3502181

General Chairs: Jon Weissman (University of Minnesota, MN, USA) and Abhishek Chandra (University of Minnesota, MN, USA)

Program Chairs: Ada Gavrilovska (Georgia Institute of Technology, GA, USA) and Devesh Tiwari (Northeastern University, MA, USA)

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

ARAMCO
U.S. Department of Energy Office of Science and Office of Advanced Scientific Computing Research (ASCR)
U.S. Department of Energy

Conference

HPDC '22

HPDC '22: The 31st International Symposium on High-Performance Parallel and Distributed Computing

June 27 - July 1, 2022

Minneapolis, MN, USA

Acceptance Rates

Overall Acceptance Rate 166 of 966 submissions, 17%

Bibliometrics

Total Citations: 18

Total Downloads: 453

Downloads (last 12 months): 235

Downloads (last 6 weeks): 38

Reflects downloads up to 16 Feb 2025

Citations

Cited By

Su Z, Ahmed A, Wang Z, Anwar A, Cheng Y (2024). Everything You Always Wanted to Know About Storage Compressibility of Pre-Trained ML Models but Were Afraid to Ask. Proceedings of the VLDB Endowment 17:8 (2036-2049). Online publication date: 31-May-2024
https://dl.acm.org/doi/10.14778/3659437.3659456
Nguyen T, Rahman M, Di S, Becchi M (2024). Significantly Improving Fixed-Ratio Compression Framework for Resource-limited Applications. Proceedings of the 53rd International Conference on Parallel Processing (845-855). Online publication date: 12-Aug-2024
https://dl.acm.org/doi/10.1145/3673038.3673092
Song S, Huang Y, Jiang P, Yu X, Zheng W, Di S, Cao Q, Feng Y, Xie Z, Cappello F, Mencagli G, Dazzi P, Lowenthal D, Badia R (2024). CereSZ: Enabling and Scaling Error-bounded Lossy Compression on Cerebras CS-2. Proceedings of the 33rd International Symposium on High-Performance Parallel and Distributed Computing (309-321). Online publication date: 3-Jun-2024
https://dl.acm.org/doi/10.1145/3625549.3658691
Naraparaju R, Zhao T, Hu Y, Zhao D, Guo L, Tallent N (2024). Shifting Between Compute and Memory Bounds: A Compression-Enabled Roofline Model. Proceedings of the SC '24 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis (309-316). Online publication date: 17-Nov-2024
https://dl.acm.org/doi/10.1109/SCW63240.2024.00047
Liu Y, Jia W, Yang T, Yin M, Jin S (2024). Enhancing Lossy Compression Through Cross-Field Information for Scientific Applications. Proceedings of the SC '24 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis (300-308). Online publication date: 17-Nov-2024
https://dl.acm.org/doi/10.1109/SCW63240.2024.00046
Agarwal T, Di S, Huang J, Huang Y, Gopalakrishnan G, Underwood R, Zhao K, Liang X, Li G, Cappello F (2024). SZOps: Scalar Operations for Error-bounded Lossy Compressor for Scientific Data. Proceedings of the SC '24 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis (260-269). Online publication date: 17-Nov-2024
https://dl.acm.org/doi/10.1109/SCW63240.2024.00042
Huang J, Di S, Yu X, Zhai Y, Liu J, Jian Z, Liang X, Zhao K, Lu X, Chen Z, Cappello F, Guo Y, Thakur R (2024). hZCCL: Accelerating Collective Communication with Co-Designed Homomorphic Compression. SC24: International Conference for High Performance Computing, Networking, Storage and Analysis (1-15). Online publication date: 17-Nov-2024
https://doi.org/10.1109/SC41406.2024.00110
Huang Y, Di S, Li G, Cappello F (2024). CUSZP2: A GPU Lossy Compressor with Extreme Throughput and Optimized Compression Ratio. SC24: International Conference for High Performance Computing, Networking, Storage and Analysis (1-18). Online publication date: 17-Nov-2024
https://doi.org/10.1109/SC41406.2024.00021
Jian Z, Di S, Liu J, Zhao K, Liang X, Xu H, Underwood R, Wu S, Huang J, Chen Z, Cappello F (2024). CliZ: Optimizing Lossy Compression for Climate Datasets with Adaptive Fine-tuned Data Prediction. 2024 IEEE International Parallel and Distributed Processing Symposium (IPDPS) (417-429). Online publication date: 27-May-2024
https://doi.org/10.1109/IPDPS57955.2024.00044
Li Y, Kashyap A, Chen W, Guo Y, Lu X (2024). Accelerating Lossy and Lossless Compression on Emerging BlueField DPU Architectures. 2024 IEEE International Parallel and Distributed Processing Symposium (IPDPS) (373-385). Online publication date: 27-May-2024
https://doi.org/10.1109/IPDPS57955.2024.00040
