DOI: 10.1145/3577193.3593736
Research article

Lightweight Huffman Coding for Efficient GPU Compression

Published: 21 June 2023

Abstract

Lossy compression is often deployed in scientific applications to reduce data footprint and improve data transfer and I/O performance. Especially for applications requiring on-the-fly compression, it is essential to minimize the compression runtime. In this paper, we design a scheme to improve the performance of cuSZ, a GPU-based lossy compressor. We observe that Huffman coding - used by cuSZ to compress metadata generated during compression - incurs a performance overhead that can be significant, especially for smaller datasets. Our work seeks to reduce the Huffman coding runtime with minimal-to-no impact on cuSZ's compression efficiency.
Our contributions are as follows. First, we examine a variety of probability distributions to determine which distributions closely model the input to cuSZ's Huffman coding stage. From these distributions, we create a dictionary of pre-computed codebooks such that during compression, a codebook is selected from the dictionary instead of computing a custom codebook. Second, we explore three codebook selection criteria to be applied at runtime. Finally, we evaluate our scheme on real-world datasets and in the context of two important application use cases, HDF5 and MPI, using an NVIDIA A100 GPU. Our evaluation shows that our method can reduce the Huffman coding penalty by a factor of 78--92×, translating to a total speedup of up to 5× over baseline cuSZ. Smaller HDF5 chunk sizes enjoy over an 8× speedup in compression and MPI messages on the scale of tens of MB have a 1.4--30.5× speedup in communication time.
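
To make the codebook-dictionary idea concrete, the sketch below pre-computes Huffman code-length tables for a few candidate symbol distributions and, at runtime, selects the table that minimizes the estimated encoded size of an observed histogram. This is an illustration only, not the paper's implementation: the Laplace candidate family, the scale grid, the estimated-bit-count selection criterion, and names such as precompute_codebooks and select_codebook are assumptions made for this example.

```python
# Minimal sketch (assumptions noted above): pre-computed Huffman codebooks for a
# small family of candidate distributions, plus a runtime selection step that
# scores each codebook by the bits it would spend on the observed histogram.
import heapq

import numpy as np
from scipy import stats

NUM_SYMBOLS = 1024  # e.g., number of quantization bins emitted by the predictor stage


def huffman_code_lengths(probs):
    """Per-symbol Huffman code lengths (in bits) for a probability vector."""
    lengths = np.zeros(len(probs), dtype=np.int64)
    heap = [(p, [s]) for s, p in enumerate(probs) if p > 0]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, syms1 = heapq.heappop(heap)
        p2, syms2 = heapq.heappop(heap)
        for s in syms1 + syms2:
            lengths[s] += 1  # each merge adds one bit to every symbol in the merged subtree
        heapq.heappush(heap, (p1 + p2, syms1 + syms2))
    return lengths


def precompute_codebooks(scales=(1.0, 4.0, 16.0, 64.0)):
    """Offline step: one code-length table per candidate (here, Laplace) distribution."""
    centers = np.arange(NUM_SYMBOLS) - NUM_SYMBOLS // 2  # symbols centered on the zero bin
    books = {}
    for scale in scales:
        pdf = stats.laplace.pdf(centers, scale=scale) + 1e-12  # floor keeps every symbol encodable
        books[scale] = huffman_code_lengths(pdf / pdf.sum())
    return books


def select_codebook(histogram, books):
    """Runtime step: pick the codebook (by key) with the smallest estimated output size."""
    return min(books, key=lambda k: int(np.dot(histogram, books[k])))
```

In this toy version, precompute_codebooks runs once ahead of time, and the per-buffer work reduces to forming a symbol histogram and scoring a handful of tables in select_codebook; replacing per-dataset codebook construction with a dictionary lookup is the trade-off the abstract describes.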

Published In

ICS '23: Proceedings of the 37th ACM International Conference on Supercomputing
June 2023
505 pages
ISBN: 9798400700569
DOI: 10.1145/3577193
Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. compression
  2. Huffman coding
  3. GPU

Qualifiers

  • Research-article

Conference

ICS '23

Acceptance Rates

Overall acceptance rate: 629 of 2,180 submissions (29%)

Cited By

  • Real-Time Decompression and Rasterization of Massive Point Clouds. Proceedings of the ACM on Computer Graphics and Interactive Techniques 7(3), 1-15, Aug 2024. https://doi.org/10.1145/3675373
  • CereSZ: Enabling and Scaling Error-bounded Lossy Compression on Cerebras CS-2. Proceedings of the 33rd International Symposium on High-Performance Parallel and Distributed Computing, 309-321, Jun 2024. https://doi.org/10.1145/3625549.3658691
  • CUSZP2: A GPU Lossy Compressor with Extreme Throughput and Optimized Compression Ratio. SC24: International Conference for High Performance Computing, Networking, Storage and Analysis, 1-18, Nov 2024. https://doi.org/10.1109/SC41406.2024.00021
  • cuSZ-i: High-Ratio Scientific Lossy Compression on GPUs with Optimized Multi-Level Interpolation. Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, 1-15, Nov 2024. https://doi.org/10.1109/SC41406.2024.00019
