Czip: A Fast Lossless Compression Algorithm for Climate Data

International Journal of Parallel Programming

Abstract

Climate data have increased dramatically in volume in recent years, which poses considerable challenges for data storage, archiving and sharing. In this paper, we propose a lossless compression algorithm for climate data, named czip. We efficiently eliminate data redundancy through several new methods, including adaptive prediction, exclusive OR (XOR) differencing, multiway compression and static regions. To utilize the multiple cores available on modern computers, czip is implemented in parallel. Experimental results show that czip achieves outstanding compression ratios as well as high deflating and inflating throughputs: on a server with 16 cores, it reaches a deflating throughput of 800 MB/s and an inflating throughput of over 2600 MB/s.
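To make the idea of XOR differencing concrete, the following is a minimal Python sketch and not the czip implementation. It assumes 32-bit floats and uses the previous value as a simple stand-in predictor; the adaptive prediction, multiway compression and static-region steps named in the abstract are not modeled. The function names `float_to_bits`, `bits_to_float`, `xor_encode` and `xor_decode` are hypothetical. When neighboring values are close, their bit patterns share leading bits, so the XOR residual has many leading zeros that a later entropy coder could exploit.

```python
import struct


def float_to_bits(x: float) -> int:
    """Reinterpret a value (rounded to float32) as an unsigned 32-bit integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_float(b: int) -> float:
    """Reinterpret an unsigned 32-bit integer as a float32 value."""
    return struct.unpack("<f", struct.pack("<I", b))[0]


def xor_encode(values):
    """XOR each value's bit pattern with the prediction.

    Assumption: the prediction is simply the previous value's bits
    (0 for the first element). A real compressor would use a better
    predictor and then entropy-code the residuals.
    """
    prev = 0
    residuals = []
    for v in values:
        bits = float_to_bits(v)
        residuals.append(bits ^ prev)
        prev = bits
    return residuals


def xor_decode(residuals):
    """Invert xor_encode exactly, so the scheme is lossless."""
    prev = 0
    values = []
    for r in residuals:
        bits = r ^ prev
        values.append(bits_to_float(bits))
        prev = bits
    return values


if __name__ == "__main__":
    # Hypothetical temperature samples in kelvin.
    samples = [287.15, 287.16, 287.2, 287.2]
    for r in xor_encode(samples):
        print(f"{r:032b}")
```

Because XOR is its own inverse, decoding reproduces the float32 bit patterns exactly, and a repeated value yields an all-zero residual.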

Acknowledgments

The authors would like to thank the editor and the anonymous reviewers for their valuable comments. This study was supported by funding from the National Natural Science Foundation of China (41375102), the National Grand Fundamental Research 973 Program of China (No. 2014CB347800), and the National High Technology Development Program of China (2011AA01A203).

Author information

Corresponding author

Correspondence to Xiaomeng Huang.

About this article

Cite this article

Huang, X., Ni, Y., Chen, D. et al. Czip: A Fast Lossless Compression Algorithm for Climate Data. Int J Parallel Prog 44, 1248–1267 (2016). https://doi.org/10.1007/s10766-016-0403-z

  • DOI: https://doi.org/10.1007/s10766-016-0403-z
