GPUFASTQLZ: An Ultra Fast Compression Methodology for Fastq Sequence Data on GPUs | IEEE Conference Publication | IEEE Xplore

Abstract:

The rapid growth of next-generation sequencing (NGS) technology has led to an exponential increase in the volume of genomic data, creating significant challenges in data storage and transfer. Existing sequence data compression solutions often suffer from low throughput and moderate compression ratios, making them inadequate for large-scale genomic data management. We present GPUFASTQLZ, an ultra-fast compression methodology for FASTQ sequence data on GPUs. Leveraging the high parallelism of GPUs, GPUFASTQLZ incorporates several optimizations, including a fast algorithm for field separation, a 2-bit encoding scheme for base fields, and implementations of Illumina binning and the GPULZ compression algorithm. We evaluate GPUFASTQLZ on three datasets across 324 hyperparameter settings. The results show that GPUFASTQLZ outperforms existing compressors, achieving up to a 1300x speedup in compression throughput and a 1.1x improvement in compression ratio compared to GZIP, and exceeding the throughput of the state-of-the-art FASTQ compressor GENOZIP by up to 18x.
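To make the field separation and 2-bit base encoding mentioned in the abstract more concrete, the following is a minimal CPU-side sketch in Python. It is not the authors' GPU implementation: the function names (`split_fields`, `pack_bases`, `unpack_bases`), the A/C/G/T to 0/1/2/3 mapping, and the bit order within each byte are all assumptions made for illustration. The sketch also assumes four-line FASTQ records and reads containing only A, C, G, and T; how GPUFASTQLZ handles other symbols such as N is not stated in the abstract.

```python
# Illustrative sketch only: names, mapping, and bit layout are assumptions,
# not details taken from the GPUFASTQLZ paper.

BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}  # assumed 2-bit codes
BITS_TO_BASE = "ACGT"


def split_fields(fastq_text):
    """Separate four-line FASTQ records into identifier, base, and quality streams."""
    lines = fastq_text.strip().split("\n")
    if len(lines) % 4 != 0:
        raise ValueError("expected four lines per FASTQ record")
    identifiers = lines[0::4]  # lines starting with '@'
    bases = lines[1::4]        # nucleotide sequences
    qualities = lines[3::4]    # per-base quality strings
    return identifiers, bases, qualities


def pack_bases(seq):
    """Pack a string over {A, C, G, T} into bytes, four bases per byte."""
    packed = bytearray((len(seq) + 3) // 4)
    for i, base in enumerate(seq):
        # Base i occupies bits 2*(i % 4) .. 2*(i % 4) + 1 of byte i // 4.
        packed[i // 4] |= BASE_TO_BITS[base] << (2 * (i % 4))
    return bytes(packed)


def unpack_bases(packed, length):
    """Invert pack_bases; the original length is needed to drop padding bits."""
    return "".join(
        BITS_TO_BASE[(packed[i // 4] >> (2 * (i % 4))) & 0b11]
        for i in range(length)
    )


if __name__ == "__main__":
    record = "@read1\nGATTACA\n+\nIIIIHHH\n"
    ids, bases, quals = split_fields(record)
    packed = pack_bases(bases[0])
    print(ids[0], len(bases[0]), "bases ->", len(packed), "bytes")
    print(unpack_bases(packed, len(bases[0])))
```

Under these assumptions each base costs 2 bits instead of 8, so the base stream shrinks by roughly 4x before any general-purpose compressor is applied. Each base is encoded independently of its neighbours, which is the kind of per-element work that maps naturally onto GPU threads.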
Date of Conference: 17-22 November 2024
Date Added to IEEE Xplore: 08 January 2025
Conference Location: Atlanta, GA, USA
