Abstract:
In this paper, we present a new reference-free, lossless approach to compressing next-generation sequencing (NGS) data in FASTQ format. The approach splits the input FASTQ data into three parts (metadata, short reads, and quality scores) and eliminates the redundancy in each part independently, according to its own characteristics. Experiments were conducted on five real-world NGS datasets. The results show that the proposed algorithm achieves better compression gain than previous state-of-the-art compression algorithms.
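The three-way split described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the function names are hypothetical, zlib stands in for the stream-specific coders the abstract does not detail, and the input is assumed to use four-line FASTQ records with a bare "+" separator line.

```python
import zlib


def split_fastq(text):
    """Split FASTQ text into (metadata, reads, qualities) lists.

    Assumes well-formed four-line records: header, bases, '+', qualities.
    """
    lines = text.strip().split("\n")
    if len(lines) % 4 != 0:
        raise ValueError("expected four lines per FASTQ record")
    metadata = lines[0::4]   # '@'-prefixed header lines
    reads = lines[1::4]      # nucleotide sequences
    quals = lines[3::4]      # quality-score strings
    return metadata, reads, quals


def compress_streams(text):
    """Compress each stream independently (zlib is only a placeholder coder)."""
    return [
        zlib.compress("\n".join(stream).encode("ascii"), 9)
        for stream in split_fastq(text)
    ]


def decompress_streams(blobs):
    """Rebuild the FASTQ text; lossless only when separator lines are a bare '+'."""
    metadata, reads, quals = (
        zlib.decompress(blob).decode("ascii").split("\n") for blob in blobs
    )
    records = []
    for header, read, qual in zip(metadata, reads, quals):
        records.extend([header, read, "+", qual])
    return "\n".join(records) + "\n"
```

Keeping the streams separate lets each one be modeled on its own terms: headers tend to be highly repetitive, reads draw on a very small alphabet, and quality strings have their own distribution. A real compressor would replace the zlib calls with a coder tuned to each stream.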
Published in: 2017 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB)
Date of Conference: 23-25 August 2017
Date Added to IEEE Xplore: 05 October 2017