Abstract
Next-Generation Sequencing (NGS) technologies have reshaped the landscape of life sciences. The massive amount of data generated by NGS is rapidly transforming biological research from traditional wet-lab work into a data-intensive analytical discipline (Koboldt et al., Cell 155(1):27–38, 2013). The Illumina “sequencing by synthesis” technique (Mardis, Annu Rev Genomics Hum Genet 9:387–402, 2008) is one of the most popular and widely used NGS technologies.
References
Koboldt, D.C., Steinberg, K.M., Larson, D.E., Wilson, R.K., Mardis, E.R.: The next-generation sequencing revolution and its impact on genomics. Cell 155(1), 27–38 (2013)
Mardis, E.R.: Next-generation DNA sequencing methods. Annu. Rev. Genomics Hum. Genet. 9, 387–402 (2008)
Zhang, J., Kobert, K., Flouri, T., Stamatakis, A.: PEAR: a fast and accurate Illumina Paired-End reAd mergeR. Bioinformatics (Oxford, England) 30(5), 614–620 (2014)
Masella, A.P., Bartram, A.K., Truszkowski, J.M., Brown, D.G., Neufeld, J.D.: PANDAseq: paired-end assembler for illumina sequences. BMC Bioinf. 13(1), 31 (2012)
Magoč, T., Salzberg, S.L.: FLASH: fast length adjustment of short reads to improve genome assemblies. Bioinformatics (Oxford, England) 27(21), 2957–2963 (2011)
Rognes, T., Flouri, T., Nichols, B., Quince, C., Mahé, F.: VSEARCH: a versatile open source tool for metagenomics. PeerJ 4, e2584 (2016)
Paszkiewicz, K., Studholme, D.J.: De novo assembly of short sequence reads. Brief. Bioinform. 11(5), 457–472 (2010). [Online] Available: http://bib.oxfordjournals.org/content/11/5/457.abstract
Nakamura, K., Oshima, T., Morimoto, T., Ikeda, S., Yoshikawa, H., Shiwa, Y., Ishikawa, S., Linak, M.C., Hirai, A., Takahashi, H., Altaf-Ul-Amin, M., Ogasawara, N., Kanaya, S.: Sequence-specific error profile of Illumina sequencers. Nucleic Acids Res. 39(13), e90 (2011)
Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol. 48(3), 443–453 (1970)
Gotoh, O.: An improved algorithm for matching biological sequences. J. Mol. Biol. 162(3), 705–708 (1982)
Smith, T., Waterman, M.: Identification of common molecular subsequences. J. Mol. Biol. 147(1), 195–197 (1981)
Levenshtein, V.I.: Binary codes capable of correcting deletions, insertions and reversals. Dokl. Akad. Nauk SSSR 163(4), 845–848 (1965)
Hamming, R.: Error detecting and error correcting codes. Bell Syst. Tech. J. 29(2), 147–160 (1950)
Rognes, T., Seeberg, E.: Six-fold speed-up of Smith-Waterman sequence database searches using parallel processing on common microprocessors. Bioinformatics 16(8), 699–706 (2000)
Altschul, S., Gish, W.: Local alignment statistics. Methods Enzymol. 266, 460–480 (1996)
Langmead, B., Salzberg, S.L.: Fast gapped-read alignment with Bowtie 2. Nat. Methods 9(4), 357–359 (2012)
Gusfield, D.: Algorithms on Strings, Trees, and Sequences – Computer Science and Computational Biology. Cambridge University Press, Cambridge (1997)
Quail, M.A., Smith, M., Coupland, P., Otto, T.D., Harris, S.R., Connor, T.R., Bertoni, A., Swerdlow, H.P., Gu, Y.: A tale of three next generation sequencing platforms: comparison of Ion Torrent, Pacific Biosciences and Illumina MiSeq sequencers. BMC Genomics 13(1), 341 (2012)
Ewing, B., Green, P.: Base-calling of automated sequencer traces using Phred. II. Error probabilities. Genome Res. 8(3), 186–194 (1998)
Edgar, R.C., Flyvbjerg, H.: Error filtering, pair assembly and error correction for next-generation sequencing reads. Bioinformatics 31(21), 3476 (2015)
Acknowledgements
T.F. is supported by DFG project STA/860-4. L.C., K.K. and J.Z. are funded by a HITS scholarship.
Copyright information
© 2017 Springer International Publishing AG
About this chapter
Cite this chapter
Flouri, T., Zhang, J., Czech, L., Kobert, K., Stamatakis, A. (2017). An Efficient Approach to Merging Paired-End Reads and Incorporation of Uncertainties. In: Elloumi, M. (eds) Algorithms for Next-Generation Sequencing Data. Springer, Cham. https://doi.org/10.1007/978-3-319-59826-0_13
DOI: https://doi.org/10.1007/978-3-319-59826-0_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-59824-6
Online ISBN: 978-3-319-59826-0
eBook Packages: Computer Science, Computer Science (R0)