Abstract
Recent advances in DNA sequencing technology, from first-generation sequencing (FGS) to third-generation sequencing (TGS), have repeatedly transformed the genome research landscape. Data throughput is now unprecedented, severalfold higher than that of past technologies. The sequencing data generated are big, sparse, and heterogeneous, which has driven the rapid development of various data protocols and bioinformatics tools for handling them.
This review takes a historical snapshot of DNA sequencing, with an emphasis on data manipulation and tools. The technological history of DNA sequencing is described and reviewed in detail. The data protocols used to manipulate the generated sequencing data are then introduced and reviewed. In particular, data compression methods are highlighted and discussed to give readers a practical perspective on real-world settings. A large variety of bioinformatics tools are also reviewed to help readers get the most from their sequencing data in tasks such as sequencing quality control, genomic visualization, single-nucleotide variant calling, INDEL calling, structural variation calling, and integrative analysis. Toward the end of the article, we critically discuss the pitfalls of existing DNA sequencing technologies and potential solutions.
- William R. Pearson and David J. Lipman. 1988. Improved tools for biological sequence comparison. Proceedings of the National Academy of Sciences 85, 8, 2444--2448.Google ScholarCross Ref
- Mihai Pop and Steven L. Salzberg. 2008. Bioinformatics challenges of new sequencing technology. Trends in Genetics 24, 3, 142--149.Google ScholarCross Ref
- J. Quick, N. J. Loman, S. Duraffour, J. T. Simpson, E. Severi, L. Cowley, J. A. Bore, R. Koundouno, G. Dudas, A. Mikhail, N. Ouedraogo, B. Afrough, A. Bah, J. H. Baum, B. Becker-Ziaja, J. P. Boettcher, M. Cabeza-Cabrerizo, A. Camino-Sanchez, L. L. Carter, J. Doerrbecker, T. Enkirch, I. Garcia-Dorival, N. Hetzelt, J. Hinzmann, T. Holm, L. E. Kafetzopoulou, M. Koropogui, A. Kosgey, E. Kuisma, C. H. Logue, A. Mazzarelli, S. Meisel, M. Mertens, J. Michel, D. Ngabo, K. Nitzsche, E. Pallasch, L. V. Patrono, J. Portmann, J. G. Repits, N. Y. Rickett, A. Sachse, K. Singethan, I. Vitoriano, R. L. Yemanaberhan, E. G. Zekeng, T. Racine, A. Bello, A. A. Sall, O. Faye, O. Faye, N. Magassouba, C. V. Williams, V. Amburgey, L. Winona, E. Davis, J. Gerlach, F. Washington, V. Monteil, M. Jourdain, M. Bererd, A. Camara, H. Somlare, A. Camara, M. Gerard, G. Bado, B. Baillet, D. Delaune, K. Y. Nebie, A. Diarra, Y. Savane, R. B. Pallawo, G. J. Gutierrez, N. Milhano, I. Roger, C. J. Williams, F. Yattara, K. Lewandowski, J. Taylor, P. Rachwal, D. J. Turner, G. Pollakis, J. A. Hiscox, D. A. Matthews, M. K. O’Shea, A. M. Johnston, D. Wilson, E. Hutley, E. Smit, A. Di Caro, R. Wolfel, K. Stoecker, E. Fleischmann, M. Gabriel, S. A. Weller, L. Koivogui, B. Diallo, S. Keita, A. Rambaut, P. Formenty, S. Gunther, and M. W. Carroll. 2016. Real-time, portable genome sequencing for Ebola surveillance. Nature 530, 7589, 228--232.Google Scholar
- A. R. Quinlan, R. A. Clark, S. Sokolova, M. L. Leibowitz, Y. Zhang, M. E. Hurles, J. C. Mell, and I. M. Hall. 2010. Genome-wide mapping and assembly of structural variant breakpoints in the mouse genome. Genome Research 20, 5, 623--635.Google ScholarCross Ref
- Richard Redon, Shumpei Ishikawa, Karen R. Fitch, Lars Feuk, George H. Perry, T. Daniel Andrews, Heike Fiegler, Michael H. Shapero, Andrew R. Carson, Wenwei Chen, Eun Kyung Cho, Stephanie Dallaire, Jennifer L. Freeman, Juan R. González, Mònica Gratacòs, Jing Huang, Dimitrios Kalaitzopoulos, Daisuke Komura, Jeffrey R. MacDonald, Christian R. Marshall, Rui Mei, Lyndal Montgomery, Kunihiro Nishimura, Kohji Okamura, Fan Shen, Martin J. Somerville, Joelle Tchinda, Armand Valsesia, Cara Woodwark, Fengtang Yang, Junjun Zhang, Tatiana Zerjal, Jane Zhang, Lluis Armengol, Donald F. Conrad, Xavier Estivill, Chris Tyler-Smith, Nigel P. Carter, Hiroyuki Aburatani, Charles Lee, Keith W. Jones, Stephen W. Scherer, and Matthew E. Hurles. 2006. Global variation in copy number in the human genome. Nature 444, 7118, 444--454.Google Scholar
- A. Rhoads and K. F. Au. 2015. PacBio sequencing and its applications. Genomics Proteomics Bioinformatics 13, 5, 278--289.Google ScholarCross Ref
- Manuel A. Rivas, Mélissa Beaudoin, Agnes Gardet, Christine Stevens, Yashoda Sharma, Clarence K. Zhang, Gabrielle Boucher, Stephan Ripke, David Ellinghaus, Noel Burtt, Tim Fennell, Andrew Kirby, Anna Latiano, Philippe Goyette, Todd Green, Jonas Halfvarson, Talin Haritunians, Joshua M. Korn, Finny Kuruvilla, Caroline Lagacé, Benjamin Neale, Ken Sin Lo, Phil Schumm, Leif Törkvist, Marla C. Dubinsky, Steven R. Brant, Mark S. Silverberg, Richard H. Duerr, David Altshuler, Stacey Gabriel, Guillaume Lettre, Andre Franke, Mauro D’Amato, Dermot P. B. McGovern, Judy H. Cho, John D. Rioux, Ramnik J. Xavier, Mark J. Daly, John D. Rioux, Ramnik J. Xavier, and Mark J. Daly. 2011. Deep resequencing of GWAS loci identifies independent rare variants associated with inflammatory bowel disease. Nature Genetics 43, 11, 1066--1073.Google ScholarCross Ref
- N. D. Roberts, R. D. Kortschak, W. T. Parker, A. W. Schreiber, S. Branford, H. S. Scott, G. Glonek, and D. L. Adelson. 2013. A comparative analysis of algorithms for somatic SNV detection in cancer. Bioinformatics 29, 18, 2223--2230.Google ScholarCross Ref
- Holger Rohde, Junjie Qin, Yujun Cui, Dongfang Li, Nicholas J. Loman, Moritz Hentschke, Wentong Chen, Fei Pu, Yangqing Peng, Junhua Li, et al. 2011. Open-source genomic analysis of Shiga-toxin--producing E. coli O104: H4. New England Journal of Medicine 365, 8, 718--724.Google ScholarCross Ref
- M. G. Ross, C. Russ, M. Costello, A. Hollinger, N. J. Lennon, R. Hegarty, C. Nusbaum, and D. B. Jaffe. 2013. Characterizing and measuring bias in sequence data. Genome Biol. 14, 5, R51.Google ScholarCross Ref
- M. Rubio-Camarillo, G. Gomez-Lopez, J. M. Fernandez, A. Valencia, and D. G. Pisano. 2013. RUbioSeq: A suite of parallelized pipelines to automate exome variation and bisulfite-seq analyses. Bioinformatics 29, 13, 1687--1689.Google ScholarCross Ref
- Nicole Rusk. 2009. Cheap third-generation sequencing. Nature Methods 6, 4, 244--244.Google ScholarCross Ref
- Nicole Rusk. 2011. Torrents of sequence. Nature Methods 8, 1, 44--44.Google Scholar
- Frederick Sanger, Steven Nicklen, and Alan R. Coulson. 1977. DNA sequencing with chain-terminating inhibitors. Proceedings of the National Academy of Sciences 74, 12, 5463--5467.Google ScholarCross Ref
- Christopher T. Saunders, Wendy S. W. Wong, Sajani Swamy, Jennifer Becq, Lisa J. Murray, and R. Keira Cheetham. 2012. Strelka: Accurate somatic small-variant calling from sequenced tumor-normal sample pairs. Bioinformatics (Oxford, England) 28, 14, 1811--7. Google ScholarDigital Library
- Eric E. Schadt, Steve Turner, and Andrew Kasarskis. 2010. A window into third-generation sequencing. Human Molecular Genetics 19, R2 (2010), R227--R240.Google ScholarCross Ref
- Michael C. Schatz. 2009. CloudBurst: Highly sensitive read mapping with MapReduce. Bioinformatics (Oxford, England) 25, 11, 1363--9. Google ScholarDigital Library
- Stephan C. Schuster. 2007. Next-generation sequencing transforms today’s biology. Nature 200, 8, 16--18.Google Scholar
- Jana Marie Schwarz, Christian Rödelsperger, Markus Schuelke, and Dominik Seelow. 2010. MutationTaster evaluates disease-causing potential of sequence alterations. Nature Methods 7, 8, 575--576.Google ScholarCross Ref
- Jay Shendure and Hanlee Ji. 2008. Next-generation DNA sequencing. Nature Biotechnology 26, 10, 1135--1145.Google ScholarCross Ref
- C. Sloggett, N. Goonasekera, and E. Afgan. 2013. BioBlend: Automating pipeline analyses within Galaxy and CloudMan. Bioinformatics 29, 13, 1685--1686.Google ScholarCross Ref
- L. F. Stead, K. M. Sutton, G. R. Taylor, P. Quirke, and P. Rabbitts. 2013. Accurately identifying low-allelic fraction variants in single samples with next-generation sequencing: Applications in tumor subclone resolution. Hum. Mutat. 34, 10, 1432--1438.Google ScholarCross Ref
- Zachary D. Stephens, Skylar Y. Lee, Faraz Faghri, Roy H. Campbell, Chengxiang Zhai, Miles J. Efron, Ravishankar Iyer, Michael C. Schatz, Saurabh Sinha, and Gene E. Robinson. 2015. Big data: Astronomical or genomical? PLoS Biology 13, 7, e1002195.Google ScholarCross Ref
- Bianca Stöcker, Johannes Köster, and Sven Rahmann. 2016. SimLoRD--Simulation of long read data. Bioinformatics 32, 17 (2016), 2704--2706.Google ScholarCross Ref
- Michael R. Stratton, Peter J. Campbell, and P. Andrew Futreal. 2009. The cancer genome. Nature 458, 7239, 719--724.Google Scholar
- Peter H. Sudmant, Tobias Rausch, Eugene J. Gardner, Robert E. Handsaker, Alexej Abyzov, John Huddleston, Yan Zhang, Kai Ye, Goo Jun, Markus Hsi-Yang Fritz, Miriam K. Konkel, Ankit Malhotra, Adrian M. Stütz, Xinghua Shi, Francesco Paolo Casale, Jieming Chen, Fereydoun Hormozdiari, Gargi Dayama, Ken Chen, Maika Malig, Mark J. P. Chaisson, Klaudia Walter, Sascha Meiers, Seva Kashin, Erik Garrison, Adam Auton, Hugo Y. K. Lam, Xinmeng Jasmine Mu, Can Alkan, Danny Antaki, Taejeong Bae, Eliza Cerveira, Peter Chines, Zechen Chong, Laura Clarke, Elif Dal, Li Ding, Sarah Emery, Xian Fan, Madhusudan Gujral, Fatma Kahveci, Jeffrey M. Kidd, Yu Kong, Eric-Wubbo Lameijer, Shane McCarthy, Paul Flicek, Richard A. Gibbs, Gabor Marth, Christopher E. Mason, Androniki Menelaou, Donna M. Muzny, Bradley J. Nelson, Amina Noor, Nicholas F. Parrish, Matthew Pendleton, Andrew Quitadamo, Benjamin Raeder, Eric E. Schadt, Mallory Romanovitch, Andreas Schlattl, Robert Sebra, Andrey A. Shabalin, Andreas Untergasser, Jerilyn A. Walker, Min Wang, Fuli Yu, Chengsheng Zhang, Jing Zhang, Xiangqun Zheng-Bradley, Wanding Zhou, Thomas Zichner, Jonathan Sebat, Mark A. Batzer, Steven A. McCarroll, Ryan E. Mills, Mark B. Gerstein, Ali Bashir, Oliver Stegle, Scott E. Devine, Charles Lee, Evan E. Eichler, Jan O. Korbel, and Jan O. Korbel. 2015. An integrated map of structural variation in 2,504 human genomes. Nature 526, 7571, 75--81.Google Scholar
- Tamas Szalay and Jene A. Golovchenko. 2015. De novo sequencing and variant calling with Nanopores using PoreSeq. Nature Biotechnology 33, 10, 1087--1091.Google ScholarCross Ref
- Y. Tateno, T. Imanishi, S. Miyazaki, K. Fukami-Kobayashi, N. Saitou, H. Sugawara, and T. Gojobori. 2002. DNA Data Bank of Japan (DDBJ) for genome scale research in life science. Nucleic Acids Research 30, 1, 27--30.Google ScholarCross Ref
- GB Editorial Team. 2011. Closure of the NCBI SRA and implications for the long-term future of genomics data storage. 1--3.Google Scholar
- Helga Thorvaldsdóttir, James T. Robinson, and Jill P. Mesirov. 2013. Integrative Genomics Viewer (IGV): High-performance genomics data visualization and exploration. Briefings in Bioinformatics 14, 2, 178--192.Google ScholarCross Ref
- Erwin L. van Dijk, Hélène Auger, Yan Jaszczyszyn, and Claude Thermes. 2014. Ten years of next-generation sequencing technology. Trends in Genetics 30, 9, 418--426.Google ScholarCross Ref
- Yanqing Wang, Fuhai Song, Junwei Zhu, Sisi Zhang, Yadong Yang, Tingting Chen, Bixia Tang, Lili Dong, Nan Ding, Qian Zhang, et al. 2017. GSA: Genome sequence archive. Genomics, Proteomics and Bioinformatics 15, 1 (2017), 14--18.Google ScholarCross Ref
- Mick Watson, Marian Thomson, Judith Risse, Richard Talbot, Javier Santoyo-Lopez, Karim Gharbi, and Mark Blaxter. 2015. poRe: An R package for the visualization and analysis of Nanopore sequencing data. Bioinformatics 31, 1, 114--115.Google ScholarCross Ref
- Simon J. Watson, Matthijs R. A. Welkers, Daniel P. Depledge, Eve Coulter, Judith M. Breuer, Menno D. de Jong, Paul Kellam, D. D. Richman, E. M. Bunnik, A. Moya, E. Holmes, F. González-Candelas, C. Wang, Y. Mitsuya, B. Gharizadeh, M. Ronaghi, R. W. Shafer, J. Archer, M. S. Braverman, B. E. Taillon, B. Desany, I. James, P. R. Harrigan, M. Lewis, D. L. Robertson, N. Eriksson, L. Pachter, Y. Mitsuya, S-Y. Rhee, C. Wang, B. Gharizadeh, M. Ronaghi, R. W. Shafer, N. Beerenwinkel, J. Archer, G. Baillie, S. J. Watson, P. Kellam, A. Rambaut, D. L. Robertson, K. Nakamura, S. M. Huse, J. A. Huber, H. G. Morrison, M. L. Sogin, D. M. Welch, A. R. Quinian, D. A. Stewart, M. P. Strömberg, G. T. Marth, R. V. Pandey, V. Nolte, J. Boenigk, C. Schlötterer, R. Schmieder, R. Edwards, R. V. Patel, M. Jain, Z. Ning, A. J. Cox, J. C. Mullikin, H. Li, B. Handsaker, A. Wysoker, T. Fennell, J. Ruan, N. Homer, G. Marth, G. Abecasis, R. Durbin, G. Baillie, M. L. Metzker, and A. McKenna. 2013. Viral population analysis and minority-variant detection using short read next-generation sequencing. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences 368, 1614, 20120205.Google Scholar
- Joachim Weischenfeldt, Orsolya Symmons, François Spitz, and Jan O. Korbel. 2013. Phenotypic impact of genomic structural variation: Insights from and for human disease. Nature Reviews Genetics 14, 2, 125--138.Google ScholarCross Ref
- David A. Wheeler, Maithreyan Srinivasan, Michael Egholm, Yufeng Shen, Lei Chen, Amy McGuire, Wen He, Yi-Ju Chen, Vinod Makhijani, G. Thomas Roth, et al. 2008. The complete genome of an individual by massively parallel DNA sequencing. Nature 452, 7189, 872--876.Google Scholar
- K. Wong, T. M. Keane, J. Stalker, and D. J. Adams. 2010. Enhanced structural variant and breakpoint detection using SVMerge by integration of multiple detection methods and local assembly. Genome Biol. 11, 12, R128.Google ScholarCross Ref
- Ka-Chun Wong and Zhaolei Zhang. 2014. SNPdryad: Predicting deleterious non-synonymous human SNPs using only orthologous protein sequences. Bioinformatics (Oxford, England) 30, 8, 1112--1119.Google Scholar
- Chao Xie, Martti T. Tammi, J. Sebat, B. Lakshmi, J. Troge, J. Alexander, J. Young, P. Lundin, S. Månér, H. Massa, M. Walker, M. Chi, N. Navin, R. Lucito, J. Healy, J. Hicks, K. Ye, A. Reiner, T. C. Gilliam, B. Trask, N. Patterson, A. Zetterberg, M. Wigler, A. J. Iafrate, L. Feuk, M. N. Rivera, M. L. Listewnik, P. K. Donahoe, Y. Qi, S. W. Scherer, K. C. Woodwark, G. Cameron, R. Durbin, A. Cox, T. Hubbard, M. Clamp, and W. J. Kent. 2009. CNV-seq, a new method to detect copy number variation using high-throughput sequencing. BMC Bioinformatics 10, 1, 80.Google ScholarCross Ref
- Haibin Xu, Xiang Luo, Jun Qian, Xiaohui Pang, Jingyuan Song, Guangrui Qian, Jinhui Chen, and Shilin Chen. 2012. FastUniq: A fast de novo duplicates removal tool for paired short reads. PLoS ONE 7, 12 (2012), e52249.Google ScholarCross Ref
- Kai Ye, Marcel H. Schulz, Quan Long, Rolf Apweiler, and Zemin Ning. 2009. Pindel: A pattern growth approach to detect break points of large deletions and medium sized insertions from paired-end short reads. Bioinformatics 25, 21, 2865--2871. Google ScholarDigital Library
- Ming Yi, Yongmei Zhao, Li Jia, Mei He, Electron Kebebew, and Robert M. Stephens. 2014. Performance comparison of SNP detection tools with Illumina exome sequencing data an assessment using both family pedigree information and sample matched SNP array data. Nucleic Acids Research 42, 12, e101--e101.Google ScholarCross Ref
- Yongchao Yongchao Liu and Bertil Schmidt. 2014. CUSHAW2-GPU: Empowering faster gapped short-read alignment using GPU computing. IEEE Design and Test 31, 1, 31--39.Google ScholarCross Ref
- S. Yoon, Z. Xuan, V. Makarov, K. Ye, and J. Sebat. 2009. Sensitive and accurate detection of copy number variants using read depth of coverage. Genome Research 19, 9, 1586--1592.Google ScholarCross Ref
- Y. William Yu, Deniz Yorukoglu, Jian Peng, and Bonnie Berger. 2015. Quality score compression improves genotyping accuracy. Nature Biotechnology 33, 3, 240--243.Google ScholarCross Ref
- Peng Yue, Eugene Melamud, John Moult, P. D. Stenson, E. V. Ball, M. Mort, A. D. Phillips, J. A. Shiel, N. S. Thomas, S. Abeysinghe, M. Krawczak, D. N. Cooper, S. T. Sherry, M. H. Ward, M. Kholodov, J. Baker, L. Phan, E. M. Smigielski, K. Sirotkin, G. D. Bader, D. Betel, C. W. Hogue, M. Kanehisa, S. Goto, S. Kawashima, Y. Okuno, M. Hattori, B. J. Stapley, G. Benoit, N. Daraselia, A. Yuryev, S. Egorov, S. Novichkova, M. K. Halushka, J. B. Fan, K. Bentley, L. Hsie, N. Shen, A. Weder, R. Cooper, R. Lipshutz, and A. Chakravarti. 2006. SNPs3D: Candidate gene and SNP selection for association studies. BMC Bioinformatics 7, 1, 166.Google ScholarCross Ref
- Daniel R. Zerbino and Ewan Birney. 2008. Velvet: Algorithms for de novo short read assembly using de Bruijn graphs.Genome Research 18, 5, 821--9.Google Scholar
- Zheng Zhang, Scott Schwartz, Lukas Wagner, and Webb Miller. 2000. A greedy algorithm for aligning DNA sequences. Journal of Computational Biology 7, 1--2, 203--214.Google ScholarCross Ref
- Qian Zhou, Xiaoquan Su, Anhui Wang, Jian Xu, and Kang Ning. 2013. QC-chain: Fast and holistic quality control method for next-generation sequencing data. PLoS ONE 8, 4, e60234.Google ScholarCross Ref