Abstract
Since its emergence almost 20 years ago (Schwartz et al., Science 1995), optical mapping has moved from a laboratory technique to a commercially available data generation method. In line with this transition, optical mapping data has only relatively recently started to be used for scaffolding contigs and validating assemblies in large-scale sequencing projects, for example the goat (Dong et al., Nature Biotech. 2013) and Amborella (Chamala et al., Science 2013) genomes. One major hurdle to the wider use of optical mapping data is the efficient alignment of in silico digested contigs to an optical map. We develop Twin to tackle this problem. Twin is the first index-based method for aligning in silico digested contigs to an optical map. Our results demonstrate that Twin is an order of magnitude faster than competing methods on the largest genome. Most importantly, it is specifically designed to handle large eukaryotic genomes and is thus the only non-proprietary method capable of completing the alignment for the budgerigar genome in a reasonable amount of CPU time.
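To make the phrase "in silico digested contigs" concrete, the following is a minimal sketch of the digestion step only, not of Twin itself. It assumes a single restriction enzyme, searches one strand, and cuts at the start of each recognition site; the function name `in_silico_digest` and the example contig are hypothetical. The recognition sequence CTCGAG (XhoI) is used purely as an illustration. The result is the ordered list of fragment lengths that would then be compared against the fragment sizes of an optical map.

```python
from typing import List


def in_silico_digest(contig: str, site: str) -> List[int]:
    """Return ordered fragment lengths from cutting `contig` at each `site`.

    Simplifying assumptions (not taken from the paper): one enzyme,
    forward strand only, and the cut is placed at the first base of
    every occurrence of the recognition site.
    """
    contig = contig.upper()
    site = site.upper()

    # Collect the start position of every occurrence of the site.
    cuts = []
    pos = contig.find(site)
    while pos != -1:
        cuts.append(pos)
        pos = contig.find(site, pos + 1)

    # Fragment boundaries are the contig ends plus all cut positions.
    bounds = [0] + cuts + [len(contig)]

    # Drop zero-length fragments (e.g. a site at the very start).
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


if __name__ == "__main__":
    example = "AAAACTCGAGTTTTTTCTCGAGGG"  # hypothetical toy contig
    print(in_silico_digest(example, "CTCGAG"))
```

Real optical map fragments are measured in kilobases and carry sizing error, so an aligner has to match such length sequences approximately, not exactly.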
References
Alkan, C., Sajjadian, S., Eichler, E.: Limitations of next-generation genome sequence assembly. Nat. Methods 8(1), 61–65 (2010)
Anantharaman, T., Mishra, B.: A probabilistic analysis of false positives in optical map alignment and validation. In: Proc. of WABI, pp. 27–40 (2001)
Antoniotti, M., Anantharaman, T., Paxia, S., Mishra, B.: Genomics via optical mapping IV: sequence validation via optical map matching. Technical report, New York University (2001)
Aston, C., Schwartz, D.: Optical mapping in genomic analysis. John Wiley and Sons, Ltd. (2006)
Bankevich, A., et al.: SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J. Comp. Biol. 19(5), 455–477 (2012)
Bradnam, K.R., et al.: Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species. GigaScience 2(1), 1–31 (2013)
Burrows, M., Wheeler, D.: A block sorting lossless data compression algorithm. Technical Report 124, Digital Equipment Corporation, Palo Alto, California (1994)
Chamala, S., et al.: Assembly and validation of the genome of the nonmodel basal angiosperm Amborella. Science 342(6165), 1516–1517 (2013)
Church, D.M., et al.: Lineage-specific biology revealed by a finished genome assembly of the mouse. PLoS Biology 7(5), e1000112+ (2009)
Dimalanta, et al.: A microfluidic system for large DNA molecule arrays. Anal. Chem. 76(18), 5293–5301 (2004)
Dong, Y., et al.: Sequencing and automated whole-genome optical mapping of the genome of a domestic goat (Capra hircus). Nat. Biotechnol. 31(2), 136–141 (2013)
Ferragina, P., Manzini, G.: Indexing compressed text. J. ACM 52(4), 552–581 (2005)
Gagie, T., Navarro, G., Puglisi, S.J.: New algorithms on wavelet trees and applications to information retrieval. Theor. Comput Sci. 426-427, 25–41 (2012)
Gog, S., Petri, M.: Optimized succinct data structures for massive data. Software Pract. Expr. (to appear)
Gurevich, A., Saveliev, V., Vyahhi, N., Tesler, G.: QUAST: quality assessment tool for genome assemblies. Bioinformatics 29(8), 1072–1075 (2013)
Howard, J.T., et al.: De Novo high-coverage sequencing and annotated assemblies of the budgerigar genome (2013)
Kawahara, Y., et al.: Improvement of the Oryza sativa Nipponbare reference genome using next generation sequence and optical map data. Rice 6(4), 1–10 (2013)
Kent, J.: BLAT–The BLAST-Like Alignment Tool. Genome Res. 12(4), 656–664 (2002)
Lin, H., et al.: AGORA: Assembly Guided by Optical Restriction Alignment. BMC Bioinformatics 12, 189 (2012)
Manber, U., Myers, G.W.: Suffix arrays: A new method for on-line string searches. SIAM J. Comput. 22(5), 935–948 (1993)
Miller, J.R., et al.: Aggressive assembly of pyrosequencing reads with mates. Bioinformatics 24, 2818–2824 (2008)
Nagarajan, N., Read, T.D., Pop, M.: Scaffolding and validation of bacterial genome assemblies using optical restriction maps. Bioinformatics 24(10), 1229–1235 (2008)
Neely, R.K., Deen, J., Hofkens, J.: Optical mapping of DNA: single-molecule-based methods for mapping genome. Biopolymers 95(5), 298–311 (2011)
Reslewic, S., et al.: Whole-genome shotgun optical mapping of Rhodospirillum rubrum. Appl. Environ. Microbiol. 71(9), 5511–5522 (2005)
Ronen, R., Boucher, C., Chitsaz, H., Pevzner, P.: SEQuel: Improving the Accuracy of Genome Assemblies. Bioinformatics 28(12), i188–i196 (2012)
Salzberg, S.: Beware of mis-assembled genomes. Bioinformatics 21(24), 4320–4321 (2005)
Sirén, J., Välimäki, N., Mäkinen, V.: Indexing graphs for path queries with applications in genome research. IEEE/ACM Trans. Comput. Biol. Bioinform. (to appear, 2014)
Teague, B., et al.: High-resolution human genome structure by single-molecule analysis. Proc. Natl. Acad. Sci. 107(24), 10848–10853 (2010)
Thorvaldsdóttir, H., Robinson, J.T., Mesirov, J.P.: Integrative Genomics Viewer (IGV): High-performance Genomics Data Visualization and Exploration. Brief. Bioinform. 14(2), 178–192 (2013)
Valouev, A., et al.: Alignment of optical maps. J. Comp. Biol. 13(2), 442–462 (2006)
VanSteenHouse, H. personal communication (2013)
Zhou, S., et al.: A whole-genome shotgun optical map of Yersinia pestis strain KIM. Appl. Environ. Microbiol. 68(12), 6321–6331 (2002)
Zhou, S., et al.: Shotgun optical mapping of the entire Leishmania major Friedlin genome. Mol. Biochem. Parasitol. 138(1), 97–106 (2004)
Zhou, S., et al.: A single molecule scaffold for the maize genome. PLoS Genet. 5(11), e1000711 (2009)
Zhou, S., et al.: Validation of rice genome sequence by optical mapping. BMC Genomics 8(1), 278 (2007)
Zimin, A., et al.: Sequencing and assembly of the 22-Gb loblolly pine genome. Genetics 196(3), 875–890 (2014)
© 2014 Springer-Verlag Berlin Heidelberg
Muggli, M.D., Puglisi, S.J., Boucher, C. (2014). Efficient Indexed Alignment of Contigs to Optical Maps. In: Brown, D., Morgenstern, B. (eds) Algorithms in Bioinformatics. WABI 2014. Lecture Notes in Computer Science, vol 8701. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44753-6_6
DOI: https://doi.org/10.1007/978-3-662-44753-6_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-44752-9
Online ISBN: 978-3-662-44753-6
eBook Packages: Computer Science, Computer Science (R0)