Phylogenetic Profiling of Insertions and Deletions in Vertebrate Genomes

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 3909)

Abstract

Micro-indels are small insertion or deletion events (indels) that occur during genome evolution. The study of micro-indels is important both for understanding the underlying biological mechanisms and for improving the evolutionary models used in sequence alignment and phylogenetic analysis. The inference of micro-indels from multiple sequence alignments of related genomes poses a difficult computational problem, and is far more complicated than the related task of inferring the history of point mutations. We introduce a tree-alignment-based approach that is suitable for working with multiple genomes and that emphasizes the concept of indel history. By working with an appropriately restricted alignment model, we are able to propose an algorithm for inferring the optimal indel history of homologous sequences that is efficient for practical problems. Using data from the ENCODE project as well as related sequences from multiple primates, we compare and contrast indel events in both coding and non-coding regions. The ability to work with multiple sequences allows us to refute a previous claim that indel rates are approximately fixed even when the mutation rate changes, and to show that indel events are not neutral. In particular, we identify indel hotspots in the human genome.

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Snir, S., Pachter, L. (2006). Phylogenetic Profiling of Insertions and Deletions in Vertebrate Genomes. In: Apostolico, A., Guerra, C., Istrail, S., Pevzner, P.A., Waterman, M. (eds) Research in Computational Molecular Biology. RECOMB 2006. Lecture Notes in Computer Science, vol 3909. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11732990_23

  • DOI: https://doi.org/10.1007/11732990_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-33295-4

  • Online ISBN: 978-3-540-33296-1

  • eBook Packages: Computer Science, Computer Science (R0)
