Bayesian Phylogenetic Inference under a Statistical Insertion-Deletion Model

  • Conference paper
Algorithms in Bioinformatics (WABI 2003)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 2812)

Abstract

A central problem in computational biology is the inference of phylogeny from a set of DNA or protein sequences. Currently, this problem is tackled stepwise, with phylogenetic reconstruction dependent on an initial multiple sequence alignment step. However, these two steps are fundamentally interdependent. Whether the main interest is in sequence alignment or in phylogeny, a major goal of computational biology is the co-estimation of both. Here we present a first step towards this goal by developing an extension of the Felsenstein peeling algorithm. Given an alignment, our extension analytically integrates out both substitution and insertion-deletion events within a proper statistical model. This new algorithm addresses two important problems in computational biology. First, indel events become informative for phylogenetic reconstruction. Second, phylogenetic uncertainty can be included in the estimation of insertion-deletion parameters. We illustrate the practicality of this algorithm within a Bayesian Markov chain Monte Carlo framework by applying it to a non-trivial analysis of a multiple alignment of ten globin protein sequences.
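
For readers unfamiliar with the peeling algorithm mentioned in the abstract, the following is a minimal sketch of the classical, substitution-only version. It is not the authors' extension: it ignores insertion-deletion events entirely and would need gap-free columns. The Jukes-Cantor substitution model, the nested-list tree encoding, and all function names are assumptions chosen for illustration and do not come from the paper.

```python
import math

BASES = "ACGT"


def jc69_prob(i, j, t):
    """Jukes-Cantor probability of base i becoming base j along a branch of
    length t (expected substitutions per site). Model choice is an assumption."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if i == j else 0.25 - 0.25 * e


def partial_likelihoods(node, column):
    """Peeling step: for each possible base at `node`, the probability of the
    observed leaf bases below it.

    Hypothetical tree encoding: a leaf is a sequence name (str); an internal
    node is a list of (child, branch_length) pairs.
    """
    if isinstance(node, str):
        observed = column[node]
        return [1.0 if b == observed else 0.0 for b in BASES]
    result = [1.0] * len(BASES)
    for child, branch_length in node:
        child_l = partial_likelihoods(child, column)
        for i, parent_base in enumerate(BASES):
            # Sum over the unobserved base at the child end of the branch.
            result[i] *= sum(
                jc69_prob(parent_base, child_base, branch_length) * child_l[j]
                for j, child_base in enumerate(BASES)
            )
    return result


def column_likelihood(tree, column):
    """Likelihood of one alignment column, with a uniform root distribution."""
    return sum(0.25 * l for l in partial_likelihoods(tree, column))


def alignment_log_likelihood(tree, alignment):
    """Log-likelihood of a gap-free alignment, treating columns as independent."""
    length = len(next(iter(alignment.values())))
    total = 0.0
    for k in range(length):
        column = {name: seq[k] for name, seq in alignment.items()}
        total += math.log(column_likelihood(tree, column))
    return total


if __name__ == "__main__":
    # Toy three-taxon tree and alignment, invented for this example.
    tree = [("a", 0.1), ([("b", 0.2), ("c", 0.2)], 0.05)]
    alignment = {"a": "ACGT", "b": "ACGA", "c": "ACTA"}
    print(alignment_log_likelihood(tree, alignment))
```

The recursion visits each node once per column, so the cost grows linearly with the number of taxa rather than with the number of possible ancestral state assignments. In a Bayesian MCMC setting such a function would typically be called once per proposed tree to evaluate the likelihood term of the acceptance ratio.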


Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lunter, G., Miklós, I., Drummond, A., Jensen, J.L., Hein, J. (2003). Bayesian Phylogenetic Inference under a Statistical Insertion-Deletion Model. In: Benson, G., Page, R.D.M. (eds) Algorithms in Bioinformatics. WABI 2003. Lecture Notes in Computer Science, vol 2812. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39763-2_18

  • DOI: https://doi.org/10.1007/978-3-540-39763-2_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-20076-5

  • Online ISBN: 978-3-540-39763-2

  • eBook Packages: Springer Book Archive
