Algorithms for Genomic Analysis

  • Reference work entry

Article Outline

Abstract

Introduction

Phylogenetic Analysis

  Methods Based on Pairwise Distance

  Parsimony Methods

  Maximum Likelihood Methods

Multiple Sequence Alignment

  Scoring Alignment

  Alignment Approaches

  Progressive Algorithms

  Graph-Based Algorithms

  Iterative Algorithms

Novel Graph-Theoretical Genomic Models

  Definitions

  Construction of a Conflict Graph from Paths of Multiple Sequences

  Complexity Theory

  Special Cases of MWCMS

  Computational Models: Integer Programming Formulation

Summary

Acknowledgement

References

Copyright information

© 2008 Springer-Verlag

About this entry

Cite this entry

Lee, E.K., Gupta, K. (2008). Algorithms for Genomic Analysis. In: Floudas, C., Pardalos, P. (eds) Encyclopedia of Optimization. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-74759-0_9