Global Pairwise Protein Sequence Alignment via Mixed-Integer Linear Optimization

  • Reference work entry
Encyclopedia of Optimization

Introduction

Sequence alignment methods aim both to identify related protein sequences and to determine the best alignment between them. An alignment provides a rough measure of evolutionary distance and may indicate relationships between the structure and function of similar sequences. To quantify the evolutionary distance between aligned residues, several scoring matrices have been developed, based on the percent of accepted mutations (PAM) [3] and on protein blocks (BLOSUM) [5].

The pairwise sequence alignment problem is most commonly addressed through either (i) global alignment or (ii) local alignment techniques. Global alignment algorithms seek the highest-scoring overall alignment spanning the full length of both sequences. One widely used method for this problem is the dynamic programming algorithm proposed by Needleman and Wunsch [10].
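As an illustration of the dynamic programming idea behind global alignment, the following is a minimal Python sketch in the commonly taught textbook form. It is not the mixed-integer formulation this entry is about, and it makes several simplifying assumptions: a simple match/mismatch score stands in for a PAM or BLOSUM matrix, gaps carry a fixed linear penalty, and the function name and default parameter values are hypothetical choices made for this example.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align strings a and b; return (score, aligned_a, aligned_b).

    Assumptions for illustration only: match/mismatch scoring instead of a
    PAM/BLOSUM matrix, and a linear gap penalty.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap  # a[:i] aligned against gaps only
    for j in range(1, m + 1):
        score[0][j] = j * gap  # b[:j] aligned against gaps only

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # a[i-1] against a gap
                score[i][j - 1] + gap,      # b[j-1] against a gap
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

With the default parameters assumed here, `needleman_wunsch("ACD", "AD")` returns a score of 0 with the alignment `ACD` over `A-D`. When several alignments tie for the best score, the traceback reports only one of them.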

Proteins may share sequence similarity in some regions, but not in others. Local...

References

  1. Altschul S, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215:403–410
  2. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucl Acids Res 25:3389–3402
  3. Dayhoff M, Schwartz R, Orcutt B (1978) A model of evolutionary change in proteins. Atlas of Protein Sequences and Structures, vol 5. National Biomedical Research Foundation, Washington DC
  4. He D, Arslan AN (2005) A space-efficient algorithm for the constrained pairwise sequence alignment problem. Genome Inform 16:237–246
  5. Henikoff S, Henikoff JG (1992) Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci USA 89:10915–10919
  6. Floudas CA (1995) Nonlinear and Mixed-integer Optimization: Fundamentals and Applications. Oxford University Press, New York
  7. Loose C, Klepeis JL, Floudas CA (2004) A new pairwise folding potential based on improved decoy generation and side-chain packing. Prot Struct Funct Bioinf 54:303–314
  8. McAllister SR, Rajgaria R, Floudas CA (2007) A template-based mixed integer linear programming sequence alignment model. In: Torn A, Zilinskas J (eds) Models and Algorithms for Global Optimization, Springer Optimization and Its Applications. Springer, New York, pp 343–360
  9. McAllister SR, Rajgaria R, Floudas CA (2007) Global pairwise sequence alignment through mixed integer linear programming: A template free approach. Optim Method Softw 22:127–144
  10. Needleman SB, Wunsch CD (1970) A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol 48:443–453
  11. Pearson WR (1990) Rapid and sensitive sequence comparison with fastp and fasta. Methods Enzymol 183:63–98
  12. Pearson WR, Lipman DJ (1988) Improved tools for biological sequence comparison. Proc Natl Acad Sci USA 85:2444–2448
  13. Rajgaria R, McAllister SR, Floudas CA (2006) A novel high resolution C-alpha C-alpha distance dependent force field based on a high quality decoy set. Prot Struct Funct Bioinf 65:726–741
  14. Smith TF, Waterman MS (1981) Identification of common molecular subsequences. J Mol Biol 147:195–197
  15. Tang CY, Lu CL, Chang MD, Tsai YT, Sun YJ, Chao KM, Chang JM, Chiou YH, Wu CM, Chang HT, Chou WI (2003) Constrained multiple sequence alignment tool development and its application to RNase family alignment. J Bioinform Comput Biol 1:267–287
  16. Tobi D, Elber R (2000) Distance-dependent, pair potential for protein folding: Results from linear optimization. Prot Struct Funct Bioinf 41:40–46
  17. Vingron M (1996) Near-optimal sequence alignment. Curr Opin Struct Biol 6:346–352

Copyright information

© 2008 Springer-Verlag

Cite this entry

McAllister, S.R., Rajgaria, R., Floudas, C.A. (2008). Global Pairwise Protein Sequence Alignment via Mixed-Integer Linear Optimization. In: Floudas, C., Pardalos, P. (eds) Encyclopedia of Optimization. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-74759-0_250
