A Hybrid Scoring Function for Protein Multiple Alignment

  • Conference paper
Algorithms in Bioinformatics (WABI 2002)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2452)

Abstract

Previous algorithms for motif discovery and protein alignment have used a variety of scoring functions, each specialized to find certain types of similarity in preference to others. Here we present a novel scoring function that combines the relative entropy score with a sensitivity to amino acid similarities, producing a score that is highly sensitive to the weakly conserved patterns typically seen in proteins. We compare the performance of the hybrid score with that of existing scoring functions. We conclude that the hybrid is more sensitive than previous protein scoring functions, both in the initial detection of a weakly conserved region of similarity and, given such a region, in the detection of its weakly conserved instances.
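
For readers unfamiliar with the relative entropy score named in the abstract, the following is a minimal sketch of the standard definition, not the hybrid score itself, whose formula is not given on this page. For each alignment column it sums f(a) * log2(f(a) / p(a)) over the residues a observed in that column, where f is the observed frequency and p is a background frequency. The function names and the uniform background are assumptions made for illustration only.

```python
import math
from collections import Counter

# Assumption for illustration: a uniform background over the 20 standard
# amino acids. A real analysis would normally use empirical frequencies.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
UNIFORM_BACKGROUND = {a: 1 / len(AMINO_ACIDS) for a in AMINO_ACIDS}


def column_relative_entropy(column, background):
    """Relative entropy (in bits) of one ungapped alignment column.

    `column` is a string of residues, one per aligned sequence.
    Residues absent from the column contribute nothing to the sum.
    """
    counts = Counter(column)
    total = len(column)
    score = 0.0
    for residue, count in counts.items():
        freq = count / total
        score += freq * math.log2(freq / background[residue])
    return score


def alignment_relative_entropy(rows, background):
    """Sum of column scores over an ungapped block of equal-length rows."""
    return sum(column_relative_entropy(col, background) for col in zip(*rows))


if __name__ == "__main__":
    # Leucine/isoleucine (chemically similar) and leucine/tryptophan
    # (dissimilar) columns receive the same score.
    print(column_relative_entropy("LI", UNIFORM_BACKGROUND))
    print(column_relative_entropy("LW", UNIFORM_BACKGROUND))
```

Because the score depends only on residue frequencies, a column mixing two chemically similar residues scores exactly the same as one mixing two dissimilar residues. That insensitivity to amino acid similarity is what the abstract says the hybrid score is designed to address.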

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Rocke, E. (2002). A Hybrid Scoring Function for Protein Multiple Alignment. In: Guigó, R., Gusfield, D. (eds) Algorithms in Bioinformatics. WABI 2002. Lecture Notes in Computer Science, vol 2452. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45784-4_19

  • DOI: https://doi.org/10.1007/3-540-45784-4_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-44211-0

  • Online ISBN: 978-3-540-45784-8

  • eBook Packages: Springer Book Archive
