Geometric Sieving: Automated Distributed Optimization of 3D Motifs for Protein Function Prediction

Chen, Brian Y.; Fofanov, Viacheslav Y.; Bryant, Drew H.; Dodson, Bradley D.; Kristensen, David M.; Lisewski, Andreas M.; Kimmel, Marek; Lichtarge, Olivier; Kavraki, Lydia E.

doi:10.1007/11732990_42

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 3909)

Included in the following conference series:

Annual International Conference on Research in Computational Molecular Biology

Abstract

Determining the function of all proteins is a recurring theme in modern biology and medicine, but the sheer number of proteins makes experimental approaches impractical. For this reason, current efforts use in silico function prediction to guide and accelerate function determination. One approach to predicting protein function is to search functionally uncharacterized protein structures (targets) for substructures (matches) that are geometrically and chemically similar to known active sites (motifs). Finding a match can imply that the target has an active site similar to the motif, suggesting functional homology.
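The idea of a match can be illustrated with a small sketch. It is not the comparison algorithm used in the paper: it assumes a motif is a list of labeled 3D points, that chemical similarity means identical labels, and that geometric similarity means all pairwise distances agree within a tolerance. The function name `is_match`, the residue labels, and the 1.0 Å tolerance are all hypothetical. Comparing pairwise distances is invariant to rotation and translation (and also to reflection), so no superposition step is needed in this toy version.

```python
from itertools import combinations
from math import dist

# A motif or candidate substructure is a list of (label, (x, y, z)) points.
# Labels and the 1.0 angstrom tolerance are illustrative assumptions.

def is_match(motif, candidate, tol=1.0):
    """Return True if candidate agrees with motif point-for-point."""
    if len(motif) != len(candidate):
        return False
    # Chemical check: corresponding points must carry the same label.
    if any(m[0] != c[0] for m, c in zip(motif, candidate)):
        return False
    # Geometric check: every pairwise distance must agree within tol.
    for i, j in combinations(range(len(motif)), 2):
        d_motif = dist(motif[i][1], motif[j][1])
        d_cand = dist(candidate[i][1], candidate[j][1])
        if abs(d_motif - d_cand) > tol:
            return False
    return True

motif = [
    ("SER", (0.0, 0.0, 0.0)),
    ("HIS", (3.0, 0.0, 0.0)),
    ("ASP", (3.0, 4.0, 0.0)),
]
# Same triangle translated by (10, 10, 10): geometry and labels agree.
shifted = [(lab, (x + 10.0, y + 10.0, z + 10.0)) for lab, (x, y, z) in motif]
# Same geometry but a different first label: chemistry disagrees.
relabeled = [("CYS", motif[0][1])] + motif[1:]

print(is_match(motif, shifted))
print(is_match(motif, relabeled))
```

A real search would also have to find the correspondence between motif points and target atoms; here the correspondence is assumed to be given by list order.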

An effective function predictor requires effective motifs: motifs whose geometric and chemical characteristics are detected by comparison algorithms within functionally homologous targets (sensitive motifs) and are not detected within functionally unrelated targets (specific motifs). Designing effective motifs is a difficult open problem. Current approaches select and combine structural, physical, and evolutionary properties to design motifs that mirror functional characteristics of active sites.
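Sensitivity and specificity in this sense can be computed from three sets of target identifiers. The sketch below uses the standard definitions (fraction of homologous targets matched, and fraction of unrelated targets not matched); the function name and the toy identifiers `t1` to `t10` are assumptions made for illustration, not data from the paper.

```python
def sensitivity_specificity(matched, homologs, all_targets):
    """Return (sensitivity, specificity) for one motif.

    matched     -- targets in which the motif was detected
    homologs    -- targets known to be functionally homologous
    all_targets -- every target that was searched
    """
    matched, homologs, all_targets = set(matched), set(homologs), set(all_targets)
    unrelated = all_targets - homologs
    true_pos = len(matched & homologs)    # homologs that were detected
    false_pos = len(matched & unrelated)  # unrelated targets that were detected
    sensitivity = true_pos / len(homologs) if homologs else 0.0
    specificity = (1 - false_pos / len(unrelated)) if unrelated else 0.0
    return sensitivity, specificity

# Toy data: 10 targets, 4 of them homologous, 4 matches of which 1 is spurious.
all_targets = {f"t{i}" for i in range(1, 11)}
homologs = {"t1", "t2", "t3", "t4"}
matched = {"t1", "t2", "t3", "t7"}

sens, spec = sensitivity_specificity(matched, homologs, all_targets)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

With these toy sets the motif finds 3 of 4 homologs and wrongly matches 1 of 6 unrelated targets.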

We present a new approach, Geometric Sieving (GS), which refines candidate motifs into optimized motifs with maximal geometric and chemical dissimilarity from all known protein structures. The paper discusses both the usefulness and the efficiency of GS. We show that candidate motifs from six well-studied proteins, including α-Chymotrypsin, Dihydrofolate Reductase, and Lysozyme, can be optimized with GS into motifs that are among the most sensitive and specific motifs obtainable from those candidates. For the same proteins, we also report results that relate evolutionarily important motifs to motifs that exhibit maximal geometric and chemical dissimilarity from all known protein structures. Our current observations show that GS is a powerful tool that can complement existing work on motif design and protein function prediction.
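The abstract does not say how GS measures dissimilarity or how it searches candidate motifs, so the following is only a toy reading of "maximal dissimilarity", not the paper's procedure. It assumes a candidate motif is a set of 3D points, considers every k-point subset, and scores each subset by how far its sorted pairwise-distance profile is from the closest k-point subset of any background structure. The names `distance_profile`, `dissimilarity`, and `sieve` are hypothetical, chemical labels are ignored, and the brute-force enumeration is only practical for very small inputs.

```python
from itertools import combinations
from math import dist

def distance_profile(points):
    """Sorted list of all pairwise distances among the points."""
    return sorted(dist(p, q) for p, q in combinations(points, 2))

def dissimilarity(sub, background, k):
    """Smallest profile gap between sub and any k-subset of any background structure."""
    profile = distance_profile(sub)
    best = float("inf")
    for structure in background:
        for other in combinations(structure, k):
            gap = max(abs(a - b) for a, b in zip(profile, distance_profile(other)))
            best = min(best, gap)
    return best

def sieve(candidate, background, k):
    """Return the k-point subset of candidate that is least like the background."""
    return max(
        combinations(candidate, k),
        key=lambda sub: dissimilarity(sub, background, k),
    )

# Toy candidate motif with four points; the background holds one small structure.
candidate = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (5.0, 0.0, 0.0)]
background = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]]

best = sieve(candidate, background, 3)
print(best)
```

Each subset is scored independently of the others, so in a sketch like this the scoring loop could be split across worker processes without changing the result.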



Author information

Authors and Affiliations

Department of Computer Science, Rice University, Houston, TX, 77005, USA
Brian Y. Chen, Bradley D. Dodson & Lydia E. Kavraki
Department of Statistics, Rice University
Viacheslav Y. Fofanov & Marek Kimmel
Structural and Computational Biology and Molecular Biophysics, Baylor College of Medicine, Houston, TX, 77005, USA
David M. Kristensen, Olivier Lichtarge & Lydia E. Kavraki
Department of Molecular and Human Genetics, Baylor College of Medicine
David M. Kristensen, Andreas M. Lisewski & Olivier Lichtarge
Department of Bioengineering, Rice University
Drew H. Bryant & Lydia E. Kavraki

Editor information

Editors and Affiliations

Georgia Institute of Technology and Università di Padova
Alberto Apostolico
Topic Chairs, P.O. Box
Concettina Guerra
Center for Molecular Biology and Computer Science Department, Brown University, 115 Waterman St., 02912, Providence, RI, USA
Sorin Istrail
University of California, San Diego, USA
Pavel A. Pevzner
Department of Molecular and Computational Biology, University of Southern California, 1050 Childs Way, 90089-2910, Los Angeles, CA, USA
Michael Waterman

Cite this paper

Chen, B.Y. et al. (2006). Geometric Sieving: Automated Distributed Optimization of 3D Motifs for Protein Function Prediction. In: Apostolico, A., Guerra, C., Istrail, S., Pevzner, P.A., Waterman, M. (eds) Research in Computational Molecular Biology. RECOMB 2006. Lecture Notes in Computer Science, vol 3909. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11732990_42

DOI: https://doi.org/10.1007/11732990_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33295-4
Online ISBN: 978-3-540-33296-1
eBook Packages: Computer Science, Computer Science (R0)
