Analysis and prediction of loop segments in protein structures
Introduction
Loops, the segments that connect elements of secondary structure in the protein fold, are often exposed surface features of the protein structure. As a result, loops can be important in defining differences in binding and activity within a fold family, because functional variability is often related to structural differences in the exposed regions.
Exploring the conformational space of a loop segment is difficult given the large structural variability often observed in the loop regions of experimentally determined protein structures. For example, it is not unusual for loop fragments with an identical seven- or nine-residue sequence to exhibit highly dissimilar structures. These difficulties are compounded by the typically low sequence identities among loop segments, which often make comparative modeling techniques inaccurate. As a result, the prediction of loop conformations is treated in a manner similar to generic protein structure prediction. Two types of approaches are typically pursued: those based on the optimization of energy functions (Fiser et al., 2000; Rapp and Friesner, 1999; Xiang et al., 2002), and those directed by the statistical analysis of loop conformations (Donate et al., 1996; Tramontano and Lesk, 1992; Greer, 1980).
Optimization-based methods attempt to treat the loop prediction problem in a general manner. Loop segments can be described through a variety of all-atom, unified-atom or continuum-based representations. If the loop stems are fixed a priori, a number of algorithms can be used to generate feasible loop conformations (Shenkin et al., 1987; Wedemeyer and Scheraga, 1999). A free energy function is used to model the interactions within the loop and those between the loop and its environment. Loop prediction then requires minimizing the free energy to identify the most stable conformation of the loop. The main difficulty is the need for force fields accurate enough to model the loop segment correctly.
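As a minimal sketch of the optimization-based idea, the following Python example searches over the dihedral angles of a short loop and keeps the lowest-energy conformation found. The function `toy_energy` is a hypothetical stand-in (a sum of cosine torsional terms), not a physical force field, and the multi-start random search with coordinate-wise refinement is only an illustrative optimizer, not the method used in this work. All names and parameters are assumptions made for the example.

```python
import math
import random


def toy_energy(angles):
    """Hypothetical torsional energy in arbitrary units (NOT a real force field).

    Each dihedral angle a (radians) contributes 1 + cos(3a), which reaches
    its minimum value of 0 at a = +/-60 and 180 degrees.
    """
    return sum(1.0 + math.cos(3.0 * a) for a in angles)


def local_refine(angles, energy, step=0.2, min_step=1e-4):
    """Coordinate-wise descent: try +/- step on each angle, halve step when stuck."""
    angles = list(angles)
    best = energy(angles)
    while step > min_step:
        improved = False
        for i in range(len(angles)):
            for delta in (step, -step):
                trial = list(angles)
                trial[i] += delta
                e = energy(trial)
                if e < best:
                    angles, best, improved = trial, e, True
        if not improved:
            step *= 0.5
    return angles, best


def minimize_loop(n_angles, energy, n_starts=20, seed=0):
    """Multi-start search: refine several random starts, keep the best one."""
    rng = random.Random(seed)
    best_angles, best_e = None, float("inf")
    for _ in range(n_starts):
        start = [rng.uniform(-math.pi, math.pi) for _ in range(n_angles)]
        angles, e = local_refine(start, energy)
        if e < best_e:
            best_angles, best_e = angles, e
    return best_angles, best_e


# Six dihedral angles, e.g. phi/psi for a three-residue loop (illustrative only).
angles, e_min = minimize_loop(n_angles=6, energy=toy_energy)
```

With a realistic force field the energy surface has many non-equivalent local minima, which is why the quality of both the energy model and the search strategy matters far more than this toy example suggests.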
Statistical methods rely on the identification of database derived loop segments to fit the flanking residue units on either side of a loop. A number of potential structural segments are first identified and then further discriminated according to geometric or energetic criteria. Structural refinement is utilized to rank the set of potential segments. Statistical methods can be accurate when addressing a specific class of loops, or for loops that are well represented among a suite of homologous sequences. However, these approaches suffer from their database dependence and their limited sampling when compared to the exponential growth in the number of allowable conformations as the segment length grows.
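The database-driven selection step can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the fragment library, the coordinates and the single geometric criterion (end-to-end distance compared with the stem separation) are all hypothetical, whereas published methods typically superimpose several flanking residues and apply further energetic filters.

```python
import math


def select_fragments(fragments, stem_n, stem_c, tol=1.0):
    """Rank candidate fragments by how well they span the gap between the stems.

    fragments: dict mapping a fragment name to a list of (x, y, z) positions,
               e.g. C-alpha coordinates in angstroms (hypothetical data).
    stem_n, stem_c: positions of the two flanking stem residues.
    tol: maximum allowed deviation (angstroms) between the fragment's
         end-to-end distance and the stem separation.
    """
    target = math.dist(stem_n, stem_c)  # math.dist requires Python 3.8+
    hits = []
    for name, coords in fragments.items():
        span = math.dist(coords[0], coords[-1])
        deviation = abs(span - target)
        if deviation <= tol:
            hits.append((deviation, name))
    # Smallest deviation first.
    return [name for deviation, name in sorted(hits)]


# Hypothetical fragment library with made-up coordinates.
library = {
    "frag_a": [(0, 0, 0), (3, 2, 0), (6, 2, 0), (9.5, 0, 0)],
    "frag_b": [(0, 0, 0), (3, 3, 0), (5, 3, 0), (6, 0, 0)],
    "frag_c": [(1, 1, 1), (4, 3, 1), (8, 3, 1), (11.2, 1, 1)],
}

ranked = select_fragments(library, stem_n=(0, 0, 0), stem_c=(10, 0, 0))
```

Here `frag_b` is rejected because it spans only 6 Å of the 10 Å gap, which also illustrates the database dependence noted above: a loop can only be predicted well if a suitable fragment exists in the library.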
In this work, two novel approaches are introduced to explore the conformations of loop segments. The goal of the two approaches is to aid in the successful ab initio structure prediction of proteins using the ASTRO-FOLD methodology (Klepeis and Floudas, 2003a, 2003b). Since both methods are designed for use in a truly ab initio framework for structure prediction, only minimal information regarding the structure of the residues that flank the loop segment is known. Most importantly, an inherent assumption common to many existing loop models, namely the requirement of fixing the orientation and distance between the flanking loop stem residues, is not imposed. Both methods directly utilize the optimization of energy functions.
In what follows, a brief introduction to the ASTRO-FOLD methodology is given first, in order to describe the conditions under which the loop prediction approaches are designed to operate. This is followed by a detailed description of the modeling and optimization components of each approach. Finally, loop prediction results are presented for a set of benchmark proteins, as well as for a number of proteins from the recent CASP5 experiment.
Modeling and computational methodology
Before providing the details of the loop prediction approaches, it is instructive to understand the context in which these methods are used. Although the loop prediction methods can be employed independently, they were developed specifically for use within the ASTRO-FOLD ab initio structure prediction approach (Klepeis & Floudas, 2003a, 2003b). A schematic illustration of the ASTRO-FOLD methodology is given in Fig. 1. ASTRO-FOLD is a four stage approach that
Results and discussion
The loop prediction approaches have been applied within the context of the ASTRO-FOLD approach to a number of test systems. Initial tests included a number of benchmark case studies for protein structure prediction (Klepeis & Floudas, 2003a). More recently, a set of results has been compiled based on blind predictions for a number of protein systems as part of the CASP5 experiment (Klepeis & Floudas, 2003b).
Conclusions
The presented loop prediction approaches play an important role in restraining and focusing the conformational searches used in treating the overall three-dimensional structure prediction problem. In particular, these restraints take the form of reduced φ and ψ domains as well as internal interatomic distance restraints for those residues connecting consecutive elements of secondary structure. The bounds are extracted from the set of low free energy conformers identified from conformational
Acknowledgments
CAF gratefully acknowledges financial support from the National Science Foundation and the National Institutes of Health (R01 GM52032).
References (32)
- et al. (2001). Solution structure, backbone dynamics and chitin binding properties of the anti-fungal protein from Streptomyces tendae Tü901. Journal of Molecular Biology.
- et al. (1999). Two results on bounding the roots of interval polynomials. Computers and Chemical Engineering.
- et al. (1993). A fast recursive algorithm for molecular dynamics simulation. Journal of Computational Physics.
- et al. (1998). Predicting solvated peptide conformations via global minimization of energetic atom-to-atom interactions. Computers and Chemical Engineering.
- et al. (2003). Hybrid global optimization algorithms for protein structure prediction: alternating hybrids. Biophysical Journal.
- et al. (2003). A new class of hybrid global optimization algorithms for peptide structure prediction: integrated hybrids. Computer Physics Communications.
- et al. (1984). Structure of bovine trypsin inhibitor. Results of joint neutron and X-ray refinement of crystal form II. Journal of Molecular Biology.
- et al. (1996). Rigorous convex underestimators for general twice-differentiable problems. Journal of Global Optimization.
- Adjiman, C. S., Androulakis, I. P., & Floudas, C. A. (1998). A global optimization method, αBB, for general...
- Adjiman, C. S., Dallwig, S., Floudas, C. A., & Neumaier, A. (1998). A global optimization method, αBB, for general...