A method of comparing protein molecular surface based on normal vectors with attributes and its application to function identification
Introduction
Recently, it has become clear that many protein functions, such as immunity and catalysis, are related not only to the protein's structure but also to the shape and physical properties of its molecular surface [1]. Against this background, several studies have compared molecular surfaces based on the positions of the surface atoms [2]. These studies compare molecular surfaces using only the molecular shape or the atomic coordinates. However, the physical properties are as important as the shape when proteins are compared in terms of function.
In this paper, we propose a method of comparing 3D molecular surface data of proteins that carry information on both the shape and physical properties such as electrostatic potential and hydrophobicity. The molecular surface data consist of a set of thousands of vertices. Because the number of vertices is very large, comparing protein data directly requires a very long calculation time. Therefore, this paper presents an efficient matching method that uses normal vectors placed on projections and depressions, each carrying the attributes of curvature, electrostatic potential and hydrophobicity. The normal vectors are created by calculating the mean curvature and the Gaussian curvature [4]. In the matching process, computational complexity is reduced by restricting the candidates to pairs of vectors whose relative positions and attributes on the surface are similar. Further, the matched position of the two surfaces is refined by a local search method so that the comparison is made closer to the optimal alignment.
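The restriction to vector pairs with similar relative positions and attributes can be illustrated with a minimal sketch. The class `SurfaceVector`, the function `pairs_compatible` and the tolerance dictionary `tol` are hypothetical names introduced here for illustration, and the specific invariants (distance between base points, angle between normals) are an assumption about what "relative position" means, not the authors' published definition.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SurfaceVector:
    """Hypothetical record for one normal vector with attributes."""
    position: tuple        # (x, y, z) base point on the surface
    normal: tuple          # unit normal (x, y, z)
    curvature: float
    potential: float       # electrostatic potential
    hydrophobicity: float


def angle_between(u, v):
    """Angle in radians between two unit vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    return math.acos(max(-1.0, min(1.0, dot)))


def attributes_similar(a, b, tol):
    """Compare the three attributes against per-attribute tolerances."""
    return (abs(a.curvature - b.curvature) <= tol["curvature"]
            and abs(a.potential - b.potential) <= tol["potential"]
            and abs(a.hydrophobicity - b.hydrophobicity) <= tol["hydrophobicity"])


def pairs_compatible(a1, a2, b1, b2, tol):
    """True if the pair (a1, a2) could plausibly be mapped onto (b1, b2).

    Assumed invariants: distance between base points and angle between
    normals, both unchanged by rotation and translation.
    """
    dist_a = math.dist(a1.position, a2.position)
    dist_b = math.dist(b1.position, b2.position)
    if abs(dist_a - dist_b) > tol["distance"]:
        return False
    ang_a = angle_between(a1.normal, a2.normal)
    ang_b = angle_between(b1.normal, b2.normal)
    if abs(ang_a - ang_b) > tol["angle"]:
        return False
    return attributes_similar(a1, b1, tol) and attributes_similar(a2, b2, tol)
```

Only pairs that pass such a test would need to be aligned and scored, which is where the reduction in work comes from.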
Section snippets
Molecular surface data
A protein is a peptide chain consisting of amino acids. Some amino acid residues are exposed on the molecular surface, while the others are buried beneath it. Since many functions are strongly related to the interaction of the exposed surface with the outside environment, the protein molecular surface data are obtained by calculating the region that can be reached by an external molecule that binds to or moves dynamically over the protein surface [6]. Examples of the
Approach
In order to align two molecular surfaces, three vertices on one molecular surface must be matched to three vertices on the other surface. The number of such combinations is O(n^6), where n is the number of vertices. It is impractical to consider all combinations because the number of vertices is about 10,000 or more. If two normal vectors are used instead of three vertices, the number of combinations is reduced to O(n^4), and this shows the efficiency of
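As a rough worked example of these growth rates, the following sketch evaluates both counts for n = 10,000. The function name `correspondence_counts` is hypothetical, constant factors are ignored, and the same n is used for vertices and vectors, following the notation in the text.

```python
def correspondence_counts(n):
    """Order-of-magnitude counts of candidate correspondences.

    Three items chosen on each surface give n^3 * n^3 candidates;
    two items chosen on each surface give n^2 * n^2 candidates.
    """
    triples = n ** 3 * n ** 3
    pairs = n ** 2 * n ** 2
    return triples, pairs


triples, pairs = correspondence_counts(10_000)
print(f"three vertices: {triples:.0e}, two vectors: {pairs:.0e}")
print(f"reduction factor: {triples // pairs:.0e}")
```

Even the smaller count is too large to enumerate directly, which motivates further pruning of the candidate pairs.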
Evaluation
The proposed method was applied to the antibody protein in the eF-site database, and the calculation time and the accuracy of the matching position were evaluated. The accuracy of the protein identification was also evaluated. The method was implemented on an AlphaServer DS20 (CPU: 2×Alpha21264 500 MHz, memory: 1.5 GB, COMPAQ) and all evaluation data were obtained in this environment.
Conclusion
This paper proposed a method of comparing molecular surface data based on normal vectors with attributes. In this method, efficient matching is realized by the bucket method and a threshold. The proposed method was applied to surface data for 12 proteins. As a result, the mean calculation time is about 3 min, whereas a method that considers all combinations of pairs of vectors takes 50 days, and it is possible to match the two molecular surfaces at the optimal location
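The general idea of a bucket method can be sketched as follows. The text above does not state which quantity is bucketed, so this example assumes, purely for illustration, that pairs of vectors are indexed by the quantized distance between their base points. The function names and the `width` parameter are hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations


def build_buckets(points, width):
    """Index every pair of points by its quantized distance."""
    buckets = defaultdict(list)
    for i, j in combinations(range(len(points)), 2):
        key = int(math.dist(points[i], points[j]) // width)
        buckets[key].append((i, j))
    return buckets


def candidate_matches(points_a, points_b, width):
    """Yield (pair on A, pair on B) whose distances fall in nearby buckets.

    Neighbouring buckets are also checked so that two distances lying
    just either side of a bucket boundary are not missed.
    """
    buckets_b = build_buckets(points_b, width)
    for i, j in combinations(range(len(points_a)), 2):
        key = int(math.dist(points_a[i], points_a[j]) // width)
        for k in (key - 1, key, key + 1):
            for pair_b in buckets_b.get(k, ()):
                yield (i, j), pair_b
```

With such an index, each pair on one surface is compared only with the pairs on the other surface that share a nearby bucket, rather than with every pair.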
Acknowledgements
The authors thank Prof. Norihisa Komoda and Dr. Kengo Kinoshita, who offered useful discussions related to this research. Part of this research is supported by the Japan Science and Technology Corporation and by a Grant-in-Aid for Scientific Research from the Ministry of Education, Culture, Sports, Science and Technology.
References (6)
- et al., LIGAND: chemical database for enzyme reactions, Bioinformatics (1998)
- et al., Comparison of protein surfaces using a genetic algorithm, J. Computer-Aided Mol. Design (1997)
- et al., Identification of protein functions from a molecular surface database, eF-site, J. Struct. Funct. Genom. (2001)