Brief communication
Hidden symmetries in the primary sequences of beta-barrel family
Introduction
Many protein domains are combinations of recurring substructures (Edward and Hwang, 2004). However, these domains usually have almost random primary sequences (Stephen et al., 2001, Taylor et al., 2002). How and why do proteins exhibit obvious symmetry at the level of tertiary structure, yet seldom show symmetry in their primary sequences? Much progress has been made on this problem (Zbilut et al., 2002, Heger and Holm, 2000). In a previous paper, we revealed hidden three-fold repetitions in the sequences of proteins from the beta-trefoil family by taking account of the physicochemical properties of amino acids (Xu and Xiao, 2005). In this paper we extend our method to the repetition analysis of both sequence and structure, taking the beta-barrel folds as an example.
The beta-barrel domains (Fig. 1) are built of beta strands whose number can vary from 4 or 5 to over 10. These beta strands form two beta sheets that have the usual twist, and the two twisted sheets form a barrel-like structure when they pack against each other. According to topology, this family can be divided into up-and-down beta-barrels and jelly-roll beta-barrels. The up-and-down beta-barrel is formed by an array of beta strands arranged in an antiparallel manner, with each strand hydrogen-bonded to neighboring strands that are nearly always adjacent in the amino acid sequence (LaLonde et al., 1994). The jelly-roll beta-barrel is usually formed by two Greek keys, and these proteins usually form two sheets with few, if any, hydrogen bonds between strands belonging to different sheets. Direct observation of the tertiary structures readily shows that all of the proteins in this family have two-fold quasi-symmetric structures. We confirm this below with a quantitative method. We also try to find the hidden repetitions in the primary sequences that correspond to the tertiary structures.
Methods
We investigate the two-fold symmetries of the primary sequences and tertiary structures by using a modified recurrence quantification analysis. Recurrence quantification analysis is a QSAR-related equivalent of a known sequence analysis tool that was originally called “distance chart analysis” (Konopka, 1994, Konopka, 1997, Konopka, 2003, Wootton, 1997, Konopka and Smythers, 1987, Konopka and Chatterjee, 1988).
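As a rough illustration of the general idea, the sketch below builds an ordinary (unmodified) recurrence plot from a sequence encoded by one physicochemical property. It is not the modified method used in this paper, whose details are not reproduced here. The Kyte-Doolittle hydropathy scale, the window length, the radius, and the function names `recurrence_matrix` and `diagonal_density` are all assumptions chosen for illustration.

```python
import numpy as np

# Kyte-Doolittle hydropathy values (an illustrative property choice,
# not necessarily the property set used by the authors).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def recurrence_matrix(seq, window=4, radius=1.0):
    """Return a 0/1 matrix marking pairs of windows that are close in property space."""
    values = np.array([KD[a] for a in seq], dtype=float)
    n = len(values) - window + 1
    if n < 1:
        raise ValueError("sequence is shorter than the window")
    # Each row is one window of consecutive property values (an embedding vector).
    emb = np.stack([values[i:i + window] for i in range(n)])
    diff = emb[:, None, :] - emb[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    return (dist <= radius).astype(int)


def diagonal_density(rec):
    """Fraction of recurrent points on each diagonal; index k is the sequence offset."""
    n = rec.shape[0]
    return np.array([np.diagonal(rec, offset=k).mean() for k in range(n)])


if __name__ == "__main__":
    # Hypothetical toy sequence made of two identical halves of 10 residues.
    toy = "ACDEFGHIKL" * 2
    density = diagonal_density(recurrence_matrix(toy))
    print(density[10])
```

In such a plot, a two-fold repetition appears as a densely populated diagonal at an offset of about half the domain length. Real repeats are far less exact than the toy sequence, so the choice of window and radius matters.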
The detail of the modified recurrence plot can be found in the
Results and discussions
We take ATPase (PDB ID: 1E32) (Fig. 1) as an example to show the hidden symmetry of its primary sequence by using the modified recurrence plot. Fig. 1c is the modified structure recurrence plot of 1E32, and it clearly shows that the tertiary structure has a pseudo two-fold axis of symmetry. The three-dimensional structures of subsequences 1–45 and 46–89 are similar to each other, with a dRMSD of 2.0898 (Fig. 2). The secondary structures of the two parts are also very similar (Fig. 2
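For readers unfamiliar with the dRMSD, the following is a minimal sketch of the usual distance-based definition: the root-mean-square difference between corresponding intra-segment pairwise distances. It assumes two coordinate arrays of equal length (for example, C-alpha positions). How residues of segments 1–45 and 46–89, which differ in length, were paired is not stated in this excerpt, so the function name `drmsd` and the equal-length requirement are assumptions of this sketch.

```python
import numpy as np


def pairwise_distances(coords):
    """All-against-all Euclidean distances within one set of 3D points."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))


def drmsd(coords_a, coords_b):
    """Distance RMSD between two equally sized point sets (no superposition needed)."""
    a = np.asarray(coords_a, dtype=float)
    b = np.asarray(coords_b, dtype=float)
    if a.shape != b.shape:
        # Assumption of this sketch: the caller has already paired the residues.
        raise ValueError("both segments must contain the same number of points")
    if len(a) < 2:
        raise ValueError("at least two points are required")
    da = pairwise_distances(a)
    db = pairwise_distances(b)
    # Compare each unordered pair (i < j) once.
    iu = np.triu_indices(len(a), k=1)
    return float(np.sqrt(np.mean((da[iu] - db[iu]) ** 2)))


if __name__ == "__main__":
    # Hypothetical coordinates: a line of three points and the same line stretched by two.
    a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    b = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
    print(drmsd(a, b))
```

Because it compares internal distances only, this measure is unchanged by rotating or translating either segment, which is convenient when comparing two halves of the same domain.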
Acknowledgements
This work is supported by the NSFC under Grant Nos. 30525037 and 30470412 and the Foundation of the Ministry of Education of China.
References (14)
Sequences and codes: fundamentals of biomolecular cryptology
et al. Distance analysis and sequence properties of functional domains in nucleic acids and proteins. Gene Anal. Technol. (1988)
et al. A common sequence-associated physicochemical feature for proteins of beta-trefoil family. Comput. Biol. Chem. (2005)
et al. Rapid automatic detection and alignment of repeats in protein sequences. Proteins Struct. Funct. Genet. (2000)
et al. Alternative alignments from comparison of protein structures. Proteins Struct. Funct. Bioinform. (2004)
Sequence complexity and composition
Theoretical molecular biology