Abstract
This paper describes an approach to multiple alignment that exploits any available secondary structure information. In particular, given the sequences to be aligned, their secondary structure (either available or predicted) is used to perform an initial alignment, which is then refined by means of locally-scoped operators entrusted with “rearranging” the primary level. To evaluate both the performance of the technique and the impact of “true” secondary structure information on the quality of alignments, a suitable algorithm has been implemented and assessed on relevant test cases. Experimental results indicate that the proposed solution is particularly effective when used to align low-similarity protein sequences.
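As a rough illustration of the first step only, the sketch below aligns two sequences on their secondary-structure strings and then projects the resulting gaps onto the primary level. It is not the authors' algorithm: it is pairwise rather than multiple, it uses a plain Needleman-Wunsch recurrence, and the function names, the three-letter H/E/C alphabet, the scoring values, and the toy sequences are all assumptions made for the example. The refinement by locally-scoped operators is not shown.

```python
def align_by_secondary_structure(ss_a, ss_b, match=2, mismatch=-1, gap=-2):
    """Globally align two secondary-structure strings (Needleman-Wunsch).

    Returns a list of (i, j) index pairs; None marks a gap on that side.
    The scoring values are illustrative assumptions, not taken from the paper.
    """
    n, m = len(ss_a), len(ss_b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if ss_a[i - 1] == ss_b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + s,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal path.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            s = match if ss_a[i - 1] == ss_b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + s:
                pairs.append((i - 1, j - 1))
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            pairs.append((i - 1, None))
            i -= 1
        else:
            pairs.append((None, j - 1))
            j -= 1
    pairs.reverse()
    return pairs


def project_onto_primary(pairs, seq_a, seq_b):
    """Carry the structural alignment down to the amino-acid sequences."""
    row_a = "".join(seq_a[i] if i is not None else "-" for i, _ in pairs)
    row_b = "".join(seq_b[j] if j is not None else "-" for _, j in pairs)
    return row_a, row_b


# Toy input (hypothetical): each residue carries one H/E/C structure label.
seq_a, ss_a = "MKVLAAG", "CHHHHCC"
seq_b, ss_b = "MKLAAG", "CHHHCC"

pairs = align_by_secondary_structure(ss_a, ss_b)
row_a, row_b = project_onto_primary(pairs, seq_a, seq_b)
print(row_a)
print(row_b)
```

The projected rows would serve only as a starting point; in the approach described above, the primary level is subsequently rearranged by local refinement operators.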
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Armano, G., Milanesi, L., Orro, A. (2005). Using Secondary Structure Information to Perform Multiple Alignment. In: Priami, C., Merelli, E., Gonzalez, P., Omicini, A. (eds) Transactions on Computational Systems Biology III. Lecture Notes in Computer Science, vol 3737. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11599128_6
DOI: https://doi.org/10.1007/11599128_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-30883-6
Online ISBN: 978-3-540-31446-2
eBook Packages: Computer Science, Computer Science (R0)