Abstract
This paper presents a novel genetic algorithm (GA) for multiple sequence alignment in protein analysis. First, and most significantly, the algorithm uses segment profiles both to generate a diversified initial population and to prevent crossover and mutation operations from destroying conserved regions. Because segment profiles carry rich local information, they also speed up convergence. Second, it introduces the norMD function into a genetic algorithm as the measure for scoring multiple alignments. Finally, to counter premature convergence, an improved progressive method is used to optimize the highest-scoring individual of each new generation. The new algorithm is compared with the ClustalX and T-Coffee programs on several test cases from the BAliBASE benchmark alignment database. The experimental results show that it can yield better performance on data sets with long sequences, regardless of sequence similarity.
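The idea of shielding conserved regions from the genetic operators can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the `protected` set of column indices, and the scoring parameters are all assumptions made for illustration, and a toy sum-of-pairs score stands in for norMD, which is not reproduced here.

```python
import random
from itertools import combinations


def sum_of_pairs(alignment, match=1, mismatch=0, gap=-1):
    """Toy column-wise sum-of-pairs score (a stand-in, NOT norMD)."""
    score = 0
    for column in zip(*alignment):
        for a, b in combinations(column, 2):
            if a == "-" and b == "-":
                continue  # gap-gap pairs contribute nothing
            if a == "-" or b == "-":
                score += gap
            elif a == b:
                score += match
            else:
                score += mismatch
    return score


def mutate(alignment, protected, rng):
    """Swap one gap with an adjacent residue outside protected columns.

    `protected` is a hypothetical set of column indices assumed to lie
    inside conserved segments; how those segments are detected is not
    shown here.
    """
    rows = [list(row) for row in alignment]
    candidates = [
        (i, j)
        for i, row in enumerate(rows)
        for j in range(len(row) - 1)
        # exactly one of the two neighbouring cells is a gap
        if (row[j] == "-") != (row[j + 1] == "-")
        and j not in protected
        and j + 1 not in protected
    ]
    if candidates:
        i, j = rng.choice(candidates)
        rows[i][j], rows[i][j + 1] = rows[i][j + 1], rows[i][j]
    return ["".join(row) for row in rows]


# Usage: columns 3 and 4 are treated as a conserved segment.
parent = ["AC-GT", "ACGGT", "A-CGT"]
child = mutate(parent, protected={3, 4}, rng=random.Random(0))
print(child, sum_of_pairs(child))
```

Because a mutation only swaps a gap with a neighbouring residue, the underlying sequences are never altered, and any column listed in `protected` is left exactly as it was in the parent.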
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lv, Y., Li, S., Zhou, C., Guo, W., Xu, Z. (2006). Improved Genetic Algorithm for Multiple Sequence Alignment Using Segment Profiles (GASP). In: Li, X., Zaïane, O.R., Li, Z. (eds) Advanced Data Mining and Applications. ADMA 2006. Lecture Notes in Computer Science, vol 4093. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11811305_43
DOI: https://doi.org/10.1007/11811305_43
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37025-3
Online ISBN: 978-3-540-37026-0
eBook Packages: Computer Science (R0)