Abstract
In the post-genomics era, recognizing transcription factor binding sites (DNA motifs) is one of the major challenges in understanding gene regulation. This paper proposes KTGA, an improved algorithm for motif discovery in DNA sequences based on Kd-Trees and a genetic algorithm. First, Kd-Trees are used to stratify the input DNA sequences, and the subsequences with the highest Hamming-distance scores are picked from each layer to form the initial population. Then, a genetic algorithm is used to find the true DNA sequence motif. Experiments on synthetic and biological data show that the algorithm can be applied to sequences containing either one motif or multiple motifs, and that it improves the performance of the genetic algorithm in finding DNA motifs.
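The abstract does not define the Hamming-distance score, so the following is only a minimal sketch of one common way such a score could be used to pick seed candidates for an initial population. It is not the authors' KTGA: there is no Kd-Tree stratification and no genetic algorithm here. The function names, the fixed motif length `k`, and the convention that a lower total distance means a better candidate are all assumptions made for illustration.

```python
from typing import List


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def best_match_distance(candidate: str, sequence: str) -> int:
    """Smallest Hamming distance between the candidate and any window of the sequence."""
    k = len(candidate)
    if len(sequence) < k:
        raise ValueError("sequence is shorter than the candidate")
    return min(hamming(candidate, sequence[i:i + k])
               for i in range(len(sequence) - k + 1))


def total_distance(candidate: str, sequences: List[str]) -> int:
    """Sum of best-match distances over all sequences (assumed: lower is better)."""
    return sum(best_match_distance(candidate, s) for s in sequences)


def rank_candidates(sequences: List[str], k: int, top_n: int) -> List[str]:
    """Hypothetical seeding step: rank every length-k window by total distance."""
    candidates = {s[i:i + k] for s in sequences for i in range(len(s) - k + 1)}
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(candidates, key=lambda c: (total_distance(c, sequences), c))[:top_n]


if __name__ == "__main__":
    seqs = ["ACGTACGT", "TTACGTTT", "GGACGAGG"]
    print(rank_candidates(seqs, k=4, top_n=3))
```

In a full pipeline, candidates ranked this way could serve as the starting population that a genetic algorithm then refines through selection, crossover, and mutation; how KTGA actually does this is described in the chapter itself, not here.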
Acknowledgments
This work is supported by the National Natural Science Foundation of China (Nos. 31170797, 61103057), the Program for Changjiang Scholars and Innovative Research Team in University (No. IRT1109), the Program for Liaoning Innovative Research Team in University (No. LT2011018), and the Program for Liaoning Key Lab of Intelligent Information Processing and Network Technology in University.
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhang, Q., Wu, S., Zhou, C., Zheng, X. (2013). DNA Sequence Motif Discovery Based on Kd-Trees and Genetic Algorithm. In: Yin, Z., Pan, L., Fang, X. (eds) Proceedings of The Eighth International Conference on Bio-Inspired Computing: Theories and Applications (BIC-TA), 2013. Advances in Intelligent Systems and Computing, vol 212. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37502-6_98
DOI: https://doi.org/10.1007/978-3-642-37502-6_98
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37501-9
Online ISBN: 978-3-642-37502-6
eBook Packages: Engineering, Engineering (R0)