Abstract
Analysis of regulatory elements (DNA motifs) in non-coding regions is considered a crucial step toward understanding the regulation mechanisms of genes with similar expression patterns. With the help of accumulated gene expression data and complete genome sequences, computational approaches have been developed over the past decade to accelerate this mining task. In previous studies, we proposed a DNA motif discovery framework named MODEC, which combines an evolutionary computation (EC) search algorithm with data filtering techniques to improve search performance. To explore real-world motif mining problems, we apply both MODEC and the widely used discovery algorithm MEME to predict regulatory elements in different non-coding regions of co-expressed genes from the model plant Arabidopsis thaliana. Results from both MODEC and MEME show that the targeted motif patterns can be found in the expected non-coding regions of the co-expressed gene groups. As a preliminary step of this work, we investigate whether different motif patterns can be detected in the specified non-coding regions of co-expressed genes from different functional categories. The similar prediction results from MODEC and MEME demonstrate the potential of MODEC for practical motif discovery.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, X., Wang, D. (2010). Mining Regulatory Elements in Non-coding Regions of Arabidopsis thaliana. In: Chan, J.H., Ong, YS., Cho, SB. (eds) Computational Systems-Biology and Bioinformatics. CSBio 2010. Communications in Computer and Information Science, vol 115. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16750-8_9
DOI: https://doi.org/10.1007/978-3-642-16750-8_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16749-2
Online ISBN: 978-3-642-16750-8
eBook Packages: Computer Science, Computer Science (R0)