Abstract
Vast amounts of data in various forms have accumulated over many years of functional genomics research worldwide. Discovering and disseminating the knowledge hidden in these data remains a challenge, and many computational methods have been developed to address it. Taking microarray data analysis as an example, we have spent the past decade developing data mining strategies and software tools, yet these still appear insufficient to cover all sources of data. In this paper, we summarize our experience in mining microarray data, using two plant species, Brassica napus and Arabidopsis thaliana, as examples. We present several success stories and a few lessons learned. The domain problems we addressed were transcriptional regulation during seed development and during the defense response against pathogen infection.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Pan, Y. et al. (2010). Integrative Data Mining in Functional Genomics of Brassica napus and Arabidopsis thaliana. In: García-Pedrajas, N., Herrera, F., Fyfe, C., Benítez, J.M., Ali, M. (eds) Trends in Applied Intelligent Systems. IEA/AIE 2010. Lecture Notes in Computer Science, vol 6098. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13033-5_10
DOI: https://doi.org/10.1007/978-3-642-13033-5_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13032-8
Online ISBN: 978-3-642-13033-5
eBook Packages: Computer Science (R0)