Abstract
Cyanobacteria are photosynthetic organisms credited with both the creation and the replenishment of the oxygen-rich atmosphere, and they are responsible for more than half of the primary production on Earth. Despite their crucial evolutionary and environmental roles, the study of these organisms has lagged behind that of other model organisms. This paper presents preliminary results from our ongoing research into the biological interactions occurring within cyanobacteria. We develop an analysis framework that leverages recently developed bioinformatics and machine learning tools, including genome-wide annotation based on sequence matching, gene ontology analysis, cluster analysis, and dynamic Bayesian networks. Together, these tools allow us to compensate for the limited knowledge available about less well-studied organisms and to reveal interesting relationships among their biological processes. Experiments on the Cyanothece bacterium demonstrate the practicality and usefulness of our approach.
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Vinh, N.X., Chetty, M., Coppel, R., Wangikar, P.P. (2011). Dynamic Bayesian Network Modeling of Cyanobacterial Biological Processes via Gene Clustering. In: Lu, BL., Zhang, L., Kwok, J. (eds) Neural Information Processing. ICONIP 2011. Lecture Notes in Computer Science, vol 7062. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24955-6_12
DOI: https://doi.org/10.1007/978-3-642-24955-6_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24954-9
Online ISBN: 978-3-642-24955-6
eBook Packages: Computer Science (R0)