Abstract
Biological research and development routinely produce terabytes of data that need to be organized, queried, and reduced to useful scientific knowledge. Data integration can address this need, but it is often difficult because the sources are heterogeneous and differ in both semantics and structure. Necessary updates to the structure and content of the source databases pose further challenges for any integration process. We present PseudomonasDW, a new biological data warehouse for Pseudomonas species that integrates annotation and pathway data from highly heterogeneous resources. Combining knowledge from multiple disciplines and sources should advance the understanding of cellular processes and lead to the prediction of cellular behavior as a whole. The key aspect of our approach is a hybrid that combines materialized and virtual data integration to exploit the advantages of both. The data are extracted from the original data sources using SB-KOM (System Biology Khaos Ontology-based Mediator) and then stored locally in the data warehouse to ensure fast performance and data consistency.
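The hybrid idea described in the abstract can be illustrated with a minimal sketch. It is not the SB-KOM API or the PseudomonasDW schema: the `Mediator` class, the two source functions, the `gene` table, and all gene, product, and pathway values are hypothetical placeholders. The mediator plays the virtual role by querying source wrappers on demand, and `materialize` plays the warehouse role by copying the unified view into a local SQLite table.

```python
import sqlite3
from typing import Callable, Dict, Iterable, List

Record = Dict[str, str]


# Hypothetical stand-ins for remote sources. Real annotation or pathway
# databases would be reached through wrappers, not hard-coded records.
def annotation_source() -> Iterable[Record]:
    yield {"gene": "geneA", "product": "hypothetical protein A"}
    yield {"gene": "geneB", "product": "hypothetical protein B"}


def pathway_source() -> Iterable[Record]:
    yield {"gene": "geneA", "pathway": "pathway-1"}
    yield {"gene": "geneB", "pathway": "pathway-2"}


class Mediator:
    """Virtual integration: builds a unified view by calling wrappers on demand."""

    def __init__(self, wrappers: List[Callable[[], Iterable[Record]]]):
        self.wrappers = wrappers

    def integrated_records(self) -> List[Record]:
        merged: Dict[str, Record] = {}
        for wrapper in self.wrappers:
            for record in wrapper():
                # Records from different sources are joined on the gene key.
                entry = merged.setdefault(record["gene"], {"gene": record["gene"]})
                entry.update(record)
        return [merged[gene] for gene in sorted(merged)]


def materialize(mediator: Mediator, conn: sqlite3.Connection) -> int:
    """Materialized integration: store the mediator's view in a local table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS gene "
        "(gene TEXT PRIMARY KEY, product TEXT, pathway TEXT)"
    )
    rows = [
        (r["gene"], r.get("product"), r.get("pathway"))
        for r in mediator.integrated_records()
    ]
    # INSERT OR REPLACE lets a later refresh overwrite stale rows.
    conn.executemany("INSERT OR REPLACE INTO gene VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def lookup(conn: sqlite3.Connection, gene: str):
    """Answer a query from the local copy without contacting the sources."""
    return conn.execute(
        "SELECT product, pathway FROM gene WHERE gene = ?", (gene,)
    ).fetchone()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    loaded = materialize(Mediator([annotation_source, pathway_source]), connection)
    print(loaded, lookup(connection, "geneA"))
```

In this sketch, queries run against the local table, which is where the speed and consistency benefits of materialization would come from, while rerunning `materialize` is one simple way to pick up changes in the sources.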
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Marrakchi, K. et al. (2010). A Data Warehouse Approach to Semantic Integration of Pseudomonas Data. In: Lambrix, P., Kemp, G. (eds) Data Integration in the Life Sciences. DILS 2010. Lecture Notes in Computer Science, vol 6254. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15120-0_8
DOI: https://doi.org/10.1007/978-3-642-15120-0_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15119-4
Online ISBN: 978-3-642-15120-0
eBook Packages: Computer Science, Computer Science (R0)