
A Data Warehouse Approach to Semantic Integration of Pseudomonas Data

  • Conference paper
Data Integration in the Life Sciences (DILS 2010)

Abstract

Biological research and development routinely produce terabytes of data that need to be organized, queried and reduced to useful scientific knowledge. Although data integration can address this need, it is often difficult because the sources are heterogeneous and differ in both semantics and structure. Moreover, the databases are regularly updated in both structure and content, which poses further challenges for any integration process. We present a new biological data warehouse for Pseudomonas species, “PseudomonasDW”, which integrates annotation and pathway data from highly heterogeneous resources. Combining knowledge from multiple disciplines and sources should advance the understanding of cellular processes and lead to the prediction of cellular behavior in its entirety. The key aspect of our approach is the combination of materialized and virtual data integration in a new hybrid approach that exploits the advantages of both. The data are extracted from the original data sources using SB-KOM (System Biology Khaos Ontology-based Mediator) and then stored locally in the data warehouse to ensure fast performance and data consistency.
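The hybrid idea described in the abstract (a mediator pulls records from the original sources, and the results are then materialized in a local store that answers queries) can be illustrated with a minimal sketch. This is not SB-KOM or the PseudomonasDW schema: the source functions, the `fact` table, the identifiers and the sample values are all hypothetical placeholders chosen for illustration, and SQLite stands in for the warehouse.

```python
import sqlite3
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical record shape: (gene_id, attribute, value).
Record = Tuple[str, str, str]


def annotation_source() -> List[Record]:
    """Stand-in for a remote annotation database (made-up data)."""
    return [("gene_A", "product", "example protein")]


def pathway_source() -> List[Record]:
    """Stand-in for a remote pathway database (made-up data)."""
    return [("gene_A", "pathway", "example pathway"),
            ("gene_B", "pathway", "another pathway")]


def mediator_extract(
    sources: Dict[str, Callable[[], Iterable[Record]]]
) -> List[Tuple[str, str, str, str]]:
    """Virtual step: query every source on demand and tag each record
    with the name of the source it came from."""
    return [(name, *rec) for name, fetch in sources.items() for rec in fetch()]


def load_warehouse(conn: sqlite3.Connection,
                   rows: List[Tuple[str, str, str, str]]) -> None:
    """Materialized step: store the extracted records locally.
    The primary key makes reloading idempotent, so a refresh after a
    source update overwrites rows instead of duplicating them."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact ("
        "source TEXT, gene_id TEXT, attribute TEXT, value TEXT, "
        "PRIMARY KEY (source, gene_id, attribute))"
    )
    conn.executemany("INSERT OR REPLACE INTO fact VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def query_gene(conn: sqlite3.Connection,
               gene_id: str) -> List[Tuple[str, str, str]]:
    """Answer a query from the local copy only; no source is contacted."""
    cur = conn.execute(
        "SELECT source, attribute, value FROM fact "
        "WHERE gene_id = ? ORDER BY source, attribute",
        (gene_id,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    sources = {"annotation": annotation_source, "pathway": pathway_source}
    conn = sqlite3.connect(":memory:")
    load_warehouse(conn, mediator_extract(sources))
    for row in query_gene(conn, "gene_A"):
        print(row)
```

In this sketch the mediator call is the only place that touches the sources, so queries stay fast and see a consistent snapshot, while rerunning the extract-and-load step picks up source updates.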




Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Marrakchi, K. et al. (2010). A Data Warehouse Approach to Semantic Integration of Pseudomonas Data. In: Lambrix, P., Kemp, G. (eds) Data Integration in the Life Sciences. DILS 2010. Lecture Notes in Computer Science, vol 6254. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15120-0_8


  • DOI: https://doi.org/10.1007/978-3-642-15120-0_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15119-4

  • Online ISBN: 978-3-642-15120-0

  • eBook Packages: Computer Science, Computer Science (R0)
