Data Warehouse Based Approach to the Integration of Semi-structured Data

  • Conference paper
Advances in Web and Network Technologies, and Information Management (APWeb 2009, WAIM 2009)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5731)

Abstract

Semi-structured data play an increasing role in the development of the web through the use of XML. However, managing such data poses specific problems because, unlike data in classical databases, they do not rely on a predefined schema. The schema of a document is contained in the document itself, and similar documents may be represented by different schemas. Consequently, the techniques and algorithms used for querying or integrating these data are more complex than those used for structured data. In this article we propose the architecture of a data warehouse designed for the integration of semi-structured data, so as to enable searches across data repositories of various origins and structures. This architecture relies on the Osiris system, a description logic (DL)-based model designed for the representation and management of databases and knowledge bases. In particular, we are interested in the Osiris data model, which provides several points of view on a family of objects. In addition, the Osiris indexing system supports semantic query optimization. We show that query processing on an XML source is optimized by the object indexing approach proposed by Osiris.
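The following is a minimal illustrative sketch, not the Osiris implementation. It shows the general idea the abstract describes: two XML documents that describe similar objects with different structures are mapped to a common representation, each object is classified into the views it satisfies, and a query on a view reads only the indexed objects. The document layouts, the extractor functions, and the view names (`Adult`, `Minor`) are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Two hypothetical sources: similar objects, different document structures.
SOURCE_A = "<person><name>Ada</name><age>36</age></person>"
SOURCE_B = '<employee fullname="Bob" years="17"/>'


def extract_a(xml_text):
    """Map a source-A document (child elements) to the common representation."""
    root = ET.fromstring(xml_text)
    return {"name": root.findtext("name"), "age": int(root.findtext("age"))}


def extract_b(xml_text):
    """Map a source-B document (attributes) to the common representation."""
    root = ET.fromstring(xml_text)
    return {"name": root.get("fullname"), "age": int(root.get("years"))}


# Hypothetical views: each view is a predicate over the common representation.
VIEWS = {
    "Minor": lambda p: p["age"] < 18,
    "Adult": lambda p: p["age"] >= 18,
}


def build_index(objects):
    """Classify each object once and record it under every view it satisfies."""
    index = defaultdict(list)
    for obj in objects:
        for view_name, predicate in VIEWS.items():
            if predicate(obj):
                index[view_name].append(obj)
    return index


objects = [extract_a(SOURCE_A), extract_b(SOURCE_B)]
index = build_index(objects)

# A query on a view reads the index instead of re-scanning every source.
adult_names = [p["name"] for p in index["Adult"]]
minor_names = [p["name"] for p in index["Minor"]]
```

In this sketch the classification cost is paid once, when objects enter the warehouse, so later queries phrased in terms of views avoid re-evaluating predicates against every document.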

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ahmad, H., Kermanshahani, S., Simonet, A., Simonet, M. (2009). Data Warehouse Based Approach to the Integration of Semi-structured Data. In: Chen, L., et al. Advances in Web and Network Technologies, and Information Management. APWeb 2009, WAIM 2009. Lecture Notes in Computer Science, vol 5731. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03996-6_9

  • DOI: https://doi.org/10.1007/978-3-642-03996-6_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-03995-9

  • Online ISBN: 978-3-642-03996-6

  • eBook Packages: Computer Science, Computer Science (R0)
