Abstract
Semi-structured data play an increasing role in the development of the web through the use of XML. However, managing semi-structured data poses specific problems because, unlike data in classical databases, they do not rely on a predefined schema. The schema of a document is contained in the document itself, and similar documents may be represented by different schemas. Consequently, the techniques and algorithms used for querying or integrating such data are more complex than those used for structured data. In this article we propose the architecture of a data warehouse designed for the integration of semi-structured data, so as to enable searches across data repositories of various origins and structures. This architecture relies on the Osiris system, a model based on description logics (DL) and designed for the representation and management of databases and knowledge bases. In particular, we are interested in the Osiris data model, which provides several points of view on a family of objects. In addition, the indexing system of Osiris supports semantic query optimization. We show that query processing on an XML source is optimized by the object indexing approach proposed by Osiris.
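To make concrete the observation that similar documents may be represented by different schemas, here is a minimal Python sketch. It is an illustration only and does not reproduce the Osiris model or its indexing system. The two sample documents and the helper names (`structure`, `person_name`, `person_city`) are hypothetical and do not come from the paper.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents describing the same kind of entity
# with different structures (assumed sample data, not from the paper).
DOC_A = "<person><name>Ada Lovelace</name><city>London</city></person>"
DOC_B = '<person name="Alan Turing"><address><city>Wilmslow</city></address></person>'


def structure(xml_text):
    """Return the set of element and attribute paths found in a document.

    This set plays the role of the schema that the document carries itself.
    """
    root = ET.fromstring(xml_text)
    paths = set()

    def walk(node, prefix):
        path = f"{prefix}/{node.tag}"
        paths.add(path)
        for attr in node.attrib:
            paths.add(f"{path}/@{attr}")
        for child in node:
            walk(child, path)

    walk(root, "")
    return paths


def person_name(xml_text):
    """Read the name whether it is stored as an attribute or a child element."""
    root = ET.fromstring(xml_text)
    name = root.get("name")
    if name is None:
        name = root.findtext("name")
    return name


def person_city(xml_text):
    """Read the city wherever it is nested below the root element."""
    root = ET.fromstring(xml_text)
    return root.findtext(".//city")


if __name__ == "__main__":
    for doc in (DOC_A, DOC_B):
        print(sorted(structure(doc)), person_name(doc), person_city(doc))
```

Each accessor has to know every structural variant of the source documents, which is the kind of per-source complexity that an integration layer aims to hide behind a single view.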
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ahmad, H., Kermanshahani, S., Simonet, A., Simonet, M. (2009). Data Warehouse Based Approach to the Integration of Semi-structured Data. In: Chen, L., et al. Advances in Web and Network Technologies, and Information Management. APWeb/WAIM 2009. Lecture Notes in Computer Science, vol 5731. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03996-6_9
DOI: https://doi.org/10.1007/978-3-642-03996-6_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03995-9
Online ISBN: 978-3-642-03996-6
eBook Packages: Computer Science, Computer Science (R0)