Abstract
SINGAPORE (SINGle Access POint for heterogeneous data REpositories) is a system for querying heterogeneous data. One of its distinguishing features is that new sources may be registered at runtime. For this reason it does not rely on a predefined, globally integrated schema; instead, users integrate data from the underlying sources as part of their queries. Since formulating such queries can be demanding, the system also accepts fuzzy queries, which are easier to write but may produce less exact results. Input queries therefore need special treatment, called query preprocessing, which generates the complex target queries that actually return the results for the initial user queries. In this paper we discuss the importance of query preprocessing in our system, present heuristics for implementing it, and show how techniques from database management systems and information retrieval can be combined in the process of query transformation.
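To make the idea of query preprocessing concrete, the following is a minimal sketch of how a fuzzy query might be expanded into per-source target queries. It is not the SINGAPORE implementation: the `Source` class, the `normalize` and `preprocess` functions, the `CONTAINS` condition syntax, and the example sources are all hypothetical, and the plural-stripping step is only a crude stand-in for information-retrieval techniques such as stemming.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """Hypothetical description of a source registered at runtime."""
    name: str
    attributes: tuple[str, ...]


def normalize(term: str) -> str:
    """Lowercase a term and strip a trailing plural 's' (crude stand-in for stemming)."""
    t = term.lower()
    return t[:-1] if len(t) > 3 and t.endswith("s") else t


def preprocess(hint: str, keyword: str, sources: list[Source]) -> dict[str, str]:
    """Map a fuzzy (attribute hint, keyword) query to one target condition per source.

    A source contributes a target query only if at least one of its
    attribute names matches the hint after normalization.
    """
    targets: dict[str, str] = {}
    for src in sources:
        matches = [a for a in src.attributes if normalize(hint) in normalize(a)]
        if matches:
            targets[src.name] = " OR ".join(
                f"{a} CONTAINS '{keyword}'" for a in matches
            )
    return targets


# Example sources (assumed for illustration only).
sources = [
    Source("library", ("Authors", "Title")),
    Source("mail", ("Sender", "Subject")),
]
result = preprocess("author", "Dittrich", sources)
```

In this sketch the user never names a concrete source or schema element; the hint `author` is resolved against whatever sources happen to be registered, which is the kind of work a preprocessing step has to do when no global schema exists.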
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
Cite this paper
Domenig, R., Dittrich, K.R. (2001). Query Preprocessing for Integrated Search in Heterogeneous Data Sources. In: Heuer, A., Leymann, F., Priebe, D. (eds) Datenbanksysteme in Büro, Technik und Wissenschaft. Informatik aktuell. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-56687-5_13
DOI: https://doi.org/10.1007/978-3-642-56687-5_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41707-1
Online ISBN: 978-3-642-56687-5