Abstract
In data integration systems, a central site often maintains a global catalog of all available data sources, together with statistics that allow the query optimizer to generate a good query plan. These statistics can be updated lazily during query execution. A user query is typically broken into several query fragments, and a centralized task scheduler schedules the execution of each fragment, fetching data from the various data sources. The fetched data are then integrated at the central site and presented to the user. As data sources are introduced, the global catalog must be updated from time to time. However, because the data sources are autonomous and maintained by local administrators, it is difficult to ensure accurate statistics or the availability of the data sources. In addition, since the data are integrated at the central site, the central site can become a bottleneck. The unpredictable nature of the wide-area environment further exacerbates the problem of query processing.
In this paper, we present our ongoing work on dbRouter, a distributed query optimization and processing framework for open environments. dbRouter provides mechanisms to facilitate the discovery of new data sources, performs distributed query optimization, and manages the routing of data to its destination for processing.
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tok, W.H., Bressan, S. (2002). dbRouter - A Scaleable and Distributed Query Optimization and Processing Framework. In: Hameurlain, A., Cicchetti, R., Traunmüller, R. (eds) Database and Expert Systems Applications. DEXA 2002. Lecture Notes in Computer Science, vol 2453. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46146-9_65
DOI: https://doi.org/10.1007/3-540-46146-9_65
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44126-7
Online ISBN: 978-3-540-46146-3
eBook Packages: Springer Book Archive