ABSTRACT
We consider the processing of digital library queries in distributed environments, where each query consists of a text component and a structured component. The text component can be processed using techniques given in previous papers such as [7, 8, 11]. In this paper, we concentrate on the processing of the structured component of a distributed query. Histograms are constructed and algorithms are given to estimate the desirability of each database with respect to a given query. Databases are selected in descending order of desirability. An algorithm is also given to select tuples from the selected databases. Experimental results show that the techniques provided here are effective and efficient.
- 1. M. Carey and D. Kossmann. On Saying "Enough Already" in SQL. Proc. of ACM SIGMOD International Conference on Management of Data, Tucson, Arizona, May 1997, pp. 219-230.
- 2. M. Carey and D. Kossmann. Reducing the Braking Distance of an SQL Query Engine. Proc. of 24th International Conference on Very Large Data Bases, New York City, August 1998, pp. 158-169.
- 3. S. Chaudhuri and L. Gravano. Evaluating Top-k Selection Queries. Proc. of 25th International Conference on Very Large Data Bases, Edinburgh, Scotland, September 1999, pp. 397-410.
- 4. D. Donjerkovic and R. Ramakrishnan. Probabilistic Optimization of Top N Queries. Proc. of 25th International Conference on Very Large Data Bases, Edinburgh, Scotland, September 1999, pp. 411-422.
- 5. R. Fagin. Combining Fuzzy Information from Multiple Systems. Proc. of ACM Symposium on Principles of Database Systems, Montreal, Quebec, 1996, pp. 216-226.
- 6. J. French, E. Fox, K. Maly, and A. Selman. Wide Area Technical Report Service: Technical Report Online. Communications of the ACM, 38, 4, April 1995, pp. 45-46.
- 7. S. Gauch, G. Wang, and M. Gomez. Profusion: Intelligent Fusion from Multiple, Distributed Search Engines. Journal of Universal Computer Science, 2, 9, 1996, pp. 637-649.
- 8. L. Gravano and H. Garcia-Molina. Generalizing GlOSS to Vector-Space Databases and Broker Hierarchies. Proc. of 21st International Conference on Very Large Data Bases, Zurich, Switzerland, September 1995, pp. 78-89.
- 9. Y. Ioannidis and V. Poosala. Histogram-based Approximation of Set-valued Query Answers. Proc. of 25th International Conference on Very Large Data Bases, Edinburgh, Scotland, September 1999, pp. 174-185.
- 10. A. Konig and G. Weikum. Combining Histograms and Parametric Curve Fitting for Feedback-Driven Query Result-Size Estimation. Proc. of 25th International Conference on Very Large Data Bases, Edinburgh, Scotland, September 1999, pp. 423-434.
- 11. K. Liu, C. Yu, W. Meng, W. Wu, and N. Rishe. A Statistical Method for Estimating the Usefulness of Text Databases. IEEE Transactions on Knowledge and Data Engineering (to appear).
- 12. W. Meng, K. Liu, C. Yu, X. Wang, Y. Chang, and N. Rishe. Determine Text Databases to Search in the Internet. Proc. of 24th International Conference on Very Large Data Bases, New York City, August 1998, pp. 14-25.
- 13. Networked Computer Science Technical Reference Library (http://cs-tr.cs.cornell.edu/).
- 14. C. Yu, K. Liu, W. Meng, Z. Wu, and N. Rishe. A Methodology to Retrieve Text Documents from Multiple Databases. IEEE Transactions on Knowledge and Data Engineering (to appear).
- 15. C. Yu, W. Sun, S. Dao, and D. Keirsey. Determining Relationships among Attributes for Interoperability of Multidatabase Systems. Proc. of the 1st International Workshop on Interoperability in Multidatabase Systems, Kyoto, Japan, April 1991.
Index Terms
- Database selection for processing k nearest neighbors queries in distributed environments