Article

Database selection for processing k nearest neighbors queries in distributed environments

Authors:
Clement Yu, Dept. of CS, U. of Illinois at Chicago, Chicago, IL
Prasoon Sharma, Dept. of CS, U. of Illinois at Chicago, Chicago, IL
Weiyi Meng, Dept. of CS, SUNY at Binghamton, Binghamton, NY
Yan Qin, Dept. of CS, U. of Illinois at Chicago, Chicago, IL

JCDL '01: Proceedings of the 1st ACM/IEEE-CS joint conference on Digital libraries, January 2001, Pages 215–222. https://doi.org/10.1145/379437.379504

Published: 01 January 2001

ABSTRACT

We consider the processing of digital library queries, each consisting of a text component and a structured component, in distributed environments. The text component can be processed using techniques given in previous papers such as [7, 8, 11]. In this paper, we concentrate on the processing of the structured component of a distributed query. Histograms are constructed and algorithms are given to provide estimates of the desirabilities of the databases with respect to the given query. Databases are selected in descending order of desirability. An algorithm is also given to select tuples from the selected databases. Experimental results are given to show that the techniques provided here are effective and efficient.
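
The abstract does not spell out how desirability is computed, so the following is only a minimal sketch of the general idea of ranking databases from histogram summaries. It assumes a single numeric attribute, equi-width-style buckets with values spread uniformly inside each bucket, and a hypothetical stand-in for desirability: the estimated number of tuples within a fixed distance of the query value. The names `Bucket`, `estimate_count`, `rank_databases`, `db_a`, and `db_b` are illustrative and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Bucket:
    low: float   # inclusive lower bound of the bucket
    high: float  # exclusive upper bound of the bucket
    count: int   # number of tuples whose value falls in [low, high)


def estimate_count(histogram: List[Bucket], lo: float, hi: float) -> float:
    """Estimate how many tuples fall in [lo, hi).

    Assumption: values are uniformly distributed inside each bucket, so a
    bucket contributes in proportion to how much of it overlaps the range.
    """
    total = 0.0
    for b in histogram:
        overlap = min(b.high, hi) - max(b.low, lo)
        if overlap > 0:
            total += b.count * overlap / (b.high - b.low)
    return total


def rank_databases(
    histograms: Dict[str, List[Bucket]], target: float, radius: float
) -> List[Tuple[str, float]]:
    """Return (database, desirability) pairs, most desirable first.

    Desirability here is an ASSUMED stand-in, not the paper's definition:
    the estimated number of tuples within `radius` of `target`.
    """
    scores = {
        name: estimate_count(hist, target - radius, target + radius)
        for name, hist in histograms.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical histograms for two databases over the same attribute.
    histograms = {
        "db_a": [Bucket(0, 10, 100), Bucket(10, 20, 20)],
        "db_b": [Bucket(0, 10, 10), Bucket(10, 20, 200)],
    }
    for name, score in rank_databases(histograms, target=12.0, radius=4.0):
        print(name, score)
```

In this toy setup a broker would contact `db_b` before `db_a`, because its histogram suggests more tuples near the query value. A real system would also need a rule for how many databases to contact and how to merge the tuples they return, which this sketch leaves out.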

References

1. M. Carey and D. Kossmann. On Saying "Enough Already" in SQL. Proc. of ACM SIGMOD International Conference on Management of Data, Tucson, Arizona, May 1997, pp. 219-230.
2. M. Carey and D. Kossmann. Reducing the Braking Distance of an SQL Query Engine. Proc. of 24th International Conference on Very Large Data Bases, New York City, August 1998, pp. 158-169.
3. S. Chaudhuri and L. Gravano. Evaluating Top-k Selection Queries. Proc. of 25th International Conference on Very Large Data Bases, Edinburgh, Scotland, September 1999, pp. 397-410.
4. D. Donjerkovic and R. Ramakrishnan. Probabilistic Optimization of Top N Queries. Proc. of 25th International Conference on Very Large Data Bases, Edinburgh, Scotland, September 1999, pp. 411-422.
5. R. Fagin. Combining Fuzzy Information from Multiple Systems. Proc. of ACM Symposium on Principles of Database Systems, Montreal, Quebec, 1996, pp. 216-226.
6. J. French, E. Fox, K. Maly, and A. Selman. Wide Area Technical Report Service: Technical Report Online. Communications of the ACM, 38, 4, April 1995, pp. 45-46.
7. S. Gauch, G. Wang, and M. Gomez. Profusion: Intelligent Fusion from Multiple, Distributed Search Engines. Journal of Universal Computer Science, 2, 9, 1996, pp. 637-649.
8. L. Gravano and H. Garcia-Molina. Generalizing GlOSS to Vector-Space Databases and Broker Hierarchies. Proc. of 21st International Conference on Very Large Data Bases, Zurich, Switzerland, September 1995, pp. 78-89.
9. Y. Ioannidis and V. Poosala. Histogram-based Approximation of Set-valued Query Answers. Proc. of 25th International Conference on Very Large Data Bases, Edinburgh, Scotland, September 1999, pp. 174-185.
10. A. Konig and G. Weikum. Combining Histograms and Parametric Curve Fitting for Feedback-Driven Query Result-Size Estimation. Proc. of 25th International Conference on Very Large Data Bases, Edinburgh, Scotland, September 1999, pp. 423-434.
11. K. Liu, C. Yu, W. Meng, W. Wu, and N. Rishe. A Statistical Method for Estimating the Usefulness of Text Databases. IEEE Transactions on Knowledge and Data Engineering (to appear).
12. W. Meng, K. Liu, C. Yu, X. Wang, Y. Chang, and N. Rishe. Determine Text Databases to Search in the Internet. Proc. of 24th International Conference on Very Large Data Bases, New York City, August 1998, pp. 14-25.
13. Networked Computer Science Technical Reference Library (http://cs-tr.cs.cornell.edu/).
14. C. Yu, K. Liu, W. Meng, Z. Wu, and N. Rishe. A Methodology to Retrieve Text Documents from Multiple Databases. IEEE Transactions on Knowledge and Data Engineering (to appear).
15. C. Yu, W. Sun, S. Dao, and D. Keirsey. Determining Relationships among Attributes for Interoperability of Multidatabase Systems. Proc. of the 1st International Workshop on Interoperability in Multidatabase Systems, Kyoto, Japan, April 1991.

Published in
JCDL '01: Proceedings of the 1st ACM/IEEE-CS joint conference on Digital libraries
January 2001
481 pages
ISBN: 1581133456
DOI: 10.1145/379437
Chairmen: Edward A. Fox (Virginia Tech), Christine L. Borgman (UCLA)
Copyright © 2001 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
database selection
distributed databases
k nearest neighbors
query processing
Qualifiers
- Article
Conference

Acceptance Rates
JCDL '01 Paper Acceptance Rate: 76 of 250 submissions, 30%
Overall Acceptance Rate: 415 of 1,482 submissions, 28%