The effectiveness of GlOSS for the text database discovery problem

Authors:
Luis Gravano, Stanford University, Computer Science Dept., Margaret Jacks Hall, Stanford, CA
Héctor García-Molina, Stanford University, Computer Science Dept., Margaret Jacks Hall, Stanford, CA
Anthony Tomasic, Stanford University, Computer Science Dept., Margaret Jacks Hall, Stanford, CA and Princeton University, Department of Computer Science

SIGMOD '94: Proceedings of the 1994 ACM SIGMOD international conference on Management of data, May 1994, Pages 126–137, https://doi.org/10.1145/191839.191869

Published: 24 May 1994

ABSTRACT

The popularity of on-line document databases has led to a new problem: finding which text databases (out of many candidate choices) are the most relevant to a user. Identifying the relevant databases for a given query is the text database discovery problem. The first part of this paper presents a practical solution based on estimating the result size of a query and a database. The method is termed GlOSS—Glossary of Servers Server. The second part of this paper evaluates the effectiveness of GlOSS based on a trace of real user queries. In addition, we analyze the storage cost of our approach.
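
The abstract does not spell out how the result size is estimated, so the following is only a minimal sketch of one plausible approach. It assumes each database exports a small summary (its total document count and, for each word, the number of documents containing it) and that query terms occur independently of one another. The function names and the summary layout are hypothetical, chosen for illustration.

```python
def estimate_result_size(db_size, doc_freqs, query_terms):
    """Estimate how many documents in one database match ALL query terms.

    Assumption (not stated in the abstract): terms are independent, so the
    expected match count is db_size * product(freq(term) / db_size).
    """
    if db_size == 0:
        return 0.0
    estimate = float(db_size)
    for term in query_terms:
        # A term missing from the summary has frequency 0, giving estimate 0.
        estimate *= doc_freqs.get(term, 0) / db_size
    return estimate


def rank_databases(summaries, query_terms):
    """Return (name, estimate) pairs with a nonzero estimate, best first.

    `summaries` maps a database name to (document_count, {term: doc_freq}).
    """
    scored = [
        (name, estimate_result_size(size, freqs, query_terms))
        for name, (size, freqs) in summaries.items()
    ]
    scored = [(name, score) for name, score in scored if score > 0]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical summaries for three databases.
summaries = {
    "db_a": (100, {"text": 50, "database": 20}),
    "db_b": (1000, {"text": 100, "database": 10}),
    "db_c": (500, {"text": 30}),
}
ranking = rank_databases(summaries, ["text", "database"])
```

Under these assumptions, a database whose summary lacks one of the query terms drops out of the ranking entirely, and the remaining candidates are ordered by their estimated number of matching documents. The summaries are small relative to full indexes, which is the kind of storage cost the abstract says the paper analyzes.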

Published in
SIGMOD '94: Proceedings of the 1994 ACM SIGMOD international conference on Management of data
May 1994, 525 pages
ISBN: 0897916395
DOI: 10.1145/191839
Editors: Richard Thomas Snodgrass (Univ. of Arizona), Marianne Winslett (Univ. of Illinois)

ACM SIGMOD Record, Volume 23, Issue 2
June 1994, 522 pages
ISSN: 0163-5808
DOI: 10.1145/191843
Editors: Richard Thomas Snodgrass (Univ. of Arizona), Marianne Winslett (Univ. of Illinois)

Copyright © 1994 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Qualifiers
- Article
Conference

Acceptance Rates
Overall Acceptance Rate: 785 of 4,003 submissions, 20%