Abstract
Resource discovery in a distributed digital library poses many challenges, one of which is choosing, for a given query, which of the available search engines should receive it. This paper focuses on search engine performance as a selection criterion and defines two measurements of that performance: availability (will the search engine respond within a time limit?) and response time (how quickly will the search engine respond, given that it responds at all?). We predicted both of these performance characteristics with a variety of algorithms, all of which required little computation time and combined past performance data for each search engine into a succinct record. We used operational data from the NCSTRL distributed digital library to make and evaluate predictions. We found that simple prediction methods performed as well as more complex methods and that prediction accuracy was closely related to data consistency.
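To make the idea of a "succinct record" concrete, here is a minimal sketch in Python. The abstract does not name the prediction algorithms, so everything in it is an assumption for illustration: the `EngineRecord` class, its field names, and the choice of a running fraction for availability and a running mean for response time are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EngineRecord:
    """Hypothetical per-engine summary of past performance (not from the paper)."""
    attempts: int = 0           # queries sent to the engine
    responses: int = 0          # queries answered within the time limit
    mean_response: float = 0.0  # running mean of in-time response times, seconds

    def update(self, response_time: Optional[float], time_limit: float) -> None:
        """Fold one observation into the record; None means no response at all."""
        self.attempts += 1
        if response_time is not None and response_time <= time_limit:
            self.responses += 1
            # Incremental mean, so the full history never has to be stored.
            self.mean_response += (response_time - self.mean_response) / self.responses

    def predicted_availability(self) -> float:
        """Fraction of past queries answered in time (0.0 with no history)."""
        return self.responses / self.attempts if self.attempts else 0.0

    def predicted_response_time(self) -> Optional[float]:
        """Mean past response time, given that the engine responded."""
        return self.mean_response if self.responses else None


# Example: two in-time responses, one timeout, one response past the limit.
record = EngineRecord()
for observed in (1.0, None, 3.0, 12.0):
    record.update(observed, time_limit=10.0)
```

Each update costs constant time and the record holds three numbers per engine, which matches the abstract's requirement of little computation and a compact summary. A selection policy could then rank engines by predicted availability and break ties on predicted response time; that policy is likewise an assumption here.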
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Dushay, N., French, J.C., Lagoze, C. (1999). Predicting Indexer Performance in a Distributed Digital Library. In: Abiteboul, S., Vercoustre, AM. (eds) Research and Advanced Technology for Digital Libraries. ECDL 1999. Lecture Notes in Computer Science, vol 1696. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48155-9_11
DOI: https://doi.org/10.1007/3-540-48155-9_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66558-8
Online ISBN: 978-3-540-48155-3
eBook Packages: Springer Book Archive