An Advanced Server Ranking Algorithm for Distributed Retrieval Systems on the Internet

  • Conference paper
Computer and Information Sciences - ISCIS 2004 (ISCIS 2004)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3280)

Abstract

Database selection, also known as resource selection, server selection, and query routing, is an important topic in distributed information retrieval research. Several approaches to database selection use document frequency data to rank servers. Many researchers have shown that the effectiveness of these algorithms depends on database size and content. In this paper we propose a database selection algorithm that uses document frequency data and an extended database description to rank servers. The algorithm does not depend on the size and content of the databases in the system. We provide experimental evidence, based on actual data, that our algorithm outperforms the vGlOSS, CVV and CORI database selection algorithms with respect to the precision and recall evaluation measures.
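
As a rough illustration of what ranking servers by document frequency data can look like, the sketch below scores each server by the fraction of its documents that contain each query term. This is a generic baseline written for illustration only; it is not the algorithm proposed in the paper, nor vGlOSS, CVV or CORI. The function name `rank_servers`, the summary layout and the sample numbers are all assumptions.

```python
from typing import Dict, List, Tuple

# Hypothetical server summary: (number of documents, {term: document frequency}).
Summary = Tuple[int, Dict[str, int]]


def rank_servers(query_terms: List[str],
                 summaries: Dict[str, Summary]) -> List[Tuple[str, float]]:
    """Rank servers by the summed relative document frequency of the query terms.

    Dividing each document frequency by the server's document count is one
    simple way to reduce the advantage of very large databases. It is an
    illustrative choice, not the normalization used by any published algorithm.
    """
    scored = []
    for name, (num_docs, doc_freq) in summaries.items():
        if num_docs == 0:
            score = 0.0  # an empty server cannot match anything
        else:
            score = sum(doc_freq.get(term, 0) / num_docs for term in query_terms)
        scored.append((name, score))
    # Highest score first; ties are broken alphabetically by server name.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Made-up summaries for three hypothetical servers.
example_summaries = {
    "A": (100, {"retrieval": 50, "server": 10}),
    "B": (1000, {"retrieval": 100, "server": 300}),
    "C": (10, {}),
}
ranking = rank_servers(["retrieval", "server"], example_summaries)
```

Here server B holds more matching documents in absolute terms, but server A ranks first because a larger share of its documents contain the query terms. A purely count-based score would reverse that order, which is the kind of size dependence the abstract refers to.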

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hyusein, B., Carthy, J. (2004). An Advanced Server Ranking Algorithm for Distributed Retrieval Systems on the Internet. In: Aykanat, C., Dayar, T., Körpeoğlu, İ. (eds) Computer and Information Sciences - ISCIS 2004. ISCIS 2004. Lecture Notes in Computer Science, vol 3280. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30182-0_84

  • DOI: https://doi.org/10.1007/978-3-540-30182-0_84

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23526-2

  • Online ISBN: 978-3-540-30182-0

  • eBook Packages: Springer Book Archive
