Achieving Effective Multi-term Queries for Fast DHT Information Retrieval

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5175)

Abstract

Distributed Hash Tables (DHTs) are well suited to exact-match lookups using unique identifiers, but do not directly support multi-term queries. Related research on query expansion has shown that adding new terms to a query via ad hoc feedback improves the retrieval effectiveness of that query. In this paper, we propose an effective multi-term query processing algorithm for information retrieval in DHT systems. Given the significance of the first term in a multi-term query, the query is sent to the peers containing the first term. To enhance query effectiveness, we design two query expansion mechanisms and an implicit relevance feedback approach based on users’ behaviors. Additionally, we record the query log and the expansion terms for each query, which can accelerate future queries and improve query accuracy. Experimental results show that our query methods yield substantial improvements in retrieval effectiveness in three respects: recall, precision at 10 standard recall levels, and precision histograms.
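The first-term routing idea in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm, which is not reproduced on this page; it only shows one way a multi-term query might be resolved at the peer responsible for its first term. The `ToyDHT` class, its methods, the SHA-1 modulo placement, and the choice to store each document's full term set alongside every posting are all assumptions made for illustration. Query expansion, relevance feedback, and query logging are omitted.

```python
import hashlib
from collections import defaultdict


class ToyDHT:
    """Hypothetical in-memory stand-in for a DHT (illustration only)."""

    def __init__(self, num_peers):
        self.num_peers = num_peers
        # One index per peer: term -> {doc_id -> set of all terms in that doc}.
        self.peers = [defaultdict(dict) for _ in range(num_peers)]

    def peer_for(self, term):
        # Assumed placement rule: hash the term and take it modulo the peer count.
        digest = hashlib.sha1(term.encode("utf-8")).hexdigest()
        return int(digest, 16) % self.num_peers

    def publish(self, doc_id, terms):
        # Assumption: each posting carries the document's full term set,
        # so the peer holding one term can check the others locally.
        term_set = set(terms)
        for term in term_set:
            self.peers[self.peer_for(term)][term][doc_id] = term_set

    def query(self, terms):
        # Route the whole query to the peer responsible for the first term,
        # then keep only documents that also contain the remaining terms.
        first, rest = terms[0], set(terms[1:])
        postings = self.peers[self.peer_for(first)].get(first, {})
        return sorted(d for d, doc_terms in postings.items() if rest <= doc_terms)


dht = ToyDHT(num_peers=4)
dht.publish("d1", ["dht", "query", "expansion"])
dht.publish("d2", ["dht", "routing"])
print(dht.query(["dht", "query"]))  # ['d1']
```

In this sketch a query touches a single peer regardless of how many terms it has, which is the property that makes first-term routing attractive; the cost is the extra storage for the per-posting term sets.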

This research has been supported by the National Natural Science Foundation of China under Grant No. 60673183, the National Grand Fundamental Research 973 Program of China under Grant No. 2004CB318204, and an Australian research grant (ARC DP0773483).

Editor information

James Bailey, David Maier, Klaus-Dieter Schewe, Bernhard Thalheim, Xiaoyang Sean Wang

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Xu, Q., Shen, H.T., Dai, Y., Cui, B., Zhou, X. (2008). Achieving Effective Multi-term Queries for Fast DHT Information Retrieval. In: Bailey, J., Maier, D., Schewe, KD., Thalheim, B., Wang, X.S. (eds) Web Information Systems Engineering - WISE 2008. WISE 2008. Lecture Notes in Computer Science, vol 5175. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85481-4_4

  • DOI: https://doi.org/10.1007/978-3-540-85481-4_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-85480-7

  • Online ISBN: 978-3-540-85481-4

  • eBook Packages: Computer Science, Computer Science (R0)
