Abstract
Distributed Hash Tables (DHTs) are well suited to exact-match lookups using unique identifiers, but do not directly support multi-term queries. Related research on query expansion has shown that adding new terms to a query via ad hoc feedback improves the retrieval effectiveness of that query. In this paper, we propose an effective multi-term query processing algorithm for information retrieval in DHT systems. Given the significance of the first term in a multi-term query, the query is sent to the peers containing the first term. To enhance query effectiveness, we design two query expansion mechanisms and an implicit relevance feedback approach based on users' behavior. Additionally, we record the query log and the expansion terms for each query, which can accelerate future queries and improve query accuracy. Experimental results show that our query methods yield substantial improvements in retrieval effectiveness in three respects: recall, precision at 10 standard recall levels, and precision histograms.
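The first-term routing idea described above can be illustrated with a minimal sketch. It is not the paper's implementation: the fixed peer count, the `peer_for` hash mapping, the `Peer` class, and the choice to store each document's full term set alongside every posting are all assumptions made here so that the remaining query terms can be checked locally on the peer responsible for the first term.

```python
import hashlib
from collections import defaultdict

NUM_PEERS = 8  # hypothetical network size, chosen only for illustration


def peer_for(term: str) -> int:
    """Map a term to a peer id by hashing (a stand-in for a real DHT lookup)."""
    digest = hashlib.sha1(term.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_PEERS


class Peer:
    """Hypothetical peer holding postings for the terms hashed to it."""

    def __init__(self):
        # term -> {doc_id: set of all terms in that document}
        # Storing the full term set is an assumption of this sketch.
        self.index = defaultdict(dict)

    def publish(self, term, doc_id, doc_terms):
        self.index[term][doc_id] = set(doc_terms)

    def search(self, terms):
        # Candidates come from the first term's postings; the remaining
        # terms are verified locally against each candidate's term set.
        first, rest = terms[0], terms[1:]
        candidates = self.index.get(first, {})
        return sorted(d for d, ts in candidates.items()
                      if all(t in ts for t in rest))


def build_network(docs):
    """Index every document under each of its distinct terms."""
    peers = [Peer() for _ in range(NUM_PEERS)]
    for doc_id, text in docs.items():
        doc_terms = text.lower().split()
        for term in set(doc_terms):
            peers[peer_for(term)].publish(term, doc_id, doc_terms)
    return peers


def multi_term_query(peers, query):
    """Send the whole query to the peer responsible for its first term."""
    terms = query.lower().split()
    if not terms:
        return []
    return peers[peer_for(terms[0])].search(terms)
```

With a toy collection such as `{"d1": "distributed hash table lookup", "d2": "hash table query expansion"}`, calling `multi_term_query(peers, "hash table")` contacts only the peer responsible for "hash" and filters its postings by "table". Query expansion, relevance feedback, and the query log mentioned in the abstract are deliberately left out of this sketch.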
This research has been supported by the National Natural Science Foundation of China under Grant No. 60673183, the National Grand Fundamental Research 973 Program of China under Grant No. 2004CB318204, and an Australian research grant (ARC DP0773483).
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Xu, Q., Shen, H.T., Dai, Y., Cui, B., Zhou, X. (2008). Achieving Effective Multi-term Queries for Fast DHT Information Retrieval. In: Bailey, J., Maier, D., Schewe, KD., Thalheim, B., Wang, X.S. (eds) Web Information Systems Engineering - WISE 2008. WISE 2008. Lecture Notes in Computer Science, vol 5175. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85481-4_4
DOI: https://doi.org/10.1007/978-3-540-85481-4_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85480-7
Online ISBN: 978-3-540-85481-4
eBook Packages: Computer Science, Computer Science (R0)