HAPS: Supporting Effective and Efficient Full-Text P2P Search with Peer Dynamics

  • Regular Paper
  • Journal of Computer Science and Technology

Abstract

Recently, peer-to-peer (P2P) search techniques have become popular on the Web as an alternative to centralized search because of their high scalability and low deployment cost. However, P2P search systems are known to suffer from peer dynamics, such as frequent node joins and departures and document changes, which cause serious performance degradation. This paper presents the architecture of a P2P search system, named HAPS, that supports full-text search in an overlay network with peer dynamics. HAPS consists of two layers of peers. The upper layer is a DHT (distributed hash table) network formed by super peers (which we refer to as hubs). Each hub maintains distributed data structures called search directories, which are used to guide queries and to control the search cost. The bottom layer consists of clusters of ordinary peers (called providers), which receive queries and return relevant results. Extensive experimental results indicate that HAPS performs searches effectively and efficiently. In addition, a performance comparison shows that HAPS outperforms a flat structured system and a hierarchical unstructured system in an environment with peer dynamics.
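
The two-layer design summarized above can be illustrated with a small sketch. The abstract does not say how search directories are laid out, how hubs rank providers, or how the DHT assigns keys, so everything concrete here is an assumption made for illustration: the class names `Provider`, `Hub`, and `Overlay`, the SHA-1 successor lookup on a hash ring, the per-term document counts stored in each directory, and the `max_providers` cap standing in for search-cost control. It is not the HAPS implementation.

```python
import hashlib
from bisect import bisect_left
from collections import defaultdict


def ring_id(key: str, bits: int = 32) -> int:
    """Hash a string onto a ring of size 2**bits (an assumed ID space)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


class Provider:
    """Ordinary peer in the bottom layer: stores documents, answers queries."""

    def __init__(self, name, docs):
        self.name = name
        self.docs = docs  # doc_id -> text

    def search(self, term):
        return sorted(d for d, text in self.docs.items()
                      if term in text.lower().split())


class Hub:
    """Super peer in the upper layer with a hypothetical search directory."""

    def __init__(self, name):
        self.name = name
        self.id = ring_id(name)
        # term -> {provider name: number of its documents containing the term}
        self.directory = defaultdict(dict)


class Overlay:
    def __init__(self, hub_names):
        self.hubs = sorted((Hub(n) for n in hub_names), key=lambda h: h.id)
        self.providers = {}

    def hub_for(self, term):
        """Successor lookup: first hub whose ID is >= the term's ID, wrapping."""
        ids = [h.id for h in self.hubs]
        return self.hubs[bisect_left(ids, ring_id(term)) % len(self.hubs)]

    def publish(self, provider):
        """Register a provider's term statistics at the responsible hubs."""
        self.providers[provider.name] = provider
        counts = defaultdict(int)
        for text in provider.docs.values():
            for term in set(text.lower().split()):
                counts[term] += 1
        for term, n in counts.items():
            self.hub_for(term).directory[term][provider.name] = n

    def leave(self, name):
        """Peer dynamics: drop a departed provider from every directory."""
        self.providers.pop(name, None)
        for hub in self.hubs:
            for entries in hub.directory.values():
                entries.pop(name, None)

    def query(self, term, max_providers=2):
        """Route to the term's hub, then contact at most max_providers peers."""
        entries = self.hub_for(term).directory.get(term, {})
        ranked = sorted(entries, key=lambda p: (-entries[p], p))[:max_providers]
        return {p: self.providers[p].search(term) for p in ranked}


net = Overlay(["hub-a", "hub-b", "hub-c"])
net.publish(Provider("p1", {"d1": "chord dht lookup", "d2": "dht churn"}))
net.publish(Provider("p2", {"d3": "dht index"}))
print(net.query("dht", max_providers=1))
net.leave("p1")
print(net.query("dht"))
```

In this sketch the hub limits cost by forwarding a query only to the providers its directory ranks highest, and a provider departure is handled by purging its entries so later queries are not routed to a peer that is gone. A real system would also need directory replication and hub failure handling, which the abstract does not describe.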

Author information

Corresponding author

Correspondence to Ke Chen.

Additional information

This work is supported in part by the National Natural Science Foundation of China under Grant Nos. 60803003, 60970124, 60903038, and by the Science and Technology Projects of Zhejiang Province under Grant No. 2008C14G2010007.

About this article

Cite this article

Ren, ZJ., Chen, K., Shou, LD. et al. HAPS: Supporting Effective and Efficient Full-Text P2P Search with Peer Dynamics. J. Comput. Sci. Technol. 25, 482–498 (2010). https://doi.org/10.1007/s11390-010-9339-8

  • DOI: https://doi.org/10.1007/s11390-010-9339-8
