
Towards bandwidth-efficient keyword continuous query processing over DHTs

Peer-to-Peer Networking and Applications

Abstract

In this paper, we study a distributed keyword continuous query processing system built on distributed hash tables (DHTs). Treating bandwidth as a first-class resource, we propose three techniques to reduce bandwidth cost: novel query indexing algorithms (MHI and SAP-MHI), multicast-based document announcement, and adaptive query resolution. Detailed simulations show that these techniques, combined, substantially reduce bandwidth consumption.


Notes

  1. Document alert is outside of the scope of the paper since it can be decoupled from the system and implemented in various ways, e.g., via multicast, emails or RSS.

  2. Note that term IDs uniquely identify terms. Each query inverted list may additionally store the term ID so that the term ID uniquely identifies the inverted list.

  3. Given k term IDs contained in an announcement message, if we piggyback r successor term IDs for each of these k term IDs, the total number of term IDs in the message is at most k+r rather than kr. So, the piggyback overhead is at most r term IDs.

  4. The number of documents in which a term appears.

  5. In this paper, we set y to be larger than \(\log N\), where N is the number of nodes in the underlying DHT.
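
The counting argument in note 3 can be illustrated with a small sketch. It assumes that term IDs are ordered on a ring and that the k term IDs carried by the announcement are consecutive in that order, so their successor sets overlap; this assumption, the ring contents, and the helper names `successors` and `piggybacked_ids` are illustrative and not taken from the paper.

```python
from bisect import bisect_right

def successors(ring, term_id, r):
    """Return the r term IDs that follow term_id on the sorted ring (wrapping around)."""
    i = bisect_right(ring, term_id)
    return [ring[(i + j) % len(ring)] for j in range(r)]

def piggybacked_ids(ring, announced, r):
    """Union of the announced term IDs and the r successors of each of them."""
    ids = set(announced)
    for t in announced:
        ids.update(successors(ring, t, r))
    return ids

# Hypothetical ring of 20 term IDs: 10, 20, ..., 200.
ring = list(range(10, 210, 10))
k, r = 5, 3

# Assumption: the k announced term IDs are consecutive on the ring.
announced = ring[4:4 + k]

message_ids = piggybacked_ids(ring, announced, r)
overhead = len(message_ids) - k

print(sorted(message_ids))
print("total IDs:", len(message_ids), "overhead:", overhead)
```

Under this assumption the successors of all but the last announced term ID are already in the message, so only the r successors of the last one are new: the message carries k + r term IDs in total, not kr.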

References

  1. Zhu Y (2008) Bandwidth-efficient continuous query processing over DHTs. In: Proceedings of ICPP

  2. Kannan J, Yang B, Shenker S, Sharma P, Banerjee S, Basu S, Lee S-J (2006) Smartseer: Using a DHT to process continuous queries over peer-to-peer networks. In: Proceedings of IEEE INFOCOM

  3. Nath S, Gibbons PB, Seshan S, Anderson ZR (2004) Synopsis diffusion for robust aggregation in sensor networks. In: Proceedings of the 2nd international conference on embedded networked sensor systems. New York

  4. Reynolds P, Vahdat A (2003) Efficient peer-to-peer keyword searching. In: Proceedings of ACM/IFIP/USENIX international middleware conference (Middleware). Rio de Janeiro, pp 21–40

  5. Li J, Loo BT, Hellerstein J, Kaashoek F, Karger DR, Morris R (2003) On the feasibility of peer-to-peer web indexing and search. In: Proceedings of the 2nd international workshop on peer-to-peer systems (IPTPS). Berkeley, pp 207–215

  6. Tang C, Xu Z, Dwarkadas S (2003) Peer-to-peer information retrieval using self-organizing semantic overlay networks. In: Proceedings of ACM SIGCOMM. Karlsruhe, pp 175–186

  7. Zhu Y, Hu Y (2007) Efficient semantic search on DHT overlays. J Parallel Distrib Comput 67:604–616

  8. Avnur R, Hellerstein JM Eddies: continuously adaptive query processing. In: Proceedings of ACM SIGMOD

  9. Babcock B, Babu S, Datar M, Motwani R, Widom J (2002) Models and issues in data stream systems. In: Proceedings of ACM symposium on principles of database systems. New York, pp 1–16

  10. Bollacker KD, Lawrence S, Giles CL (1999) A system for automatic personalized tracking of scientific literature on the web. In: Proceedings of ACM conference on digital libraries. New York, pp 105–113

  11. Huebsch R, Hellerstein JM, Lanham N, Loo BT, Shenker S, Stoica I (2003) Querying the internet with pier. In: Proceedings of VLDB, pp 321–332

  12. Zhu Y, Hu Y (2007) Ferry: A P2P-based architecture for content-based publish/subscribe services. IEEE Trans Parallel Distrib Syst 18:672–685

  13. Rowstron AIT, Kermarrec A-M, Castro M, Druschel P (2001) SCRIBE: the design of a large-scale event notification infrastructure. In: Proceedings of the 3rd international networked group communication, pp 30–43

  14. Castro M, Druschel P, Kermarrec A-M, Nandi A, Rowstron A, Singh A (2003) Splitstream: high-bandwidth multicast in cooperative environments. In: Proceedings of the 19th ACM symposium on operating systems principles (SOSP). Bolton Landing

  15. Gupta A, Sahin OD, Agrawal D, Abbadi AE (2004) Meghdoot: content-based publish/subscribe over P2P networks. In: ACM/IFIP/USENIX 5th international middleware conference. Toronto

  16. Stoica I, Morris R, Karger D, Kaashoek M, Balakrishnan H (2001) Chord: a scalable peer-to-peer lookup service for internet applications. In: Proceedings of ACM SIGCOMM. San Diego, pp 149–160

  17. Xie Y, O’Hallaron D (2002) Locality in search engine queries and its implications for caching. In: Proceedings of INFOCOM

  18. Patch K Net scan finds like-minded users. http://www.trnmag.com/Stories/2003/050703/Net_scan_finds_like-minded_users_050703.html

  19. Text retrieval conference (TREC). http://trec.nist.gov

  20. Berry MW, Drmac Z, Jessup ER (1999) Matrices, vector spaces, and information retrieval. SIAM Rev 41(2):335–362

  21. Castro M, Druschel P, Ganesh A, Rowstron A, Wallach DS (2002) Secure routing for structured peer-to-peer overlay networks. In: Proceedings of OSDI, pp 299–314

Author information

Corresponding author

Correspondence to Yingwu Zhu.

Additional information

Extended version of the work [1] presented in Proceedings of ICPP’08.

Cite this article

Zhu, Y. Towards bandwidth-efficient keyword continuous query processing over DHTs. Peer-to-Peer Netw. Appl. 9, 142–158 (2016). https://doi.org/10.1007/s12083-014-0319-6
