Popularity-aware prefetch in P2P range caching

Peer-to-Peer Networking and Applications

Abstract

Unstructured peer-to-peer infrastructure has been widely employed to support large-scale distributed applications. Many of these applications, such as location-based services and multimedia content distribution, require the support of range selection queries. Under the widely-adopted query shipping protocols, the cost of query processing is affected by the number of result copies or replicas in the system. Since range queries can return results that include poorly-replicated data items, the cost of these queries is usually dominated by the retrieval cost of these data items. In this work, we propose a popularity-aware prefetch-based approach that can effectively facilitate the caching of poorly-replicated data items that are potentially requested in subsequent range queries, resulting in substantial cost savings. We prove that the performance of retrieving poorly-replicated data items is guaranteed to improve under an increasing query load. Extensive experiments show that the overall range query processing cost decreases significantly under various query load settings.

Notes

  1. http://www.gnutella.com

  2. http://www.bittorrent.com/

  3. For conciseness, we use range query and range selection query interchangeably in the remainder of the paper.

  4. http://www.joost.com

  5. http://www.gnutella.com

  6. We define “correlation” in Section 3.1.

  7. In this work, we focus on the query shipping cost to locate query results, ignoring local processing cost.

  8. The actual number of replicas equals the square root of the corresponding query load size multiplied by a constant factor [4].

  9. This does not conflict with the focus on range query processing since range queries may include multiple point values.

  10. p2psim: http://pdos.csail.mit.edu/p2psim/

  11. In this experiment, each epoch lasts 5×10⁶ ms.
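
The square-root relationship stated in note 8 can be illustrated with a minimal sketch. The function name `sqrt_replicas`, the `constant` parameter, and the sample query loads below are hypothetical and chosen only for illustration; the paper's actual constant factor and workloads are not given in this excerpt.

```python
import math


def sqrt_replicas(query_loads, constant=1.0):
    """Map each item to constant * sqrt(query load), as described in note 8.

    `constant` stands in for the unspecified constant factor (an assumption).
    """
    return {item: constant * math.sqrt(load) for item, load in query_loads.items()}


# Hypothetical query loads for three data items.
loads = {"popular": 10_000, "moderate": 100, "rare": 1}
replicas = sqrt_replicas(loads, constant=2.0)

for item, count in replicas.items():
    print(f"{item}: load={loads[item]}, replicas={count:.1f}")
```

Under these assumed numbers, a 10,000-fold gap in query load between the most and least requested items shrinks to a 100-fold gap in replica counts, so rarely requested items still end up with comparatively few copies.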

References

  1. Balke W, Nejdl W, Siberski W, Thaden U (2005) Progressive distributed top-k retrieval in peer-to-peer networks. In: Proc int conf on data engineering, pp 174–185

  2. Barabási A-L, Albert R (1999) Emergence of scaling in random networks. Science 286:509–512

  3. Cheng B, Liu X, Zhang Z, Jin H (2007) A measurement study of a peer-to-peer video-on-demand system. In: Peer-to-peer systems, first international workshop

  4. Cohen E, Shenker S (2002) Replication strategies in unstructured peer-to-peer networks. In: Proc ACM SIGCOMM, pp 177–190

  5. Crainiceanu A, Linga P, Gehrke J, Shanmugasundaram J (2004) Querying peer-to-peer networks using P-Trees. In: Proc 7th int workshop on the world wide web and databases (WebDB), pp 25–30

  6. Edwards HM (1974) Riemann’s zeta function. Academic, London

  7. Gkantsidis C, Mihail M, Saberi A (2004) Random walks in peer-to-peer networks. In: Proc 23rd annual joint conference of the IEEE computer and communications societies

  8. Gopalakrishnan V, Silaghi B, Bhattacharjee B, Keleher P (2004) Adaptive replication in peer-to-peer systems. In: Proc 24th int conf on distributed computing systems, pp 360–369

  9. Huebsch R, Hellerstein JM, Lanham N, Loo BT, Shenker S, Stoica I (2003) Querying the internet with PIER. In: Proc 29th int conf on very large data bases, pp 321–332

  10. Iyer S, Rowstron AIT, Druschel P (2002) Squirrel: a decentralized peer-to-peer web cache. In: Proc ACM SIGACT-SIGOPS symp on principles of dist comp, pp 213–222

  11. Jagadish HV, Ooi BC, Vu QH (2005) BATON: a balanced tree structure for peer-to-peer networks. In: Proc 31st int conf on very large data bases

  12. Jelasity M, Voulgaris S, Guerraoui R, Kermarrec A-M, van Steen M (2007) Gossip-based peer sampling. ACM Trans Comput Syst 25(3)

  13. Kothari A, Agrawal D, Gupta A, Suri S (2003) Range addressable network: a P2P cache architecture for data ranges. In: Peer-to-peer computing, pp 14–22

  14. Ramabhadran S, Ratnasamy S, Hellerstein JM, Shenker S (2004) Brief announcement: prefix hash tree. In: Proc ACM SIGACT-SIGOPS symp on principles of dist comp

  15. Ramakrishnan R, Gehrke J (2002) Database management systems. McGraw-Hill, New York

  16. Ramasubramanian V, Sirer EG (2004) The design and implementation of a next generation name service for the internet. In: Proc ACM SIGCOMM, pp 331–342

  17. Sahin OD, Gupta A, Agrawal D, Abbadi AE (2004) A peer-to-peer framework for caching range queries. In: Proc 20th int conf on data engineering, pp 165–176

  18. Scott D (1992) Multivariate density estimation: theory, practice and visualization. Wiley, New York

  19. Stallings W (2004) Operating systems: internals and design principles. Prentice Hall, Englewood Cliffs

  20. Terpstra WW, Kangasharju J, Leng C, Buchmann AP (2007) Bubblestorm: resilient, probabilistic, and exhaustive peer-to-peer search. In: Proc ACM SIGCOMM, pp 49–60

  21. Valduriez P, Pacitti E (2004) Data management in large-scale P2P systems. In: High performance computing for computational science—VECPAR 2004, 6th international conference, pp 104–118

  22. Wang C, Xiao L, Liu Y, Zheng P (2006) DiCAS: an efficient distributed caching mechanism for P2P systems. IEEE Trans Parallel Distrib Syst 17(10):1097–1109

  23. Yang B, Garcia-Molina H (2002) Improving search in peer-to-peer networks. In: Proc 22nd int conf on distributed computing systems, pp 5–12

  24. Zhang R, Hu YC (2005) Assisted peer-to-peer search with partial indexing. In: Proc 24th annual joint conference of the IEEE computer and communications societies, pp 1514–1525

Author information

Corresponding author

Correspondence to Qiang Wang.

About this article

Cite this article

Wang, Q., Daudjee, K. & Özsu, M.T. Popularity-aware prefetch in P2P range caching. Peer-to-Peer Netw. Appl. 3, 145–160 (2010). https://doi.org/10.1007/s12083-009-0054-6

  • DOI: https://doi.org/10.1007/s12083-009-0054-6
