Peer sampling with improved accuracy

Peer-to-Peer Networking and Applications

Abstract

Node sampling services provide peers in a peer-to-peer system with a source of randomly chosen addresses of other nodes. Ideally, samples should be independent and uniform. The restrictions of a distributed environment, however, introduce various dependencies between samples. We review gossip-based sampling protocols proposed in previous work and identify sources of inaccuracy. These include replicating the items from which samples are drawn and imprecise management of the process of refreshing items. Based on this analysis, we propose a new protocol, Eddy, which aims to minimize temporal and spatial dependencies between samples. We demonstrate, through extensive simulation experiments, that these changes lead to an improved sampling service. Eddy maintains a balanced distribution of items representing active system nodes, even in the face of realistic levels of message loss and node churn. As a result, it behaves more like a centralized random number generator than previous protocols. We demonstrate this by showing that using Eddy improves the accuracy of a simple algorithm that uses random samples to estimate the size of a peer-to-peer network.
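To illustrate why sample quality matters for size estimation, the following is a minimal sketch of one common sampling-based estimator. It is an assumption made for illustration and is not taken from the article, which does not specify its algorithm on this page. The estimator counts repeated draws: with k uniform, independent samples from n nodes, the expected number of colliding pairs is k(k-1)/(2n), so n can be estimated from the observed collisions. The names `estimate_size`, `demo`, and the biased sampler are hypothetical.

```python
import random
from collections import Counter


def estimate_size(sample_peer, k):
    """Estimate population size from k samples by counting colliding pairs.

    sample_peer is any zero-argument callable returning a node identifier
    (a stand-in for a peer sampling service; hypothetical interface).
    """
    counts = Counter(sample_peer() for _ in range(k))
    # A node drawn c times contributes c*(c-1)/2 colliding pairs.
    colliding_pairs = sum(c * (c - 1) // 2 for c in counts.values())
    if colliding_pairs == 0:
        return None  # too few samples to observe any repeat
    # E[pairs] = k(k-1)/(2n) for uniform sampling, so solve for n.
    return k * (k - 1) / (2 * colliding_pairs)


def demo(n=10_000, k=2_000, seed=1):
    """Compare a uniform sampler with a deliberately skewed one."""
    rng = random.Random(seed)

    def uniform():
        return rng.randrange(n)

    def biased():
        # Assumed skew: half of all draws come from only 10% of the nodes.
        if rng.random() < 0.5:
            return rng.randrange(n // 10)
        return rng.randrange(n)

    return estimate_size(uniform, k), estimate_size(biased, k)


if __name__ == "__main__":
    uniform_estimate, biased_estimate = demo()
    print("uniform sampler estimate:", uniform_estimate)
    print("biased sampler estimate: ", biased_estimate)
```

Under these assumptions, a skewed sampler produces more repeats than a uniform one and therefore underestimates the network size, which is the kind of inaccuracy a better-balanced sampling service would reduce.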

Notes

  1. To simplify the pseudo-code, the assumption is made that p_j has exactly C items in its cache (line 10). If the cache is larger, only C join requests should be sent. If the cache is smaller, p_i can fill its remaining cache slots with its own items.

  2. The cache size is occasionally larger than C due to gossip exchanges being interrupted by requests from other nodes.

Acknowledgements

This research is funded in part by the Engineering and Physical Sciences Research Council (EPSRC) UK grant number EP/F000936/1.

Author information

Corresponding author

Correspondence to Elth Ogston.

About this article

Cite this article

Ogston, E., Jarvis, S.A. Peer sampling with improved accuracy. Peer-to-Peer Netw. Appl. 2, 24–36 (2009). https://doi.org/10.1007/s12083-008-0017-3

  • DOI: https://doi.org/10.1007/s12083-008-0017-3
