
Insight into redundancy schemes in DHTs

The Journal of Supercomputing

Abstract

To provide high data availability in peer-to-peer (P2P) DHTs, proper data redundancy schemes are required. This paper compares two popular schemes: replication and erasure coding. Unlike previous comparisons, we take user download behavior into account. Furthermore, we propose a hybrid redundancy scheme, which shares user-downloaded files for subsequent accesses and uses erasure coding to adjust file availability. Experiments comparing the three schemes show that, when average node availability is higher than 47%, replication saves more bandwidth than erasure coding, although it requires more storage space; moreover, our hybrid scheme saves more maintenance bandwidth with an acceptable redundancy factor.
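
The storage side of the replication versus erasure coding trade-off can be illustrated with the textbook availability formulas. The sketch below is not the paper's model: it assumes nodes are independently online with probability p, ignores download behavior and maintenance bandwidth entirely, and all function names are hypothetical. A file with r full replicas is available with probability 1 - (1 - p)^r; a file split into m fragments and encoded into n is available when at least m of the n fragments are reachable.

```python
from math import comb


def replication_availability(p: float, replicas: int) -> float:
    """Probability that at least one of `replicas` full copies is online."""
    return 1.0 - (1.0 - p) ** replicas


def erasure_availability(p: float, m: int, n: int) -> float:
    """Probability that at least m of n coded fragments are online."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(m, n + 1))


def min_replicas(p: float, target: float) -> int:
    """Smallest replica count whose availability reaches `target`."""
    if not (0.0 < p <= 1.0 and 0.0 < target < 1.0):
        raise ValueError("need 0 < p <= 1 and 0 < target < 1")
    r = 1
    while replication_availability(p, r) < target:
        r += 1
    return r


def min_fragments(p: float, m: int, target: float) -> int:
    """Smallest total fragment count n (n >= m) whose availability reaches `target`."""
    if not (0.0 < p <= 1.0 and 0.0 < target < 1.0 and m >= 1):
        raise ValueError("need 0 < p <= 1, 0 < target < 1 and m >= 1")
    n = m
    while erasure_availability(p, m, n) < target:
        n += 1
    return n


if __name__ == "__main__":
    p, target, m = 0.5, 0.99, 8  # illustrative values, not taken from the paper
    r = min_replicas(p, target)
    n = min_fragments(p, m, target)
    # Redundancy factor: stored bytes divided by original file size.
    print(f"replication: {r} replicas, redundancy factor {r}")
    print(f"erasure coding: {n} fragments of {m}, redundancy factor {n / m:.2f}")
```

Under these assumptions the redundancy factor is r for replication and n/m for erasure coding, which covers only the storage half of the comparison; the bandwidth results in the abstract depend on repair traffic and user downloads, which this sketch does not model.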




Correspondence to Guihai Chen.


About this article

Cite this article

Chen, G., Qiu, T. & Wu, F. Insight into redundancy schemes in DHTs. J Supercomput 43, 183–198 (2008). https://doi.org/10.1007/s11227-007-0126-4


  • DOI: https://doi.org/10.1007/s11227-007-0126-4
