Abstract
To provide high data availability in peer-to-peer (P2P) distributed hash tables (DHTs), a suitable data redundancy scheme is required. This paper compares two popular schemes: replication and erasure coding. Unlike previous comparisons, we take user download behavior into account. Furthermore, we propose a hybrid redundancy scheme, which shares the files that users have downloaded to serve subsequent accesses and uses erasure coding to adjust file availability. Comparison experiments on the three schemes show that, when average node availability is higher than 47%, replication saves more bandwidth than erasure coding, although it requires more storage space; moreover, our hybrid scheme saves more maintenance bandwidth with an acceptable redundancy factor.
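The storage side of this trade-off can be illustrated with the standard textbook availability formulas. The sketch below is not the paper's simulation model: it assumes nodes are online independently with probability p, ignores maintenance bandwidth and download behavior, and all function names are hypothetical. A file stored as k full replicas is available if at least one replica is online; a file erasure-coded into n fragments, any m of which reconstruct it, is available if at least m fragments are online.

```python
from math import comb


def replication_availability(p: float, replicas: int) -> float:
    """Probability that at least one of `replicas` full copies is online."""
    return 1.0 - (1.0 - p) ** replicas


def erasure_availability(p: float, m: int, n: int) -> float:
    """Probability that at least m of n fragments are online (binomial tail)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(m, n + 1))


def _check(p: float, target: float) -> None:
    # Assumption: independent node availability p, target strictly below 1.
    if not (0.0 < p <= 1.0) or not (0.0 < target < 1.0):
        raise ValueError("need 0 < p <= 1 and 0 < target < 1")


def min_replicas(p: float, target: float) -> int:
    """Smallest replica count reaching the target availability.

    The replica count is also the redundancy factor for replication.
    """
    _check(p, target)
    k = 1
    while replication_availability(p, k) < target:
        k += 1
    return k


def min_fragments(p: float, m: int, target: float) -> int:
    """Smallest n such that an m-of-n code reaches the target availability.

    Each fragment is 1/m of the file, so the redundancy factor is n / m.
    """
    _check(p, target)
    n = m
    while erasure_availability(p, m, n) < target:
        n += 1
    return n


if __name__ == "__main__":
    # Illustrative parameters only; they are not taken from the paper.
    p, target, m = 0.5, 0.99, 8
    k = min_replicas(p, target)
    n = min_fragments(p, m, target)
    print(f"replication: {k} replicas, redundancy factor {k}")
    print(f"erasure coding: {n} fragments, redundancy factor {n / m:.2f}")
```

With m = 1 the erasure-coding formula reduces to replication, which is a quick sanity check on the two functions. This sketch covers storage redundancy only; the bandwidth comparison reported in the abstract depends on node churn and repair traffic, which are not modeled here.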
Cite this article
Chen, G., Qiu, T. & Wu, F. Insight into redundancy schemes in DHTs. J Supercomput 43, 183–198 (2008). https://doi.org/10.1007/s11227-007-0126-4
DOI: https://doi.org/10.1007/s11227-007-0126-4