
Efficient Replication of Large Data Objects

  • Conference paper
Distributed Computing (DISC 2003)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2848)


Abstract

We present a new distributed data replication algorithm tailored especially for large-scale read/write data objects such as files. The algorithm guarantees atomic data consistency, while incurring low latency costs. The key idea of the algorithm is to maintain copies of the data objects separately from information about the locations of up-to-date copies. Because it performs most of its work using only the location information, our algorithm needs to access only a few copies of the actual data; specifically, only one copy during a read and only f+1 copies during a write, where f is an assumed upper bound on the number of copies that can fail. These bounds are optimal. The algorithm works in an asynchronous message-passing environment. It does not use additional mechanisms such as group communication or distributed locking. It is suitable for implementation in WANs as well as LANs. We also present two lower bounds on the costs of data replication. The first lower bound is on the number of low-level writes required during a read operation on the data. The second bound is on the minimum space complexity of a class of efficient replication algorithms. These lower bounds suggest that some of the techniques used in our algorithm are necessary. They are also of independent interest.
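The key idea in the abstract, keeping bulky data copies apart from small records of where the up-to-date copies live, can be illustrated with a toy, single-process sketch. This is not the paper's algorithm: the names `Tag`, `Replica`, `Directory`, and `ToyStore` are hypothetical, the majority quorum of directories is an assumption, and concurrency, message passing, and failure handling are all left out. It only shows how a write can touch f+1 data copies and a read just one, with the remaining work done on metadata.

```python
from dataclasses import dataclass, field


@dataclass(order=True, frozen=True)
class Tag:
    """Version tag, ordered by (sequence number, writer id). Hypothetical."""
    seq: int = 0
    writer: int = 0


@dataclass
class Replica:
    """Holds the large data values, keyed by tag, and counts data accesses."""
    store: dict = field(default_factory=dict)
    reads: int = 0
    writes: int = 0

    def put(self, tag, value):
        self.store[tag] = value
        self.writes += 1

    def get(self, tag):
        self.reads += 1
        return self.store[tag]


@dataclass
class Directory:
    """Holds only metadata: the newest known tag and which replicas store it."""
    tag: Tag = Tag()
    locations: frozenset = frozenset()

    def update(self, tag, locations):
        # Keep the newest tag only; older updates are ignored.
        if tag > self.tag:
            self.tag, self.locations = tag, frozenset(locations)


class ToyStore:
    def __init__(self, n_replicas, n_directories, f):
        self.f = f  # assumed upper bound on the number of failed data copies
        self.replicas = [Replica() for _ in range(n_replicas)]
        self.directories = [Directory() for _ in range(n_directories)]

    def _majority(self):
        # Assumption: any majority of directories forms a quorum. A real
        # asynchronous system would use whichever majority replies first.
        return self.directories[: len(self.directories) // 2 + 1]

    def _latest(self):
        # Metadata-only query: newest (tag, locations) seen by the quorum.
        return max(((d.tag, d.locations) for d in self._majority()),
                   key=lambda pair: pair[0])

    def write(self, writer, value):
        tag, _ = self._latest()
        new_tag = Tag(tag.seq + 1, writer)
        targets = range(self.f + 1)  # f + 1 copies, so one survives f failures
        for i in targets:
            self.replicas[i].put(new_tag, value)
        for d in self._majority():   # publish where the new copies live
            d.update(new_tag, targets)
        return new_tag

    def read(self):
        tag, locations = self._latest()
        if not locations:
            return None              # nothing has been written yet
        return self.replicas[min(locations)].get(tag)  # one data copy only


store = ToyStore(n_replicas=5, n_directories=3, f=2)
store.write(writer=1, value=b"x" * 1000)
assert store.read() == b"x" * 1000
print(sum(r.writes for r in store.replicas))  # data copies written: f + 1 = 3
print(sum(r.reads for r in store.replicas))   # data copies read: 1
```

In this sketch the directories never see the payload, so their traffic stays small however large the object is. A faithful implementation would additionally have to handle concurrent operations and crashed nodes in order to provide the atomicity the abstract claims.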




Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Fan, R., Lynch, N. (2003). Efficient Replication of Large Data Objects. In: Fich, F.E. (ed.) Distributed Computing. DISC 2003. Lecture Notes in Computer Science, vol 2848. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39989-6_6


  • DOI: https://doi.org/10.1007/978-3-540-39989-6_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-20184-7

  • Online ISBN: 978-3-540-39989-6

  • eBook Packages: Springer Book Archive
