Abstract
We present a new distributed data replication algorithm tailored especially for large-scale read/write data objects such as files. The algorithm guarantees atomic data consistency, while incurring low latency costs. The key idea of the algorithm is to maintain copies of the data objects separately from information about the locations of up-to-date copies. Because it performs most of its work using only the location information, our algorithm needs to access only a few copies of the actual data; specifically, only one copy during a read and only f+1 copies during a write, where f is an assumed upper bound on the number of copies that can fail. These bounds are optimal. The algorithm works in an asynchronous message-passing environment. It does not use additional mechanisms such as group communication or distributed locking. It is suitable for implementation in WANs as well as LANs. We also present two lower bounds on the costs of data replication. The first lower bound is on the number of low-level writes required during a read operation on the data. The second bound is on the minimum space complexity of a class of efficient replication algorithms. These lower bounds suggest that some of the techniques used in our algorithm are necessary. They are also of independent interest.
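The separation of data copies from location information can be illustrated with a small single-process sketch. This is not the authors' protocol. It is a toy model under loudly stated assumptions: the names `Replica`, `Directory`, and `Store` are hypothetical, version tags are plain integers, there is a single non-replicated directory, and concurrency, message passing, quorums, and garbage collection of old versions are all omitted. It only shows how a write can touch f+1 data copies and a read can touch one, with everything else done through the small location record.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Holds full copies of the (large) data object, keyed by version tag."""
    alive: bool = True
    store: dict = field(default_factory=dict)
    data_reads: int = 0   # times the actual data was read from this replica
    data_writes: int = 0  # times the actual data was written to this replica


@dataclass(frozen=True)
class Directory:
    """Holds only small metadata: the latest tag and where its copies live."""
    tag: int = 0
    locations: tuple = ()


class Store:
    """Toy model (hypothetical): one directory, several data replicas."""

    def __init__(self, n_replicas: int, f: int):
        if n_replicas < f + 1:
            raise ValueError("need at least f + 1 replicas")
        self.f = f
        self.replicas = [Replica() for _ in range(n_replicas)]
        self.directory = Directory()

    def write(self, value):
        # Pick f + 1 live replicas to receive the actual data.
        live = [i for i, r in enumerate(self.replicas) if r.alive]
        if len(live) < self.f + 1:
            raise RuntimeError("fewer than f + 1 live replicas")
        targets = live[: self.f + 1]
        new_tag = self.directory.tag + 1
        for i in targets:
            self.replicas[i].store[new_tag] = value
            self.replicas[i].data_writes += 1
        # Publish the small location record only after the data is in place.
        self.directory = Directory(new_tag, tuple(targets))
        return new_tag

    def read(self):
        # Consult metadata first, then fetch the data from a single replica.
        d = self.directory
        for i in d.locations:
            r = self.replicas[i]
            if r.alive:
                r.data_reads += 1
                return r.store[d.tag]
        raise RuntimeError("no live up-to-date copy found")
```

In this sketch, writing to f+1 replicas means that at least one up-to-date copy survives any f replica failures, so a later read can still be served from a single copy named in the directory.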
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fan, R., Lynch, N. (2003). Efficient Replication of Large Data Objects. In: Fich, F.E. (ed.) Distributed Computing. DISC 2003. Lecture Notes in Computer Science, vol 2848. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39989-6_6
DOI: https://doi.org/10.1007/978-3-540-39989-6_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20184-7
Online ISBN: 978-3-540-39989-6
eBook Packages: Springer Book Archive