Rambo: a robust, reconfigurable atomic memory service for dynamic networks

Distributed Computing

Abstract

In this paper, we present Rambo, an algorithm for emulating a read/write distributed shared memory in a dynamic, rapidly changing environment. Rambo provides a highly reliable, highly available service, even as participants join, leave, and fail. In fact, the entire set of participants may change during an execution, as the initial devices depart and are replaced by a new set of devices. Even so, Rambo ensures that data stored in the distributed shared memory remains available and consistent. Rambo uses two basic techniques to tolerate dynamic changes. Over short intervals of time, replication suffices to provide fault tolerance: while some devices may fail and leave, the data remains available at other replicas. Over longer intervals of time, Rambo copes with changing participants via reconfiguration, which incorporates newly joined devices while excluding devices that have departed or failed. The main novelty of Rambo lies in the combination of an efficient reconfiguration mechanism with a quorum-based replication strategy for read/write shared memory. The Rambo algorithm can tolerate a wide variety of aberrant behavior, including lost and delayed messages, participants with unsynchronized clocks, and, more generally, arbitrary asynchrony. Despite such behavior, Rambo guarantees that its data is stored consistently. We analyze the performance of Rambo during periods when the system is relatively well-behaved: messages are delivered in a timely fashion, reconfiguration is not too frequent, etc. We show that in these circumstances, read and write operations are efficient, completing in at most eight message delays.
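
To make the quorum-based replication idea concrete, here is a minimal Python sketch of a two-phase read/write register over a single, fixed configuration. It is an illustration under strong assumptions, not the Rambo algorithm itself: it uses majority quorums, runs in one process with direct method calls in place of messages, and omits reconfiguration entirely. All names (Replica, QuorumRegister, the alive flag) are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Tuple

Tag = Tuple[int, int]  # (sequence number, writer id), compared lexicographically


@dataclass
class Replica:
    tag: Tag = (0, 0)
    value: Any = None
    alive: bool = True  # hypothetical stand-in for a failed or departed device

    def query(self):
        return self.tag, self.value

    def update(self, tag: Tag, value: Any) -> None:
        # Adopt the incoming value only if it is strictly newer.
        if tag > self.tag:
            self.tag, self.value = tag, value


class QuorumRegister:
    """Simplified majority-quorum register (assumption: one static configuration)."""

    def __init__(self, n: int) -> None:
        self.replicas = [Replica() for _ in range(n)]
        self.majority = n // 2 + 1

    def _quorum(self):
        live = [r for r in self.replicas if r.alive]
        if len(live) < self.majority:
            raise RuntimeError("no quorum available")
        return live[: self.majority]

    def _query_phase(self):
        # Collect (tag, value) pairs from a quorum and keep the newest.
        return max((r.query() for r in self._quorum()), key=lambda tv: tv[0])

    def _propagate_phase(self, tag: Tag, value: Any) -> None:
        # Push the chosen (tag, value) to a quorum.
        for r in self._quorum():
            r.update(tag, value)

    def write(self, writer_id: int, value: Any) -> None:
        (seq, _), _ = self._query_phase()
        self._propagate_phase((seq + 1, writer_id), value)

    def read(self) -> Any:
        tag, value = self._query_phase()
        # Write back before returning so later reads cannot see older data.
        self._propagate_phase(tag, value)
        return value


reg = QuorumRegister(5)
reg.write(writer_id=1, value="a")
reg.replicas[0].alive = False  # one replica fails; the data survives elsewhere
print(reg.read())
```

Because any two majorities intersect, the query phase of a read always reaches at least one replica that holds the latest completed write. Once a majority of replicas is lost, the sketch can no longer form a quorum, which is the situation reconfiguration is meant to address over longer time scales.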

Author information

Corresponding author

Correspondence to Seth Gilbert.

Additional information

Preliminary versions of this work appeared as the following extended abstracts: (a) Nancy A. Lynch, Alexander A. Shvartsman: RAMBO: A Reconfigurable Atomic Memory Service for Dynamic Networks. DISC 2002:173–190, and (b) Seth Gilbert, Nancy A. Lynch, Alexander A. Shvartsman: RAMBO II: Rapidly Reconfigurable Atomic Memory for Dynamic Networks. DSN 2003:259–268. This work was supported in part by the NSF ITR Grant CCR-0121277. The work of the second author was additionally supported by the NSF Grant 9804665, and the work of the third author was additionally supported in part by the NSF Grants 9984778, 9988304, and 0311368.

About this article

Cite this article

Gilbert, S., Lynch, N.A. & Shvartsman, A.A. Rambo: a robust, reconfigurable atomic memory service for dynamic networks. Distrib. Comput. 23, 225–272 (2010). https://doi.org/10.1007/s00446-010-0117-1

  • DOI: https://doi.org/10.1007/s00446-010-0117-1
