Abstract
The Hidra Membership Monitor (HMM) is a distributed service that maintains the current set of active nodes in a cluster of machines. Its protocol detects multiple machine joins or failures in a single reconfiguration, using a small number of messages (the cost is linear in the number of nodes). Such membership services are needed to detect cluster changes as soon as possible and then initiate the reconfiguration of the cluster state, which includes support for replicated objects.
The HMM also manages and synchronises the reconfiguration steps needed by the kernel and the Hidra components of each node, ensuring that all nodes take the same steps at the same time. As a result, our system does not need an atomic multicast protocol to deliver the messages exchanged in these reconfiguration steps. Together, these services provide the basis for developing reliable intracluster transport protocols and for reducing the reconfiguration time of replicated objects and services.
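As an illustration only, the following minimal sketch shows how several joins and failures could be folded into one new membership view and how the reconfiguration steps could then be driven in lock-step with a message count that grows linearly with the number of nodes. It is not the HMM protocol itself, whose details are not given here: the coordinator-style loop, the names `Node`, `run_step` and `reconfigure`, and the one-request-one-acknowledgement message model are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical cluster node that records the steps it has executed."""
    node_id: int
    step_log: list = field(default_factory=list)

    def run_step(self, step, view):
        # Execute one reconfiguration step for the given view and acknowledge it.
        self.step_log.append((step, view))
        return True


def reconfigure(current_view, joined, failed, nodes, steps):
    """Apply all pending joins and failures in a single reconfiguration.

    Returns the new view and the number of messages exchanged, assuming
    (for this sketch) one request and one acknowledgement per member per step.
    """
    # Every pending change is merged into one new view.
    new_view = frozenset((set(current_view) | set(joined)) - set(failed))
    messages = 0
    for step in steps:
        acks = 0
        for node_id in sorted(new_view):
            messages += 1  # step request sent to the member
            if nodes[node_id].run_step(step, new_view):
                messages += 1  # acknowledgement returned
                acks += 1
        # The next step starts only after every member has finished this one.
        if acks != len(new_view):
            raise RuntimeError(f"step {step!r} was not acknowledged by all members")
    return new_view, messages


nodes = {i: Node(i) for i in range(1, 6)}
view, msgs = reconfigure(frozenset({1, 2, 3}), {4, 5}, {2}, nodes, ["kernel", "hidra"])
print(sorted(view), msgs)
```

In this run two joins and one failure are absorbed by a single reconfiguration that yields the view {1, 3, 4, 5}. Under the assumed message model each step costs two messages per member, so the two steps cost 16 messages in total, and every member executes the steps in the same order.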
This work was partially supported by the CICYT (Comisión Interministerial de Ciencia y Tecnología) under project TIC99-0280-C02.
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Muñoz-Escoí, F.D., Gomis, Ó., Galdámez, P., Bernabéu-Aubán, J.M. (2001). HMM: A Cluster Membership Service. In: Sakellariou, R., Gurd, J., Freeman, L., Keane, J. (eds) Euro-Par 2001 Parallel Processing. Euro-Par 2001. Lecture Notes in Computer Science, vol 2150. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44681-8_109
DOI: https://doi.org/10.1007/3-540-44681-8_109
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42495-6
Online ISBN: 978-3-540-44681-1
eBook Packages: Springer Book Archive