Elsevier

Computer Communications

Volume 23, Issue 7, 13 March 2000, Pages 653-666

Group leader election under link-state routing

https://doi.org/10.1016/S0140-3664(99)00224-8

Abstract

In this work, we place a long-established distributed computing problem in a new context. Specifically, the group leader election problem is studied “inside the network,” meaning that participants in the election process are network switches/routers, rather than hosts. In this context, an election protocol can take advantage of certain internal operations of the network, such as the underlying routing protocol, to meet stringent fault-tolerance criteria while minimizing the network traffic overhead.

A robust solution to the problem, called the Network Leader Election (NLE) protocol, is proposed. The protocol is designed for use in networks based on link-state routing (LSR). The protocol is robust in that it achieves leadership consensus in the presence of adverse events, such as leader failures and network partitioning. The correctness of the protocol is proved formally. A simulation study reveals that the NLE protocol incurs low overhead in handling leader failures and in group creation. In addition, it is shown how important network functions, including hierarchical routing, address resolution, and multicast core management, can benefit from the NLE protocol.

Introduction

The problem of leader election concerns the selection of a distinguished member from a set of computing systems that are interconnected by a network. This problem has been extensively studied in the context of distributed computing systems, for example, in coordinating access to shared resources [1] and in implementing fault-tolerant objects [2]. Generally speaking, solutions to the problem are distributed “host-level” algorithms that make use of various network services, such as reliable delivery of messages, in order to monitor the working status of the established leader or cast ballots for a new leader. Well-known contributions in this area include the Bully algorithm [3] and the Ring algorithm [4]; more recent developments are described in Ref. [5].

In this paper, we address the leader election problem as it occurs “inside” the network. The participants in the election process are assumed to be switches (or, interchangeably in this work, routers), rather than hosts or application processes. Solutions to this problem are intended to support underlying network functions, as opposed to being directly invoked by user applications. Whereas a host-level election protocol typically considers the underlying network as a “black box,” we show that a network-level election protocol can access and take advantage of the internal operation of the network, in particular, the underlying routing protocol.

Several network functions can make use of an efficient leader election protocol. In this paper, we focus on three examples. First, in Asynchronous Transfer Mode (ATM) networks and other hierarchical networks, switches in a low-level subnetwork (called a routing domain) select a switch to represent the domain in the next routing level [6]; a solution to this domain leader election problem supports routing operations within the network. Second, many address-mapping services, such as the mapping between group addresses and member addresses [7] and the mapping between network addresses and link-layer addresses [8], use a central server approach; a solution to the server assignment problem selects a leader to undertake the server responsibilities. Third, some IP multicast protocols, such as CBT [9], identify a network node, called a core node, as the traffic transit center for each multicast group; a solution to this multicast core management problem supports multicast services provided by the network. A common requirement of solutions to the above problems is fault tolerance: since network functions/services are expected to survive not only single-point failures, but also component failures that partition the network, the solutions to these problems must also survive these adverse scenarios.

In this work, we explore the problem of leader election in networks based on link-state routing (LSR) [10], [11], an increasingly popular type of network routing. An LSR protocol makes complete knowledge of the network available to all switches. For this purpose, the local status of each switch, including the bandwidth available at incident links, buffer capacity, delays across links, and so forth, is made known to the rest of the network via the broadcasting, or flooding, of link-state advertisements (LSAs). Based on received advertisements, each switch locally maintains a complete topology image of the network, which it uses to make routing decisions. One of the major advantages of LSR is its fault tolerance. Since every link is monitored by its incident switches, and every switch is monitored by neighboring switches, malfunctioning components and congested areas are made known to all functioning switches promptly. Even the earliest LSR protocols were able to survive disastrous situations, such as network partitioning [11]. The Open Shortest Path First (OSPF) protocol [10], introduced by the Internet community, is one of the best-known LSR unicast protocols. LSR has also been adopted as the routing standard for ATM networks [6].
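The flooding behaviour described above can be illustrated with a minimal Python sketch. It is not taken from the paper or from any real LSR implementation: the function name `flood_lsas`, the adjacency-dictionary input, and the reduction of an LSA to "origin plus neighbour set" are all assumptions made for illustration. Real LSAs also carry bandwidth, delay, sequence numbers, and so forth.

```python
from collections import deque


def flood_lsas(adjacency):
    """Flood one LSA per switch and return each switch's topology image.

    `adjacency` maps a switch ID to the set of neighbours it currently
    reaches over working links (hypothetical input format).
    """
    # images[s] is the topology image held by switch s:
    # a dict from LSA origin to that origin's advertised neighbour set.
    images = {switch: {} for switch in adjacency}
    for origin, neighbours in adjacency.items():
        advertised = frozenset(neighbours)
        seen = {origin}
        queue = deque([origin])
        # Breadth-first relay: every switch that receives the LSA stores it
        # and forwards it once to its own neighbours.
        while queue:
            switch = queue.popleft()
            images[switch][origin] = advertised
            for nxt in adjacency[switch]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return images
```

In this sketch, switches in the same connected segment end up with identical images, while a switch cut off by a partition only learns about its own segment. That shared state is what a network-level election protocol can build on.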

Our proposed solution, called the Network Leader Election (NLE) protocol, takes advantage of state information provided by an underlying LSR protocol. The NLE protocol also extends LSR to include group-leader binding LSAs, which are used by group members to advertise their choice of leader. Upon receiving such an LSA, other switches in the network either accept this selection, or choose and advertise an alternative leader. The objective of the NLE protocol is to achieve network-wide consensus on leader bindings. Specifically, we will show that the NLE protocol achieves the following properties:

  1. Leadership consensus property. Given a group G and a network that has been partitioned into a set of segments S_1, S_2, …, S_k, k ≥ 1, there will be consensus on the leader binding for G within each segment S_i, and that leader will be an operational switch within the segment.

  2. Mutual consensus property. By requiring group members to report to the established leader, the NLE protocol ensures that, within each network segment S_i, the leader maintains a member list for the group that includes those, and only those, group members in S_i.

Simply put, the NLE protocol can handle leader failures and work properly under adverse scenarios such as network partitioning. When the network is not partitioned, the above consensus properties hold throughout the network. Results of a simulation study show that these features can be achieved with minimal protocol overhead.
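To make the two consensus properties concrete, the following Python sketch computes the end state they describe for a small partitioned network. It does not reproduce the NLE message exchange. The helper names (`segment_of`, `leader_bindings`, `member_lists`) are hypothetical, and the "lowest switch ID in the segment" rule is an assumed stand-in for the leader selection policy.

```python
def segment_of(links, up, start):
    """Return the operational switches reachable from `start`."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for a, b in links:
            if node in (a, b):
                other = b if node == a else a
                # Failed switches neither relay traffic nor join a segment.
                if other in up and other not in seen:
                    seen.add(other)
                    stack.append(other)
    return seen


def leader_bindings(links, up):
    """Map each operational switch to the leader of its segment.

    Assumed policy: the lowest switch ID in the segment is the leader.
    """
    return {switch: min(segment_of(links, up, switch)) for switch in up}


def member_lists(links, up, members):
    """Map each leader to the group members located in its own segment."""
    bindings = leader_bindings(links, up)
    lists = {}
    for member in members:
        if member in up:
            lists.setdefault(bindings[member], set()).add(member)
    return lists
```

For a chain of switches 1-2-3-4-5 in which switch 3 has failed, the sketch yields two segments, {1, 2} and {4, 5}. Every switch in a segment holds the same binding, and each leader's member list contains only the group members in its own segment.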

The remainder of this paper is organized as follows. Section 2 reviews the operation of LSR and a previous network-level leader election method, the ATM domain leader election protocol, which satisfies the two consensus properties discussed above. The design of the NLE protocol is presented in Section 3, and the correctness of the protocol, which is modeled as a consensus problem under LSR, is formally proved in Section 4. The performance of the NLE protocol and that of the ATM domain leader election protocol are compared via simulation in Section 5. Results of this study reveal that the NLE protocol incurs only a small fraction of the overhead of the ATM election protocol. In Section 6, we discuss the application of the NLE protocol to the address resolution problem and to the multicast core management problem; included are simulation results regarding the performance of NLE in creating multicast groups. Finally, conclusions are given in Section 7.

Section snippets

Link-state routing

A routing protocol disseminates network information that switches use to find paths for relaying communication traffic. In the case of LSR, a complete image of the network is made available to every switch. In most LSR protocols, every switch periodically floods an LSA that describes its local state, including the ID of the switch, links incident to the switch, bandwidth of individual links, and so forth. Since the network images must be updated in response to network dynamics, every switch

Overview

The operation of the NLE protocol is summarized below. Since some decision-making processes of the NLE protocol, such as the leader selection policy, are application dependent, we discuss the protocol operation in the context of the domain leader election problem. Adaptation of the protocol to other problems is discussed in Section 6.

  1. For every group g, each switch x in the network maintains a leader binding, Binding_x(g), whose value is a triple (Leader_x(g), Source_x(g), Stamp_x(g)), where Leader_x(

Proof of correctness

We prove in this section that the NLE protocol achieves consensus on group-leader bindings throughout the network. However, we must be careful when defining what can be proved and what cannot be proved. For example, if every newly suggested leader immediately crashes, and this process continues indefinitely, then it is impossible for any leader-management algorithm to maintain stable and consistent leader bindings for the group. We conclude that a more reasonable goal is to study the behavior

Performance evaluation

In this section, we investigate the performance of the NLE protocol in handling leader failures. Specifically, the NLE protocol is compared against the ATM domain leader election protocol [6], which we described in Section 2. In our simulations, networks comprising up to 400 switches were used; such network sizes conform with the reported sizes of LSR-based routing areas in the Internet [13]. For each network size, 40 graphs were generated randomly, and two simulation sessions were conducted on

Other potential uses of the NLE protocol

Thus far, we have discussed the use of the NLE protocol for the domain leader election problem. In this section, we briefly discuss the application of the protocol to two other important network services, namely, multicast address resolution and multicast core management. In addition, we evaluate the performance of the NLE protocol in multicast group creation.

Conclusion

In this paper, we have considered a long-established distributed computing problem in a new context. Specifically, the leader election problem has been studied in a context where participants of the election process are network switches, and in which applications of the problem are network functions, such as hierarchical routing, address resolution, and multicast communication. The proposed solution, called the Network Leader Election protocol, models the group-leader binding problem as a

Further information

A number of related papers and technical reports of the Communications Research Group at Michigan State University are available at http://www.cse.msu.edu/~mckinley.

Acknowledgements

This work was supported in part by NSF grants CCR-9503838, CDA-9617310, and NCR-9706285.

Yih Huang received the BS degree (1985) and MS degree (1987) in Information Engineering and Computer Science from the Feng Chia University, Taiwan, Republic of China. He received the PhD degree in Computer Science in 1998 from the Michigan State University. He is currently an Assistant Professor in the Department of Computer Science at George Mason University. Dr Huang is a member of IEEE Computer Society and Communications Society.

References (17)

  • S. Singh et al., Electing good leaders, Journal of Parallel and Distributed Computing (1994)
  • D. Menasce et al., A locking protocol for resource coordination in distributed databases, ACM TODS (1980)
  • K. Birman, Implementing fault tolerant distributed objects, IEEE Transactions on Software Engineering (1985)
  • H. Garcia-Molina, Elections in a distributed computing system, IEEE Transactions on Computers (1982)
  • N. Fredrickson et al., Electing a leader in a synchronous ring, Journal of the ACM (1987)
  • ATM Forum, Private network-network interface specification version 1.0, ATM Forum technical specification...
  • G. Armitage, Support for multicast over UNI 3.0/3.1 based ATM networks, Internet RFC 2022, November...
  • M. Laubach, Classical IP and ARP over ATM, Internet RFC 1577, January...
There are more references available in the full text version of this article.


Philip K. McKinley received the BS degree in mathematics and computer science from Iowa State University in 1982, the MS degree in computer science from Purdue University in 1983, and the PhD degree in computer science from the University of Illinois at Urbana-Champaign in 1989. He is an Associate Professor in the Department of Computer Science at Michigan State University, where he has been on the faculty since 1990. He was a member of technical staff at Bell Laboratories in Naperville, Illinois from 1982–1990, on leave of absence 1985–1989. Dr McKinley is a member of the IEEE Computer Society and the ACM.

A concise and preliminary version of this paper was published in the Proceedings of the International Conference on Network Protocols, Atlanta, Georgia, October 1997, p. 95–104.
