The anatomy study of consensus agreement in MANETs

https://doi.org/10.1016/j.compeleceng.2009.09.001

Abstract

Reliability is an important research topic in distributed systems. To achieve fault tolerance, the healthy processors in a distributed system need to reach a common agreement before performing certain special tasks, even when faults are present. This is known as the Byzantine Agreement (BA) problem. Traditionally, the BA problem has been solved in well-defined networks. However, MANETs (Mobile Ad-hoc Networks) are increasing in popularity, and their network topology is dynamic in nature. In this paper, the BA problem is re-examined in MANETs. Our protocol uses the minimum number of message exchanges to reach an agreement within the distributed system while tolerating the maximum number of faulty processors in MANETs.

Introduction

A distributed computing system consists of a set of processors that communicate with each other by exchanging messages. For such a system to be reliable, it needs a mechanism that allows a set of processors to agree on a common value [7], [12].

Examples of applications that need such a mechanism include a commitment problem in a distributed database system [4], [12], a clock synchronization problem [5], and a landing task controlled by a flight path finding system [1]. This unanimity problem was first studied by Lamport et al. [7] and is called Byzantine Agreement (BA) [1], [4], [5], [7], [9]. It requires a number of independent processors to reach an agreement in cases where some of those processors might be faulty. The goal of BA is for the healthy processors to arrive at a common value.

In the BA problem, the symptoms of processor failure can be classified into two categories: dormant faults and arbitrary faults [5], [7]. Dormant processor faults include broken processors (crash faults) and missed messages (omission faults), and are easy to detect and resolve. Arbitrary faults, however, are unpredictable and damaging, and are thus a more serious problem than dormant faults.

In addition, a closely related sub-problem, the consensus problem, has been extensively studied [5], [12], [16]. In the consensus problem, a system of k processors starts with k initial values and must subsequently reach a common value even if certain processors fail [5], [9], [12], [18]. The consensus problem is therefore similar to the BA problem in that it amounts to executing k copies of the BA process. Fischer [5] showed that agreement is impossible in an asynchronous environment with even one processor failure. Under the assumption of synchronous behavior, Lamport showed that a network needs at least 3fp + 1 processors to tolerate fp failures, where fp is the number of faulty processors in the network [7]. For clarity, this study uses the assumptions of the BA problem to explain the concept of the consensus problem.

Traditionally, the BA problem was defined by Lamport et al. [7], as follows:

  • There are k (k > 3) processors, of which at most one-third of the total number of processors could fail without breaking down a workable network;

  • The processors communicate with each other through message exchange in a fully connected network (or well-defined network);

  • The message’s sender is always identifiable by the receiver;

  • An arbitrary faulty processor is chosen as a source, and its initial value is broadcast to the other processors and to itself to execute the protocol.

In general, a healthy source processor sends the same value to all processors, and arbitrary faulty processors cannot affect the root values sent from the healthy source processor. However, a source processor with arbitrary faults may transmit different values to different processors. This is the worst case of the BA problem and is worth discussing. For that reason, the fourth assumption above takes the source processor to be an arbitrary faulty processor.
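The worst case described above can be sketched for the smallest setting, k = 4 processors with at most one arbitrary fault. The code below follows the well-known single-relay scheme in the style of Lamport et al. [7]; it is not the protocol proposed in this paper. The function names, the default value 0, and the `lie` behavior of a faulty relay are assumptions made for illustration only.

```python
from collections import Counter

DEFAULT = 0  # assumed fallback value when no strict majority exists


def majority(values):
    """Return the strict-majority value, or DEFAULT if there is none."""
    value, count = Counter(values).most_common(1)[0]
    return value if count > len(values) / 2 else DEFAULT


def relay_and_decide(source_values, faulty=frozenset(),
                     lie=lambda sender, receiver: receiver % 2):
    """Round 1: the source sends source_values[i] to receiver i.
    Round 2: every receiver relays what it got to the other receivers.
    Faulty receivers relay lie(sender, receiver) instead (hypothetical behavior).
    Returns the decision of each healthy receiver."""
    ids = sorted(source_values)
    decisions = {}
    for i in ids:
        if i in faulty:
            continue  # only healthy receivers need to decide
        received = [source_values[i]]
        for j in ids:
            if j != i:
                relayed = lie(j, i) if j in faulty else source_values[j]
                received.append(relayed)
        decisions[i] = majority(received)
    return decisions


# Faulty source: different values to different receivers, all receivers healthy.
faulty_source = relay_and_decide({1: 0, 2: 1, 3: 1})
# Healthy source sends 1 to everyone; receiver 3 is faulty and lies when relaying.
healthy_source = relay_and_decide({1: 1, 2: 1, 3: 1}, faulty={3})
```

In the first call every healthy receiver ends up with the same set of three values, so they all pick the same majority even though the source was inconsistent. In the second call the single lying relay is outvoted by the source's value and the other healthy relay.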

In addition, various protocols for the BA problem have been developed to meet the following requirements [1], [4], [5], [7], [9], [16], [18], [19]:

  • (BA1) Agreement: All healthy processors agree on a common value v.

  • (BA2) Validity: If the initial value of the source is vs and the source is healthy, then all healthy processors shall agree on the value vs; i.e., v = vs.
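As a minimal sketch, the two requirements can be written as predicates over the values decided by the healthy processors. The function names and the dictionary representation (processor id mapped to decided value) are assumptions for illustration and do not come from the paper.

```python
def satisfies_agreement(decisions):
    """(BA1) All healthy processors decide on one common value.
    `decisions` maps each healthy processor id to its decided value."""
    return len(set(decisions.values())) <= 1


def satisfies_validity(decisions, source_value, source_is_healthy):
    """(BA2) If the source is healthy, every healthy processor decides
    on the source's initial value. A faulty source imposes no constraint."""
    if not source_is_healthy:
        return True
    return all(v == source_value for v in decisions.values())


# Healthy processors 1..3 all decided 1, and the healthy source proposed 1.
ok = {1: 1, 2: 1, 3: 1}
# Processor 3 decided differently, so agreement is violated.
split = {1: 1, 2: 1, 3: 0}
```

Note that agreement alone does not imply validity: healthy processors could all agree on a value that a healthy source never sent, which is why both checks are needed.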

Under these assumptions and requirements, several protocols [1], [4], [5], [7], [9], [16], [18], [19] have been proposed for solving such problems. The protocol of Lamport et al. [7] requires fp + 1 rounds of message exchange, where fp ≤ ⌊(k − 1)/3⌋ and a round denotes one interval of message exchange, to reach a common agreement in a synchronous fully connected network with k processors. Further, Fischer [5] points out that fp + 1 is the minimum number of rounds needed to collect sufficient messages to achieve BA.
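The arithmetic behind these bounds is easy to check. The sketch below assumes the fault bound reads fp ≤ ⌊(k − 1)/3⌋, which matches the 3fp + 1 processor requirement stated earlier; the function names are illustrative only.

```python
def max_faulty(k):
    """Largest fp such that k >= 3 * fp + 1, i.e. floor((k - 1) / 3)."""
    return (k - 1) // 3


def min_processors(fp):
    """Smallest network size that tolerates fp arbitrary faults."""
    return 3 * fp + 1


def rounds_needed(fp):
    """Rounds of message exchange required to tolerate fp faults."""
    return fp + 1


# A fully connected network of 7 processors tolerates 2 arbitrary faults
# and needs 3 rounds of message exchange in the worst case.
k = 7
fp = max_faulty(k)
rounds = rounds_needed(fp)
```

Adding processors does not always raise the tolerance: networks of 7, 8, and 9 processors all tolerate two faults, and a third fault is only tolerated once k reaches 10.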

As network technology continues to advance rapidly, traditional network topologies are being extended with wireless topologies such as Mobile Ad-hoc NETworks (MANETs). MANETs, which consist of wireless processors that communicate with each other in the absence of a fixed infrastructure, differ from traditional network structures [1], [7].

MANETs can be deployed flexibly and quickly in automated battlefields, disaster relief, and rescue. In general, each mobile processor can communicate directly with any other processor within its own wireless transmission range. To reach a destination processor outside that range, a mobile processor forwards the message via another mobile processor. The topology of a MANET, as shown in Fig. 1, can be modeled as a unit-disk graph [2], [6] according to the strength of transmission power.
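The unit-disk model and multi-hop forwarding can be illustrated with a short sketch. It assumes that every processor has the same transmission radius and a known 2-D position; the function names, coordinates, and the breadth-first search used to find a relay path are illustrative choices, not part of the paper.

```python
from collections import deque
from math import dist


def unit_disk_graph(positions, radius):
    """Link two processors whenever they are within `radius` of each other."""
    nodes = list(positions)
    graph = {n: set() for n in nodes}
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            if dist(positions[a], positions[b]) <= radius:
                graph[a].add(b)
                graph[b].add(a)
    return graph


def relay_path(graph, src, dst):
    """Breadth-first search for a fewest-hop forwarding path, or None."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbor in sorted(graph[node]):
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return None


# Hypothetical layout: a, b, c in a line; d is far away from everyone.
positions = {"a": (0, 0), "b": (1, 0), "c": (2, 0), "d": (10, 10)}
graph = unit_disk_graph(positions, radius=1.5)
path = relay_path(graph, "a", "c")
unreachable = relay_path(graph, "a", "d")
```

Here `a` cannot reach `c` directly, so the message is forwarded through `b`, while `d` lies outside every transmission range and cannot be reached at all.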

In general, a MANET is built randomly when mobile processors want to communicate with each other within a specific range. A MANET therefore faces several challenges due to its dynamic nature, such as low battery power, limited bandwidth, and restricted mobility. Traditional routing protocols that address these aspects include hierarchical routing [10], link state [8], and distance vector [2], [3]. Unfortunately, processors may immigrate into or emigrate away from the network at any time, which destroys previously established routing paths. Therefore, many studies use the concept of a virtual backbone for routing in MANETs [2], [8], [13], [14], [15], [17].

Regardless of how the processors reach an agreement in MANETs, the presence of faulty processors needs to be addressed. The symptom of a faulty processor is usually unrestrained and is commonly called an arbitrary fault [1], [7]. With such a fault, a processor can withhold messages or collude with other faulty processors to send irregular messages to others. However, some arbitrary-resilient BA protocols [4], [7] treat all faults as arbitrary faults, even though some of them may be dormant faults, such as crash or fail-stop faults. This treatment overlooks the difference between the two kinds of faulty behavior: when a dormant fault exhibits its faulty behavior, it can be detected and ignored by all healthy processors.

Thus, arbitrary-resilient BA protocols cannot tolerate the maximum number of faults if dormant faults exist. These observations motivate the study of the BA problem under a dual fault model, in which arbitrary and dormant faults exist simultaneously [9], [5], [12], [16]. The goal of such a study is to maximize the number of allowable faulty processors in the dual fault model. Therefore, in this paper, the BA problem is re-examined by investigating the dual fault model and exploring how healthy processors reach agreement in MANETs.
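The benefit of separating the two fault types can be seen in a small voting sketch. It assumes that a dormant fault shows up as a detectably missing message, represented here by `None`, while an arbitrary fault delivers a wrong but well-formed value. The function name and default value are hypothetical, and this is not the voting rule of the proposed protocol.

```python
from collections import Counter

ABSENT = None  # assumed marker for a message that was detected as missing


def vote_ignoring_dormant(received, default=0):
    """Majority vote over the values that actually arrived.
    Missing messages (dormant faults) are dropped before counting,
    so they cannot be mistaken for votes against the majority."""
    present = [v for v in received if v is not ABSENT]
    if not present:
        return default
    value, count = Counter(present).most_common(1)[0]
    return value if count > len(present) / 2 else default


# Five senders: two healthy (value 1), two dormant (nothing received),
# and one arbitrary faulty sender reporting 0.
received = [1, 1, ABSENT, ABSENT, 0]
decision = vote_ignoring_dormant(received)
```

Had the two missing messages been counted as worst-case arbitrary values of 0, the tally would be three votes for 0 against two for 1, and the healthy senders would be outvoted. Dropping detected absences keeps the healthy value in the majority.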

The rest of this paper is organized as follows: Section 2 illustrates the basic assumption of MANETs. Section 3 shows the basic concept and approaches in this study. The detail of the protocol GCAP we propose is shown in Section 4. Section 5 illustrates examples of GCAP in detail. Subsequently, correctness and complexity are illustrated in Section 6. Finally, the conclusion is presented in Section 7.

The basic assumption of MANETs

MANETs have enjoyed a remarkable rise in popularity. Because of their dynamic nature, MANETs operate without any fixed infrastructure. Therefore, previous research [2], [13], [14], [15], [17] has proposed the concept of a virtual backbone to organize MANETs. To build a virtual backbone, the processors in a MANET are classified as gateway or non-gateway processors. In this gateway/non-gateway model, the gateway processors organize the entire network and forward messages for non-gateway processors. Namely, the

Basic concepts and approaches

Initially, we assume the virtual backbone of MANETs is constructed by a CDS construction algorithm and the elected gateway processors of the virtual backbone have higher capability than non-gateway processors in MANETs. Subsequently, the proposed protocol Group Consensus Agreement Protocol (GCAP) is introduced to solve the BA/consensus problem in MANETs. There are four parts of the GCAP: group agreement process, consensus agreement process, broadcasting agreement process, and maintenance

Protocol GCAP

In this section, the protocol GCAP is introduced to solve the BA/consensus problem in MANETs. GCAP can tolerate ((G-1-Gpd)/3) + [(Σ (ni-1-npd)/2)  (Gpa + Gpd)] faulty processors and requires (σ) rounds of message exchange to reach an agreement. Namely, all healthy processors can reach an agreement under a MANET environment where ((G-1-Gpd)/3) + [(Σ(ni-1-npd)/2)  (Gpa + Gpd)] faulty processors exist.

However, processors in MANETs have another serious challenge, low battery power. To save the

Example of execution

In this section, two examples illustrate the GCAP protocol. The first shows how GCAP helps the healthy processors reach agreement when a new processor immigrates into a new region. The second shows a processor moving away from its region.

In Fig. 8a through Fig. 8l, there are 15 processors in the original MANETs. In this paper, the CDS protocol [13], [14],

The Correctness and complexity of GCAP

This section proves that GCAP meets the agreement and validity requirements of the BA problem. Lemmas and theorems are used to establish the correctness and complexity of GCAP.

Conclusion

In this study, the consensus/BA problem in MANETs is revisited with respect to a dual failure mode in fallible processors. Processors in MANETs may immigrate into or emigrate away from their region because of the dynamic nature of the network. Therefore, previous protocols [5], [7], [12], [18] cannot adapt to this feature of MANETs. Furthermore, a virtual backbone is necessary to organize a MANET in the absence of a fixed infrastructure. Therefore, our GCAP protocol uses the gateway/non-gateway model of

Mao-Lun Chiang received his M.S. from the Department of Information Management at Chaoyang University of Technology, Taiwan, and his Ph.D. in the Department of Computer Science at National Chung-Hsing University, Taiwan.

Currently, he is an Assistant Professor with the Department of Information and Communication Engineering, Chaoyang University of Technology, Taiwan. His current research interests include distributed data processing, fault tolerant computing, and mobile computing.

Shu-Ching Wang received her B.S. in Computer Science from Feng-Chia University, her M.S. in Electrical Engineering from National Chen-Kung University, and her Ph.D. in Information Management from National Chiao-Tung University, Taiwan.

Currently, she is a Professor at the Graduate Institute of Informatics, Chaoyang University of Technology, Taichung County, Taiwan. Her current research interests include distributed data processing, parallel processing, mobile computing, algorithm analysis and design, and fault-tolerant computing.

Lin-Yu Tseng received his B.S. in Mathematics from the National Taiwan University, his M.S. in Computer Science from National Chiao-Tung University, and his Ph.D. in Computer Science from National Tsing-Hua University, Taiwan.

Currently, he is a Professor with the Department of Computer Science at the National Chung-Hsing University, Taichung County, Taiwan. His current research interests include design and analysis of algorithms, pattern recognition, and genetic algorithms.
