1 Introduction

Wireless sensor and actuator networks (WSANs) can provide low-cost continuous monitoring. However, building WSAN applications is particularly challenging. Because of the complexity of concurrent and distributed programming, networking, real-time requirements, and power constraints, it can be hard to find a configuration that satisfies these constraints while optimizing resource use. A common approach to address this problem is to perform an informal analysis based on conservative worst-case assumptions and empirical measurements. This can lead to schedules that do not utilize resources efficiently. For example, a workload consisting of two periodic tasks would be guaranteed to be safe only if the sum of the two worst-case execution times (WCET) were less than the shorter period, whereas it is possible in practice to have many safe schedules violating this restriction.
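The gap between the conservative rule and actual behavior can be illustrated with a small simulation. The following Python sketch is only an illustration under stated assumptions: a single CPU, non-preemptive first-come-first-served execution, deadlines equal to the next release of the same task, and hypothetical task parameters. The function names are ours and do not come from any tool discussed in this paper.

```python
from functools import reduce
from math import gcd


def conservative_test(wcets, periods):
    """Informal worst-case rule: all WCETs must fit in the shortest period."""
    return sum(wcets) < min(periods)


def simulate_fifo(tasks):
    """Run non-preemptive FIFO on one CPU; tasks are (offset, period, wcet).

    Each job's deadline is the next release of its task. Returns True if
    no job in the simulated window misses its deadline.
    """
    hyperperiod = reduce(lambda a, b: a * b // gcd(a, b),
                         (period for _, period, _ in tasks))
    horizon = max(offset for offset, _, _ in tasks) + 2 * hyperperiod
    jobs = sorted(
        (release, index, wcet, release + period)
        for index, (offset, period, wcet) in enumerate(tasks)
        for release in range(offset, horizon, period)
    )
    cpu_free_at = 0
    for release, _, wcet, deadline in jobs:
        finish = max(cpu_free_at, release) + wcet
        if finish > deadline:
            return False  # this job would miss its deadline
        cpu_free_at = finish
    return True
```

For the hypothetical tasks (0, 10, 4) and (4, 30, 8), the conservative rule rejects the workload because 4 + 8 is not less than 10, yet the simulation finds no missed deadline. Moving the second offset from 4 to 9 makes the job released at time 10 finish at time 21, past its deadline of 20, which shows why the initial offsets have to be taken into account.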

A second approach is trial and error. For example, in [18], an empirical test-and-measure approach based on binary search is used to find configuration parameters: worst-case task runtimes, timeslot length of the communication protocols, etc. Trial and error is a laborious process, which nevertheless fails to provide any safety guarantees for the resulting configuration.

A third possibility is to extend scheduling techniques that have been developed for real-time systems [19] so that they can be used in WSAN environments. Unfortunately, this turns out to be difficult in practice. Many WSAN platforms rely on highly efficient event-driven operating systems such as TinyOS [12]. Unlike a real-time operating system (RTOS), event-driven operating systems generally do not provide real-time scheduling guarantees, priority-based scheduling, or resource reservation functionality. Without such support, many schedulability analysis techniques cannot be effectively employed. For example, in the absence of task preemption and priority-based scheduling, unnecessarily conservative assumptions must be used to guarantee correctness in the general case.

We propose an actor-based modeling approach that allows WSAN application programmers to assess the performance and functional behavior of their code throughout the design and implementation phases. The developed models are analyzed using model checking to determine the parameter values resulting in the highest system efficiency. Our use of model checking is similar to the work of Jongerden et al., who use it to maximize the lifetime of batteries in embedded systems [14].

We represent a WSAN application as a collection of actors [2]. The model can be incrementally extended and refined during the application design process, adding new interactions and scheduling constraints. We use Timed Rebeca [25] as the modeling language and its model checking tool Afra [1, 15] for analysis of WSAN applications. Timed Rebeca is a high-level actor-based language capable of representing functionality and timing behavior at an abstract level. Afra supports modeling and analysis of both Rebeca and Timed Rebeca models; we use the timed model checking engine. Afra uses the concept of Floating Time Transition System (FTTS) [15] for the analysis of Timed Rebeca models. FTTS significantly reduces the state space that needs to be searched. The idea is to focus on event-based properties while relaxing the constraint requiring the generation of states where all the actors are synchronized. As the examples in [16] suggest, this approach can reduce the size of the state space by 50 to 90 %. Using FTTS fits the computation model of WSAN applications and the properties that we are interested in.

We present a case study involving real-time continuous data acquisition for structural health monitoring and control (SHMC) of civil infrastructure [18]. This system has been implemented on the Imote2 wireless sensor platform, and used in long-term deployments on several highway and railroad bridges [29]. SHMC application development has proven to be particularly challenging: it has the complexity of a large-scale distributed system with real-time requirements, while having the resource limitations of low-power embedded WSAN platforms. Ensuring safe execution requires modeling the interactions between the CPU, sensor and radio within each node, as well as interactions among the nodes. Moreover, the application tasks are not isolated from other aspects of the system: they execute alongside tasks belonging to other applications, middleware services, and operating system components. In the application we consider, all periodic tasks (sample acquisition, data processing, and radio packet transmission) are required to complete before the next iteration starts. Our results show that a guaranteed-safe application configuration can be found using the Afra model checking tool. Moreover, this configuration improves resource utilization compared to the previous informal schedulability analysis used in [18], supporting a higher sampling rate or a larger number of nodes without violating schedulability constraints.

Contributions. This paper makes the following contributions:

  • We show how a WSAN application may be modeled naturally as a system of actors. The abstraction and modularity of the actor model make the approach scalable.

  • We present a real-world case study that illustrates the effectiveness of our approach for a real WSAN application.

  • We show how model checking toolsets can be used for an efficient schedulability analysis of WSAN applications. Our case study shows that we can compare the effects of different communication protocols on system performance.

2 Preliminaries

A WSAN application is a distributed system with multiple sensor nodes, each comprising independent concurrent entities (CPU, sensor, and radio system), bridged together via a wireless communication device which uses a transmission control protocol. Interactions between these components, both within a node and across nodes, are concurrent and asynchronous. Moreover, WSAN applications are sensitive to timing, with soft deadlines at each step of the process needed to ensure correct and efficient operation.

Due to performance requirements and the latencies of operations on sensor nodes, sensing, data processing, and communication processes must be coordinated. In particular, once a sample is acquired from a sensor, its corresponding radio transmission activities must be performed. Concurrently, data processing tasks, such as compensating sensor data for the effects of temperature changes, must be executed. Moreover, the timing of radio transmissions from different nodes must be coordinated using a communication protocol.

2.1 The Actor Model of WSAN Applications

The Actor model is a well-established paradigm for modeling distributed and asynchronous component-based systems. This model was originally introduced by Hewitt as an agent-based language in which goal-directed agents performed logical reasoning [11]. Subsequently, the actor model developed as a model of concurrent computation for open distributed systems where actors are the concurrently executing entities [2]. One way to think of actors is as a service-oriented framework: each actor provides services that may be requested via messages from other actors. A message is buffered until the provider is ready to execute the message. As a result of processing a message, an actor may send messages to other actors, and to itself. Extensions of the actor model have been used for real-time systems, in particular: RT-synchronizer [24], real-time Creol [6], and Timed Rebeca [25].
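The buffering and message-driven execution described above can be sketched in a few lines of Python. This is a minimal, untimed illustration and not the semantics of any particular actor language; the class names Actor, Producer, and Collector, the on_ handler naming convention, and the round-robin run loop are all assumptions made for the example.

```python
from collections import deque


class Actor:
    """An actor with a mailbox; each message is handled by an on_<name> method."""

    def __init__(self, name):
        self.name = name
        self.mailbox = deque()

    def send(self, target, message, payload=None):
        # Sending never blocks: the message is buffered at the target.
        target.mailbox.append((message, payload))

    def step(self):
        """Process one buffered message; return False if the mailbox is empty."""
        if not self.mailbox:
            return False
        message, payload = self.mailbox.popleft()
        getattr(self, "on_" + message)(payload)
        return True


class Collector(Actor):
    def __init__(self, name):
        super().__init__(name)
        self.received = []

    def on_data(self, payload):
        self.received.append(payload)


class Producer(Actor):
    def __init__(self, name, collector, limit):
        super().__init__(name)
        self.collector = collector
        self.limit = limit
        self.produced = 0

    def on_tick(self, _payload):
        if self.produced < self.limit:
            self.produced += 1
            self.send(self.collector, "data", self.produced)
            self.send(self, "tick")  # an actor may also message itself


def run(actors):
    """Round-robin over the actors until every mailbox is empty."""
    progress = True
    while progress:
        progress = False
        for actor in actors:
            if actor.step():
                progress = True
```

Starting a Producer with limit 3 by sending it a single tick message leaves the Collector with the payloads 1, 2, and 3, in that order.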

The characteristics of real-time variants of the actor model make them useful for modeling WSAN applications: many concurrent processes and interdependent real-time deadlines. Observe that common tasks such as sample acquisition, sample processing, and radio transmission are periodic and have well-known or easily measurable periods. This makes analysis of worst-case execution times feasible. However, because of the event-triggered nature of applications, initial offsets between the tasks are variable.

Fig. 1. Modeling the behavior of a WSAN application in its real-world installation in the actor model

We represent components of each WSAN node capable of independent action as an actor. Specifically, as shown in Fig. 1, a sensor node is modeled using four actors: Sensor (for data acquisition), CPU (the processor), RCD (a radio communication device), and Misc (carrying out miscellaneous tasks unrelated to sensing or communication). Sensor collects data and sends it to CPU for further data processing. Meanwhile, CPU may respond to messages from Misc by carrying out other computations. The processed data is sent to RCD, which forwards it to a data collector node actor. We model the communication medium as an actor (Ether) and the receiver node also by the actor RCD. Using the actor Ether facilitates modularity: specifically, implementation of the Media Access Control (MAC) level details of communication protocols is localized, making it easy to replace component sub-models for modeling different communication protocols without significantly impacting the remainder of the model. During the application design phase, different components, services, and protocols may be considered. For example, TDMA [8] as a MAC-level communication protocol may be replaced by B-MAC [23] with minimal changes.

Although schedulability analysis of WSAN applications can be challenging in the absence of a real-time scheduler, we reduce the problem of checking for deadline violations to the problem of reachability from a relatively small set of possible initial configurations. Model checking is the natural approach to this class of problems, and it is the approach we explore in this paper.

2.2 Timed Rebeca and the Model Checking Toolset

A Timed Rebeca (TR) model consists of reactive classes and a main program which instantiates actors (called rebecs in TR). As usual, actors have an encapsulated state, a local time, and their own thread of control. Each actor contains a set of state variables, a set of methods, and a set of actors it knows. An actor may only send messages to actors that it knows. Message passing is implemented by method calls: calling a method of an actor (the target) results in sending a message to the target. Each actor has a message bag in which arriving messages may be buffered; the maximum capacity of the bag is defined by the modeler.

Timing behavior in TR is represented using three timing primitives: delay, after, and deadline. A delay term models the passing of time for an actor. The primitives after and deadline can be used in conjunction with a message send: after n indicates it takes n time units for the message to be delivered to its receiver; deadline n indicates that if the message is not taken in n time units, it should be purged from the receiver’s bag.
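To make the roles of after and deadline concrete, the following Python sketch mimics a message bag with timed delivery and expiry. It is an illustration only, not Afra's implementation: the class MessageBag and its methods are hypothetical, and we assume that both after and deadline are measured from the time of sending. The delay primitive would correspond simply to advancing the caller's local clock (the now argument) before the next operation.

```python
import heapq
from itertools import count


class MessageBag:
    """Buffers messages tagged with an arrival time and an optional expiry."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker so equal arrival times stay ordered
        self.missed = []     # names of messages purged after their deadline

    def send(self, now, name, after=0, deadline=None):
        arrival = now + after
        expiry = None if deadline is None else now + deadline
        heapq.heappush(self._heap, (arrival, next(self._seq), name, expiry))

    def take(self, now):
        """Return the earliest message that has arrived by `now`, or None.

        Messages whose deadline has already passed are purged and recorded
        in `missed` instead of being returned.
        """
        while self._heap and self._heap[0][0] <= now:
            _arrival, _, name, expiry = heapq.heappop(self._heap)
            if expiry is not None and now > expiry:
                self.missed.append(name)
                continue
            return name
        return None
```

In a schedulability setting, a non-empty missed list plays the role of a deadline violation: a message sat in the bag longer than its deadline allowed.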

Afra 1.0 supports model checking of Rebeca models against LTL and CTL properties. Afra 2.0 supports deadlock detection and schedulability analysis of TR models; we use Afra 2.0 in this work. TR and the Afra toolset have previously been used to model and analyze real-time actor-based models such as routing algorithms and scheduling policies in NoC (Network on Chip) designs [26, 27].

3 Schedulability Analysis of a Stand-Alone Node

We now illustrate our approach using a node-level TR model of a WSAN application to check for possible deadline violations. Specifically, by changing the timing parameters of our model, we find the maximum safe sampling rate in the presence of other (miscellaneous) tasks in the node. Then, we show how the specification of a node-level model can be naturally extended to network-wide specifications.

Following the mapping in Fig. 1, the TR models of the reactive classes are shown in Fig. 2 through Fig. 4.

Fig. 2. Reactive class of the Sensor

Fig. 3. Reactive class of the CPU

As shown in Fig. 2, the maximum capacity of the message bag of Sensor is set to 10, the only actor Sensor knows about is of type CPU (line 4), and Sensor does not have any state variables (line 5). The behavior of Sensor is to periodically acquire data and send it to CPU. Sensor is implemented using a message server sensorLoop (lines 13–17) which sends the acquired data to CPU (line 15). The sent data must be serviced before the start time of the next period, which is specified by passing the value of period as the parameter of deadline. Recall that there is a nondeterministic initial offset after which the data acquisition becomes a periodic task. To represent this property, Sensor sends a sendLoop message to itself; the message is nondeterministically delivered after 10, 20, or 30 time units (line 11). After this nondeterministic offset, the sensor’s periodic behavior is initiated (line 13). Note that in line 1, the sampling rate is defined as a constant. A similar approach is used in the implementation of the Misc reactive class.
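The effect of the nondeterministic offset on the set of initial configurations can be sketched as follows. This Python fragment is only an illustration: the helper names are hypothetical, the period value in the usage note is arbitrary, and we assume for the example that Misc chooses among the same three offsets as Sensor.

```python
from itertools import product

# Offsets among which the first self-message is nondeterministically delivered.
OFFSET_CHOICES = (10, 20, 30)


def release_schedule(offset, period, jobs):
    """Return (send_time, deadline) pairs for the first `jobs` sensor events.

    Each event must be serviced before the start of the next period.
    """
    return [(offset + k * period, offset + (k + 1) * period)
            for k in range(jobs)]


def initial_configurations(actors):
    """Enumerate every combination of initial offsets for the given actors."""
    return [dict(zip(actors, offsets))
            for offsets in product(OFFSET_CHOICES, repeat=len(actors))]
```

For instance, release_schedule(20, 100, 3) yields the pairs (20, 120), (120, 220), and (220, 320), and two actors with three possible offsets each give nine initial configurations, all of which a model checker would have to cover.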

The behavior of CPU as the target of Sensor and Misc events is more complicated (Fig. 3). Upon receiving a miscEvent, CPU waits for miscTaskDelay units of time; this represents computation cycles consumed by miscellaneous tasks. Similarly, after receiving the sensorEvent message from Sensor, CPU waits for sensorTaskDelay units of time; this represents cycles required for intra-node data processing. Data must be packed in a packet of a specified bufferSize. The number of collected samples plus one is computed (line 16) and, when the threshold is reached (line 17), CPU asks senderDevice to send the collected data in one packet (line 18). As this is a node-level model, communication between nodes is omitted. The behavior of RCD is limited to waiting for some amount of time (line 6); this represents the sending time of a packet.
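The buffering logic of CPU can be paraphrased in Python as below. This is a sketch of the described behavior rather than a transcription of Fig. 3: the class and attribute names are ours, the local clock stands in for the delay statements, and resetting the sample counter after a packet is handed over is an assumption.

```python
class Cpu:
    """Node-level CPU: consumes time per event and batches samples into packets."""

    def __init__(self, buffer_size, sensor_task_delay, misc_task_delay):
        self.buffer_size = buffer_size
        self.sensor_task_delay = sensor_task_delay
        self.misc_task_delay = misc_task_delay
        self.now = 0          # local time of this actor
        self.collected = 0    # samples waiting to be packed
        self.packets = []     # (hand-over time, samples in packet)

    def on_misc_event(self):
        # Cycles consumed by miscellaneous tasks.
        self.now += self.misc_task_delay

    def on_sensor_event(self):
        # Cycles required for intra-node processing of one sample.
        self.now += self.sensor_task_delay
        self.collected += 1
        if self.collected == self.buffer_size:
            # Threshold reached: ask the radio to send one packet.
            self.packets.append((self.now, self.collected))
            self.collected = 0  # assumed reset, not stated in the text
```

With a buffer size of 3, seven sensor events produce two packets and leave one sample pending; any interleaved miscellaneous event simply pushes the hand-over times later.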

Fig. 4. The node-level implementation of RCD

Note that computation times (delays) depend on the low-level aspects of the system and are application-independent; they can be measured before the application design. For schedulability analysis, we set the deadlines for messages in such a way that any scheduling violations are caught by the model checker.

4 Schedulability Analysis of Multi-node Model with a Distributed Communication Protocol

Transitioning from a stand-alone node model to a network model requires that the wireless communication medium Ether be specified in order to model the communication protocol it supports. Then both the node-level and multi-node models must be considered. Recall that nodes in the multi-node model periodically send their data to an aggregator node (Fig. 1). The sending process is controlled by a wireless network communication protocol. The reactive class of Ether (Fig. 5) has three message servers: these are responsible for sending the status of the medium, broadcasting data, and resetting the condition of the medium after a successful transmission. Broadcasting data takes place by sending data to an RCD which is addressed by the receiverDevice variable. So, we can easily examine the status of the Ether using the value of receiverDevice (i.e., the medium is free if receiverDevice is null, line 13). Accordingly, after sending data, the values of receiverDevice and senderDevice must be set to null to show that the transmission is completed (lines 28 and 29). Data broadcasting is the main behavior of Ether (lines 15 to 26). Before the start of broadcasting, the Ether status is checked (line 16) and a data-collision error is raised in case of two simultaneous broadcasts (line 24). With a successful data broadcast, Ether sends an acknowledgment to itself (line 19) and the sender (line 20), and informs the receiver of the number of packets sent to it (line 21). In addition to the functional requirements of Ether, there may be non-functional requirements. For example, the Imote2 radio offers a theoretical maximum transfer speed of 250 kbps. When considering only the useful data payload (goodput), this is reduced to about 125 kbps.
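The three responsibilities of Ether (status, broadcast, reset) can be sketched in Python as follows. This is an untimed illustration and not the TR model of Fig. 5: the method names and the CollisionError exception are hypothetical, and we assume that the medium counts as free exactly when no receiver is recorded, consistent with resetting both fields to null once a transmission completes.

```python
class CollisionError(Exception):
    """Raised when a broadcast starts while another one is in progress."""


class Ether:
    def __init__(self):
        self.sender_device = None
        self.receiver_device = None
        self.delivered = []  # (receiver, number of packets) notifications

    def is_free(self):
        # Status query: free when no transmission is in progress.
        return self.receiver_device is None

    def broadcast(self, sender, receiver, packets):
        if not self.is_free():
            raise CollisionError(f"{sender} collided with {self.sender_device}")
        self.sender_device = sender
        self.receiver_device = receiver
        # Inform the receiver of the number of packets sent to it.
        self.delivered.append((receiver, packets))

    def transmission_completed(self):
        # Reset the medium after a successful transmission.
        self.sender_device = None
        self.receiver_device = None
```

Keeping this logic in one class mirrors the modularity argument made earlier: a different MAC protocol changes how and when devices call broadcast, not how the medium itself detects collisions.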

We now extend RCD to support communication protocols. Figure 6 shows the model of the TDMA protocol implementation. The TDMA protocol defines a cycle, over which each node in the network has one or more chances to transmit a packet or a series of packets. If a node has data available to transmit during its allotted time slot, it may be sent immediately. Otherwise, packet sending is delayed until its next transmission slot. The periodic behavior of the TDMA slot is handled by the handleTDMASlot message server, which sets and unsets inActivePeriod to show whether the node is in its allotted time slot. Upon entering its slot, a device checks for pending data to send (line 31) and schedules a handleTDMASlot message to leave the slot (line 30). On the other hand, when CPU sends a packet (message) to an RCD, the message is added to the other pending packets which are waiting for the next allotted time slot. tdmaSlotSize is the predefined size of the TDMA slots, and currentMessageWaitingTime is the waiting time of this message in the bag of its receiver.
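The slot rule described above (send immediately inside the allotted slot, otherwise wait for the next one) reduces to simple modular arithmetic. The Python sketch below assumes one equally sized slot per node per cycle, slots assigned in node order starting at time 0, and a packet that fits in the remainder of the slot; the function name and parameters are hypothetical.

```python
def tdma_send_time(now, node, nodes, slot_size):
    """Return the earliest time at which `node` may start sending.

    The cycle has `nodes` slots of length `slot_size`; node i owns the
    interval [i * slot_size, (i + 1) * slot_size) within every cycle.
    """
    cycle = nodes * slot_size
    slot_start = node * slot_size
    phase = now % cycle
    if slot_start <= phase < slot_start + slot_size:
        return now  # already inside the allotted slot
    # Otherwise wait until the slot starts, possibly in the next cycle.
    return now + (slot_start - phase) % cycle
```

With three nodes and a slot size of 10, node 1 owns the interval from 10 to 20 in each cycle of length 30: a packet ready at time 12 goes out at once, one ready at time 5 waits until 10, and one ready at time 25 waits until 40.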

For the sake of simplicity, the details of RCD are omitted in Fig. 6. The complete source code (which implements the B-MAC protocol) is available on the Rebeca web page [1].

Fig. 5. Reactive class of the Ether

Once a complete model of the distributed application has been created, the Afra model checking tool can verify whether the schedulability properties hold in all reachable states of the system. If there are any deadline violations, a counterexample will be produced, indicating the path—sequence of states from an initial configuration—that results in the violation. This information can be helpful with changing the system parameters, such as increasing the TDMA time slot length, to prevent such situations.
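The idea of a counterexample path can be illustrated with a plain breadth-first reachability search. The Python sketch below is a generic textbook search and not Afra's algorithm (which works on the floating time transition system); the function name and the toy transition system in the usage note are assumptions made for the example.

```python
from collections import deque


def find_counterexample(initial_states, successors, violates):
    """Breadth-first search for a reachable violating state.

    Returns a shortest path (list of states) from an initial state to a
    violating state, or None if no violating state is reachable.
    """
    parent = {state: None for state in initial_states}
    queue = deque(initial_states)
    while queue:
        state = queue.popleft()
        if violates(state):
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in successors(state):
            if nxt not in parent:  # visit each state only once
                parent[nxt] = state
                queue.append(nxt)
    return None
```

As a toy usage, take states of the form (slot phase, backlog) in a three-slot TDMA cycle, where one packet may arrive in every step and one packet is sent only in phase 0. If a backlog above 2 counts as a violation, the search returns the trace (0, 0), (1, 0), (2, 1), (0, 2), (1, 2), (2, 3), which suggests either a longer slot or a lower arrival rate.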

5 Experimental Results and a Real-World Case Study

We examined the applicability of our approach using a WSAN model intended for use in structural health monitoring and control (SHMC) applications. Wireless sensors deployed on civil structures for SHMC collect high-fidelity data such as acceleration and strain. Structural health monitoring (SHM) involves identifying and detecting potential damage to the structure by measuring changes in strain and vibration response. SHM can also be employed with structural control, where it is fed into algorithms that control centralized or distributed control elements such as active and semi-active dampers. The control algorithms attempt to minimize vibration and maintain stability in response to excitations from rare events such as earthquakes, or more mundane sources such as wind and traffic. The system we examine has been implemented on the Imote2 wireless sensor platform [18], which features a powerful embedded processor, sufficient memory size, and a high-fidelity sensor suite required to collect data of sufficient quality for SHMC purposes. These nodes run the TinyOS operating system, supported by middleware services of the Illinois SHM Services Toolsuite [13].

Fig. 6. Reactive class of the RCD

This flexible data acquisition system can be configured to support real-time collection of high-frequency, multi-channel sensor data from up to 30 wireless smart sensors at frequencies up to 250 Hz. As it is designed for high-throughput sensing tasks that necessitate larger network sizes with relatively high sampling rates, it falls into the class of data-intensive sensor network applications, where efficient resource utilization is critical, since it directly determines the achievable scalability (number of nodes) and fidelity (sampling frequency) of the data acquisition process. Configured on the basis of network size, associated sampling rate, and desired data delivery reliability, it allows for near-real-time acquisition of 108 data channels on up to 30 nodes (where each node may provide multiple sensor channels, such as 3-axis acceleration, temperature, or strain) with minimal data loss. In practice, these limits are determined primarily by the available bandwidth of the IEEE 802.15.4 wireless network and the sample acquisition latency of the sensors. The accuracy of estimating safe limits for sampling and data transmission delays directly impacts the system’s efficiency.

To illustrate the applicability of this work, we considered applications where achieving the highest possible sampling rate that does not result in any missed deadline is desired. This is a very common requirement in WSAN applications, in the SHMC domain in particular. We began by setting the value of OnePacketTT to 7 ms (i.e., the maximum transmission time for this type of application) and fixing the values of sensorTaskDelay, miscPeriod, and miscTaskDelay to some predefined values. In addition to the sampling rate, the number of nodes in the network and the packet size remain variable. By assuming different values for the number of nodes and the packet size, different maximum sampling rates are achieved, shown as a 3D surface in Fig. 7. As shown in the figure, higher sampling rates are possible when the buffer size is set to a larger number (there is more space for data in each packet). Similarly, increasing the number of nodes decreases the sampling rate: among the three parameters of Fig. 7, the cases with the maximum buffer size (i.e., 9 data points) and the minimum number of nodes (i.e., 1 node) result in the highest maximum sampling rates. Decreasing the buffer size or increasing the number of nodes non-linearly reduces the maximum possible sampling rate.
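The search for the highest safe sampling rate can be pictured as a sweep over candidate sampling periods, with one model checking run per candidate. The Python sketch below is a schematic of that outer loop only: is_schedulable is a hypothetical stand-in for a model checker invocation, and the candidate periods are arbitrary. The sweep starts from the shortest period, so it does not rely on schedulability being monotone in the period.

```python
def max_safe_sampling_rate(candidate_periods_ms, is_schedulable):
    """Return (rate in Hz, period in ms) for the shortest safe period.

    `is_schedulable(period_ms)` stands in for one model checking run that
    reports whether any deadline can be missed with that sampling period.
    Returns None if no candidate is safe.
    """
    for period in sorted(candidate_periods_ms):
        if is_schedulable(period):
            return 1000.0 / period, period
    return None
```

With a toy oracle that accepts every period of at least 8 ms and candidates from 1 to 20 ms, the sweep returns a rate of 125 Hz at a period of 8 ms. Repeating the sweep for each combination of number of nodes and packet size gives one point of a surface like the one in Fig. 7.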

A server with Intel Xeon E5645 @ 2.40 GHz CPUs and 50 GB of RAM, running Red Hat 4.4.6-4 as the operating system, was used as the model-checking host. The size of the state space varied from fewer than 500 to more than 140 K states, resulting in model checking times ranging from 0 to 6 s. Analyzing the state spaces reveals some relations between their sizes and the configurations of the models. For example, the largest state spaces correspond to configurations where sensorTaskDelay, bufferSize, and numberOfNodes are set to large values.

Fig. 7. The maximum sampling rate in case of using TDMA protocol and setting the value of sensorTaskDelay to 2 ms

We also wanted to compare the effect of the communication protocol and the value of sensorTaskDelay on the supported maximum sampling rate, considering 648 different configurations. The maximum sampling rates found for each configuration are depicted in Fig. 8; they show that increasing the value of sensorTaskDelay, which represents intra-node activities, decreases the sampling rate dramatically. They also show that using B-MAC results in higher sampling rates in comparison to TDMA.

Fig. 8. Maximum possible sampling rate in case of different communication protocols, number of nodes, sensor internal task delays, and radio packet size

The parameters used in our analysis of configurations were determined through a real-world installation of an SHMC application. Our results show that the current manually-optimized installation can be tuned to an even more optimized one: by changing the configuration, the performance of the system can be safely improved by another 7 %.

6 Related Work

Three different approaches have been used for analysis of WSANs: system simulation, analytical approach, and formal verification.

System Simulation. Simulation of WSAN applications is useful for their early design exploration. Simulation toolsets for WSANs have enabled modeling of networks [17], power consumption [28], and deployment environment [31]. Simulators can adequately estimate performance of systems and sometimes detect conditions which lead to deadline violations. But even extensive simulation does not guarantee that deadline misses will never occur in the future [5]. For WSAN applications with hard real-time requirements this is not satisfactory. Moreover, none of the available simulators is suitable for the analysis of WSAN application software.

Analytical Approach. A number of algorithms and heuristics have been suggested for schedulability analysis of real-time systems with periodic tasks and sporadic tasks with constraints, e.g. [20]. Although these classic techniques are efficient in analyzing schedulability of real-time systems with periodic tasks and sporadic tasks, their inability to model random tasks makes them inappropriate for WSAN applications.

Formal Verification. Real-time model checking is an attractive approach for schedulability analysis with guarantees [5]. Model checking tools systematically check whether a model satisfies a given property [4]. The strength of model checking is not only in providing a rigorous correctness proof, but also in the ability to generate counter-examples as diagnostic feedback in case a property is not satisfied. This information can be helpful to find flaws in the system. Norström et al. suggest an extension of timed automata to support schedulability analysis of real-time systems with random tasks [21]. Fersman et al. studied an extension of timed automata, called task automata, whose main idea is to associate each location of a timed automaton with tasks [10].

TIMES [3] is a toolset implemented on the basis of the approach of Fersman et al. [9] for the analysis of task automata, using UPPAAL as the back-end model checker. TIMES assumes that tasks are executed on a single processor. This assumption is the main obstacle to using TIMES for schedulability analysis of WSAN applications, which are real-time distributed applications. De Boer et al. in [7] presented a framework for schedulability analysis of real-time concurrent objects. This approach supports both multi-processor systems and random task definition, which are required for schedulability analysis of WSAN applications. But asynchronous communication among concurrent elements of a WSAN application results in the generation of complex behavioral interfaces, which lead to a state space explosion even for small examples.

Real-Time Maude is used in [22] for performance estimation and model checking of WSAN algorithms. The approach supports modeling of many details such as communication range and energy use. The approach requires some knowledge of rewriting logic. Our tool may be easier to use by engineers unfamiliar with rewriting logic: our language extends a straightforward C-like syntax with actor concurrency constructs and primitives for sensing and radio communication. This requires no formal methods experience from the WSAN application programmer, as the language and structure of the model closely mirror those of the real application.

7 Conclusion

We have shown an application of real-time model checking to analyzing the schedulability and resource utilization of WSAN applications. WSAN applications are very sensitive to their configurations: the effects of even minor modifications to configurations must be analyzed. With little additional effort required on behalf of the application developer, our approach provides a much more accurate view of a WSAN application’s behavior and its interaction with the operating system and distributed middleware services than can be obtained by the sort of informal analysis or trial-and-error methods commonly in use today.

Our realistic—but admittedly limited—experimental results support the idea that the use of formal tools may result in more robust WSAN applications. This would greatly reduce development time as many potential problems with scheduling and resource utilization may be identified early.

An important direction for future research is the addition of probabilistic behavior analysis support to the tool. In many non-critical applications, infrequent scheduling violations may be considered a reasonable trade-off for increased efficiency in the more common cases. Development of a probabilistic extension is currently underway.