EDGES: Efficient data gathering in sensor networks using temporal and spatial correlations

https://doi.org/10.1016/j.jss.2009.08.004

Abstract

In this paper, we present EDGES, an approximate data gathering technique for sensor networks that utilizes temporal and spatial correlations. The goal of EDGES is to efficiently obtain sensor readings within a certain error bound. To do this, EDGES uses the multiple model Kalman filter, which is suited to non-linear data distributions, as its approximation approach. The Kalman filter allows EDGES to predict a future value from a single previous sensor reading, in contrast to other statistical models such as linear regression and the multivariate Gaussian. To extend the lifetime of the network, EDGES also utilizes the spatial correlation by grouping spatially close sensors into a cluster. Since a cluster header acts as both a sensor and a router, it consumes considerable energy sending its own readings and/or the data coming from its children. Thus, we devise a redistribution method that spreads the energy consumption of a cluster header using the spatial correlation. In some previous works, a fixed routing topology is used, or the roles of nodes are decided at the base station and this information is propagated through the whole network. In EDGES, by contrast, a change to a cluster is notified to only a small portion of the network. Our experimental results over randomly generated sensor networks with synthetic and real data sets demonstrate the efficiency of EDGES.

Introduction

A wireless sensor network (WSN) is typically composed of a large number of sensors. Many tiny, inexpensive sensors are scattered over the sensor field to measure quantitative data. These sensors generate a large amount of data that must be communicated by radio transmission to the network root (generally called the base station).

Sensor nodes in a sensor network are severely constrained in terms of computation power, communication bandwidth, and battery power. Among these limitations, power is of utmost importance, because replacing the battery of a sensor is either too expensive or impossible. Thus, energy preservation is a major research issue since it directly impacts the lifetime of the network. Recent research has shown that radio communication is more expensive than computation or sensing. Thus, many techniques in diverse areas such as routing protocols (Heinzelman et al., 2000, Li et al., 2001, Lindsey et al., 2001), event detection (Abadi et al., 2005, Yang et al., 2007), and in-network aggregation (Madden et al., 2002, Trigoni et al., 2006) have been proposed in order to reduce the communication overhead.

In-network aggregation provides a great opportunity for reducing the communication overhead using summary data (e.g., SUM) and/or exemplary data (e.g., MIN and MAX). However, since a single aggregated value represents the overall sensing field, it may be insufficient for analyzing the correlation among subregions of the sensor field. In addition, outliers may cause large errors in a single aggregated value.
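Both limitations can be seen in a small numeric illustration. The readings, the two-subregion split, and the faulty value below are made up for illustration only and do not come from the paper's data sets.

```python
# Hypothetical temperature readings from two subregions of one sensor field.
region_a = [20.0, 21.0, 19.0]
region_b = [30.0, 31.0, 29.0]


def average(values):
    """Plain arithmetic mean, standing in for an in-network AVG aggregate."""
    return sum(values) / len(values)


# One aggregate for the whole field hides the gap between the subregions.
field_avg = average(region_a + region_b)

# A single faulty reading (an outlier) shifts the aggregate substantially.
field_avg_with_outlier = average(region_a + region_b + [500.0])

print(field_avg, average(region_a), average(region_b))
print(field_avg_with_outlier)
```

The field-wide average sits between the two subregion averages and reveals neither, and one outlier moves it far outside the range of every correct reading.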

Since a user may want to collect all sensor readings without any aggregation, in order to obtain a data set that supports further off-line analysis, a common mode of operation for sensor networks is the gathering of data and the detection of critical events in a physical environment (Chu et al., 2006). Furthermore, in a large sensor network, sensor readings may not accurately reflect the current state of the network due to device noise, network failure, and so on. Thus, in many cases, users are interested in the individual readings of sensors rather than in aggregated data. However, periodic reporting of sensor readings drains the energy of sensors since it results in excessive communication.

In a sensor field, the following correlations exist among sensor readings:

  • Temporal correlation: the change pattern of future sensor readings is equal to or similar to that of previous sensor readings in a sensor.

  • Spatial correlation: the change patterns of sensor readings of the adjacent sensors are the same or similar.

Exploiting these properties, techniques such as temporal suppression and spatial suppression have been proposed to reduce the communication overhead. In addition, approximation techniques for sensor networks have been proposed. In-network approximation exploits the fact that a large number of applications can tolerate approximate sensor readings.
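As a minimal sketch of the idea, the following shows one common reading of temporal suppression: a sensor transmits only when its reading differs from the last transmitted value. The text above does not define the technique in detail, so the function name `temporal_suppression`, the `error_bound` parameter, and the sample readings are assumptions for illustration. Setting the bound above zero gives an approximate variant that tolerates small deviations.

```python
def temporal_suppression(readings, error_bound=0.0):
    """Return the (index, value) pairs a sensor would actually transmit.

    A reading is sent only when it differs from the last transmitted value
    by more than error_bound; otherwise the base station keeps reusing the
    last value it received. error_bound=0.0 gives exact suppression.
    """
    transmitted = []
    last_sent = None
    for i, value in enumerate(readings):
        if last_sent is None or abs(value - last_sent) > error_bound:
            transmitted.append((i, value))
            last_sent = value
    return transmitted


# Hypothetical readings from one sensor over six time steps.
readings = [20.0, 20.0, 20.5, 22.0, 22.0, 25.0]
exact = temporal_suppression(readings)                    # 4 transmissions
approx = temporal_suppression(readings, error_bound=1.0)  # 3 transmissions
print(len(exact), len(approx))
```

Spatial suppression follows the same pattern, except that the comparison is made against the reading of a neighboring sensor rather than against the sensor's own previous value.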

A representative approximation approach is lossy compression, which is suitable for the collection of historical data through long-term queries. In the data compression approach, a set of sensor readings for a certain period is generally stored in a sensor and compressed by a lossy compression tool. Since a compressed form of a set of sensor readings is transmitted instead of each individual sensor reading, the number of transmissions and the volume of data are reduced. However, the base station cannot obtain a sensor reading in real time.

Another approximation approach is based on statistical models such as linear regression and the multivariate Gaussian (Chu et al., 2006, Deshpande et al., 2004, Kotidis et al., 2005, Xue et al., 2006). In this approach, if the difference between an actual reading and an estimated reading derived from a statistical model is within a certain boundary (i.e., an error bound), the estimated reading is used as an approximate sensor reading. Thus, the data transmission overhead can be reduced. However, to construct a statistical model, a set of sensor readings generally has to be transmitted to the base station during a learning phase. Also, if the constructed model no longer reflects the current data distribution, the expensive learning phase must be repeated.
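The model-based approach can be sketched with a simple linear regression. This is an illustrative simulation under stated assumptions, not the method of any of the cited works: the helper names `fit_line` and `model_based_gathering`, the learning-phase length, and the sample readings are all hypothetical.

```python
def fit_line(ts, ys):
    """Ordinary least-squares fit y = a + b*t over the learning-phase samples."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    var_t = sum((t - mean_t) ** 2 for t in ts)
    cov = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys))
    b = cov / var_t
    a = mean_y - b * mean_t
    return a, b


def model_based_gathering(readings, learn_len, error_bound):
    """Return (number of transmissions, values seen by the base station)."""
    # Learning phase: every reading is transmitted so the model can be built.
    a, b = fit_line(list(range(learn_len)), readings[:learn_len])
    sent = learn_len
    approx = list(readings[:learn_len])
    for t in range(learn_len, len(readings)):
        estimate = a + b * t
        if abs(readings[t] - estimate) <= error_bound:
            approx.append(estimate)     # within bound: nothing is transmitted
        else:
            approx.append(readings[t])  # out of bound: actual reading is sent
            sent += 1
    return sent, approx


# Hypothetical readings: a linear trend with one abrupt jump at t = 6.
readings = [10.0, 11.0, 12.0, 13.0, 14.1, 15.0, 20.0, 17.0]
sent, approx = model_based_gathering(readings, learn_len=4, error_bound=0.5)
print(sent, approx)
```

The sketch never refits the model. If the data drifted away from the fitted line for good, every later reading would miss the bound and be transmitted, which is the situation in which the learning phase would have to be repeated.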

In this paper, we propose EDGES, an Efficient Data Gathering method in sensor networks using tEmporal and Spatial correlations. The goal of EDGES is to efficiently monitor sensor readings within a certain error bound. To this end, EDGES uses the multiple model Kalman filter as an approximation method and utilizes the spatial correlation to extend the lifetime of a sensor network.

Jain et al. (2004) proposed the Dual Kalman filter to reduce the amount of data communication between a sensor and the base station. Like Jain et al. (2004), EDGES uses the Kalman filter to predict the future value of a sensor reading. However, the Dual Kalman filter is based on the discrete Kalman filter, whereas EDGES is based on the multiple model Kalman filter, which handles non-linear data distributions.

In addition, while only the temporal correlation is utilized in Jain et al. (2004), EDGES considers both the temporal and the spatial correlation. To utilize the spatial correlation, a clustering mechanism that groups spatially close sensors is adopted. Since a cluster header acts as both a sensor and a router, its energy is depleted quickly. To distribute the energy consumption of a cluster header, we devise a localized technique in which a change to a cluster is notified to only a small portion of the network. EDGES makes the following contributions to gathering sensor readings in an energy-efficient manner:

  • Our proposed method is based on the multiple model Kalman filter. With the Kalman filter, an estimate of a sensor reading is available without an expensive learning phase, since the Kalman filter processes sensor readings in a sequential and recursive manner. In Jain et al. (2004), the discrete Kalman filter was adopted. However, since the discrete Kalman filter is designed for discrete and linear processes, it does not fit the continuous and/or non-linear processes that often appear in the real world. Therefore, we apply another variant of the Kalman filter, called the multiple model Kalman filter, which is designed for non-linear processes.

  • We group spatially close sensors into a cluster. In a cluster, each member reports its approximate sensor reading to the cluster header. A cluster header consumes more energy than the other sensors, so a sensor should not act as a cluster header permanently. Instead, the energy consumption of a cluster header should be distributed to extend the lifetime of the network. Thus, we devise a localized mechanism using the spatial correlation in which all or some of a cluster header’s children can migrate to other nodes. Although a localized mechanism cannot achieve an optimal solution, it notifies a change to a cluster to only a small portion of the network.

  • We provide an extensive experimental study of our framework using real and synthetic data sets and compare our framework with the previous approaches. Experimental results show that our proposed technique is effective and accurate compared with the other approaches.
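The prediction-based suppression that underlies the first contribution can be sketched as follows. This is a simplified illustration and neither the multiple model Kalman filter used by EDGES nor the Dual Kalman filter of Jain et al. (2004): it uses a single scalar Kalman filter with a random-walk process model, and the class name `ScalarKalman`, the function `gather`, and the noise variances `q` and `r` are assumptions chosen for the example. The sensor and the base station run identical filters, so the sensor knows what the base station will predict and transmits only when that prediction misses the error bound.

```python
class ScalarKalman:
    """1-D Kalman filter for a random-walk process x_k = x_{k-1} + w_k."""

    def __init__(self, x0, p0=1.0, q=0.01, r=0.1):
        self.x = x0  # state estimate
        self.p = p0  # variance of the estimate
        self.q = q   # process noise variance (assumed value)
        self.r = r   # measurement noise variance (assumed value)

    def predict(self):
        # Time update: the estimate is carried over, the uncertainty grows.
        self.p += self.q
        return self.x

    def correct(self, z):
        # Measurement update: blend prediction and reading via the gain k.
        k = self.p / (self.p + self.r)
        self.x += k * (z - self.x)
        self.p *= 1.0 - k


def gather(readings, error_bound):
    """Return (number of transmissions, values seen by the base station)."""
    sensor = ScalarKalman(readings[0])
    base = ScalarKalman(readings[0])
    sent = 1  # the first reading initializes both filters
    view = [readings[0]]
    for z in readings[1:]:
        predicted = sensor.predict()
        base.predict()
        if abs(z - predicted) > error_bound:
            # Prediction is out of bound: transmit and update both filters.
            sensor.correct(z)
            base.correct(z)
            sent += 1
            view.append(z)
        else:
            # Suppressed: the base station uses its own prediction.
            view.append(base.x)
    return sent, view


# Hypothetical readings with one jump between the third and fourth values.
readings = [20.0, 20.1, 20.2, 23.0, 23.1]
sent, view = gather(readings, error_bound=0.5)
print(sent, view)
```

No learning phase is needed because the filter state is updated recursively from one reading at a time. Both filters apply the same operations in the same order, so their states never diverge and every value in `view` stays within the error bound of the corresponding actual reading.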

Organization of the paper. The remainder of the paper presents the details of EDGES and is organized as follows. Section 2 presents various sensor data management techniques. In Section 3, we describe the basics of the Kalman filter. In Section 4, we describe the data model and the mechanism of EDGES. Section 5 contains the experimental results. Finally, in Section 6, we summarize our work and suggest some future studies.

Related work

In the database literature, several sensor data management systems such as Cougar (Demers et al., 2003) and TinyDB (Madden et al., 2003, Madden et al., 2005) have been introduced. One well-known approach to reducing the energy drain of sensor networks is in-network aggregation, since it provides a great opportunity to reduce the communication overhead.

Although aggregation measures are sufficient in many applications, there are situations where they may not be

Preliminary

To estimate a sensor reading, we adopt the Kalman filter, which is used in diverse applications such as signal processing and pattern matching. In this section, we give a brief overview of the Kalman filter; for more details, refer to Welch et al. (2001).

The Kalman filter (Kalman, 1960) was introduced by Rudolf E. Kalman as a recursive data processing algorithm for the discrete-data linear filtering problem. The recursive filtering approach means that received data can be processed

Overview

In this section, we present an overview of EDGES and some assumptions that are commonly made in the sensor network field. It is widely accepted that the energy consumed to transfer one bit of data could instead be used to perform a large number of arithmetic operations in the sensor processor. Thus, we do not consider the computation cost in this work. Also, in our work, like the related literature (Yang et al., 2007, Trigoni et al., 2006, Madden et al., 2003, Tulone and Madden, 2006), we

Experiment

In this section, we demonstrate the efficiency of EDGES. To show the efficiency of EDGES, we empirically compared the performance of EDGES with the temporal suppression (TS), approximate temporal suppression (ATS) (Silberstein et al., 2007), snapshot approach (SS) (Kotidis et al., 2005), SAF (Tulone and Madden, 2006), and the Dual Kalman filter (DK) (Jain et al., 2004) on synthetic and real-life data sets. We use our own simulator which uses the energy consumption equation presented in Section

Conclusion

A sensor is a device that measures quantitative data from some external stimulus and generates electrical signals. Recent advances in micro-electro-mechanical systems have enabled the development of wireless sensor networks. Sensors in a network are severely constrained in terms of battery power. Thus, there has been much effort to reduce the energy spent on data management in the context of sensor networks.

In this work, we suggest an energy efficient data gathering method, called EDGES, for sensor networks.

Acknowledgements

This work was supported by National Research Foundation of Korea Grant funded by the Korean Government (Grant Number 2009-0072612).

References (29)

  • Abadi, D.J., Madden, S., Lindner, W., 2005. Reed: Robust, efficient filtering and event detection in sensor networks. In...
  • Chu, D., Deshpande, A., Hellerstein, J.M., Hong, W., 2006. Approximate data collection in sensor networks using...
  • Crossbow Inc., 2003. MPR-Mote Process Radio Board User’s...
  • de Morais Cordeiro, C., et al., 2006. Ad Hoc & Sensor Networks: Theory and Applications.
  • Deligiannakis, A., Kotidis, Y., Roussopoulos, N., 2004. Hierarchical in-network data aggregation with quality...
  • Demers, A., et al., 2003. The cougar project: a work-in-progress report. SIGMOD Record.
  • Deshpande, A., Guestrin, C., Madden, S., Hellerstein, J.M., Hong, W., 2004. Model-driven data acquisition in sensor...
  • Gupta, H., Navda, V., Das, S.R., Chowdhary, V., 2005. Efficient gathering of correlated data in sensor networks. In:...
  • Heinzelman, W.R., Chandrakasan, A., Balakrishnan, H., 2000. Energy-efficient communication protocol for wireless...
  • Jain, A., Chang, E.Y., Wang, Y.-F., 2004. Adaptive stream resource management using Kalman filters. In: Proceedings of...
  • Julier, S.J., Uhlmann, J.K., 1997. A new extension of the Kalman filter to nonlinear systems. In: Proceedings of...
  • Kalman, R.E., 1960. A new approach to linear filtering and prediction problems. Transactions of the ASME, Journal of Basic Engineering.
  • Kotidis, Y., 2005. Snapshot queries: towards data-centric sensor networks. In: Proceedings of the 22nd International...
  • Li, Q., Aslam, J., Rus, D., 2001. Hierarchical power-aware routing in sensor networks. In: Proceedings of DIMACS...
Jun-Ki Min is a professor in the school of Internet-Media at the Korea University of Technology and Education (KUT) in Korea. He received a Ph.D. degree from the Korea Advanced Institute of Science and Technology (KAIST), Korea in 2002. He was a senior researcher in Electronics and Telecommunications Research Institute (ETRI), Korea. While at ETRI, he developed UbiCore, which is a large volume stream data management system. He has written and published several articles in international journals and conference proceedings. His current research interests include XML, the semantic Web, sensor network and stream data management.

Chin-Wan Chung received a Ph.D. degree from the University of Michigan, Ann Arbor in 1983. He was a Senior Research Scientist and a Staff Research Scientist in the Computer Science Department at the General Motors Research Laboratories (GMR). While at GMR, he developed Dataplex, a heterogeneous distributed database management system integrating different types of databases. Since 1993, he has been a professor in the Division of Computer Science at the Korea Advanced Institute of Science and Technology (KAIST), Korea. At KAIST, he developed a full scale object-oriented spatial database management system called OMEGA, which supports ODMG standards. His current research interests include the semantic Web, the mobile Web, sensor networks and stream data management, and multimedia databases.