An adaptive and efficient dimension reduction model for multivariate wireless sensor networks applications
Highlights
► We propose an efficient CCIPCA-based dimension reduction model for multivariate sensor network data.
► The proposed APCADR model adapts to dynamic environmental changes.
► The model achieves 33.33% and 50% reduction of multivariate data in dynamic and static environments, respectively.
► 97–99% of the original data is successfully approximated at cluster heads in both environments.
► The model outperforms the compared models in terms of efficiency and approximation accuracy.
Introduction
Wireless sensor networks (WSNs) are networks of tiny, low-cost, low-energy, multifunctional sensors that are densely deployed for monitoring environments, tracking objects, or controlling industrial operations [1]. Application domains for WSNs include, but are not limited to, home automation, sales tracking, industrial process control, and even enemy target tracking in military operations [1], [2], [3]. Based on their structure, WSNs are categorized as flat or hierarchical. In a flat WSN, all nodes have the same functionality and resources, whereas in a hierarchical WSN, sensor nodes are grouped into clusters, each of which has a cluster head (CH) and normal sensor nodes (SNs) [4]. In hierarchical WSNs, the CH usually has additional resources to carry out additional tasks. This hierarchical structure, which may contain several levels of hierarchy, aims to make the design of WSNs more scalable [4].
The main cause of rapid sensor energy depletion in WSNs is data transmission among nodes. Hill et al. [5] stated that transmitting one bit of data consumes as much power as processing thousands of bits in a sensor. This means that most sensor energy is consumed by radio communication rather than by sensing or processing data. Reducing the dimensionality of the data therefore lowers the power consumed by radio communication.
Sensor data is categorized as univariate or multivariate based on the characteristics of the phenomenon. Univariate data represents samples of a single variable of the phenomenon (e.g., temperature), whereas multivariate data represents several variables of the phenomenon (e.g., ambient temperature and surface temperature) [6]. Nowadays, sensor nodes (e.g., TelosB TPR2420CA motes) are equipped with different types of sensors that make it possible to monitor several variables of a phenomenon (e.g., temperature, humidity, light, and voltage) [7]. The multivariate samples may originate from different sensors on a specific node or from different nodes. Transmitting multivariate data clearly increases the power consumption of sensors because of the high radio communication cost incurred for each variable. Moreover, large-scale deployment in some WSN applications makes the power consumed by data transmission even higher.
The dynamic change of monitored variables in the WSN environment introduces a need for adaptive dimensionality reduction mechanisms that cope with the dynamic changes of these variables. Therefore, a lightweight incremental learning technique is required to update the reduction model without incurring additional energy consumption.
Principal component analysis (PCA) is a well-known multivariate data analysis technique used for dimensionality reduction of correlated data observations by transforming them into a set of uncorrelated variables called principal components (PCs) [8], [9]. The basic PCA algorithm calculates the covariance matrix of the data and then projects the data onto a new space obtained with methods such as singular value decomposition (SVD). Unfortunately, the computational complexity of these methods makes the basic PCA algorithm inefficient for real-time applications, because PCA must learn new PCs for every change in the phenomenon by repeating the complex matrix operations involved in SVD. The basic PCA is therefore not suitable for incremental learning of PCs in WSNs because of the high energy consumption required for SVD calculations.
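As a rough illustration of the batch procedure described above, the following Python sketch computes a covariance matrix and extracts its leading principal component. It is only a sketch under stated assumptions, not the method used in the paper: power iteration stands in for SVD to keep the example free of dependencies, and the function names and the `readings` values are hypothetical.

```python
import math


def covariance_matrix(samples):
    """Return the n x n sample covariance of m observations (one per row)."""
    m, n = len(samples), len(samples[0])
    mean = [sum(row[j] for row in samples) / m for j in range(n)]
    centered = [[row[j] - mean[j] for j in range(n)] for row in samples]
    return [
        [sum(r[i] * r[j] for r in centered) / (m - 1) for j in range(n)]
        for i in range(n)
    ]


def leading_pc(cov, iterations=200):
    """Approximate the eigenvector of cov with the largest eigenvalue.

    Power iteration is used here instead of SVD (an assumption made
    for brevity); it repeatedly multiplies by cov and renormalizes.
    """
    n = len(cov)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(n)) for i in range(n)]
        length = math.sqrt(sum(x * x for x in w))
        v = [x / length for x in w]
    return v


# Hypothetical, perfectly correlated readings: (temperature, humidity, light).
readings = [
    [20.0, 40.0, 100.0],
    [21.0, 42.0, 105.0],
    [22.0, 44.0, 110.0],
    [23.0, 46.0, 115.0],
]
pc1 = leading_pc(covariance_matrix(readings))
```

In this batch form, both functions have to be rerun over the whole window of samples whenever the monitored phenomenon changes, which is the cost that motivates an incremental alternative.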
In this paper, we propose a new efficient and adaptive dimensionality reduction model for hierarchical WSNs based on the candid covariance-free incremental PCA (CCIPCA) algorithm originally proposed in [10]. The algorithm does not require calculation of the covariance matrix and is therefore called covariance-free. The contributions of this paper are summarized as follows. First, a new lightweight and efficient dimensionality reduction model for WSNs is proposed based on the CCIPCA technique. Second, the proposed model is adaptive, so that new changes in the environment are learned incrementally whenever necessary. Third, the proposed model is designed for multivariate data, whereas most existing models deal with univariate data.
The rest of this paper is organized as follows. Section 2 reviews related PCA and CCIPCA works for data dimensionality reduction and other purposes in WSNs. Section 3 gives an overview of CCIPCA. The proposed model is presented in Section 4. Section 5 presents the experimental results and performance evaluation, including a comparison with existing models in terms of computational and communication complexity, reconstruction error, and memory utilization. Finally, Section 6 concludes the paper.
Related works
The works most closely related to applying PCA for dimensionality reduction in WSNs were proposed in [11], [12]. A PCA-based data compression model was first proposed in [11]. In this model, unsupervised and supervised data compression algorithms relying on a synchronized routing layer were proposed. The principal components are calculated offline at the sink node after enough collected data has been sent from the nodes. The components are then sent back to the nodes to be used for reducing the future real
CCIPCA overview
This section gives an overview of CCIPCA, a dimension reduction technique proposed in [10] to compute the principal components of a sequence of samples incrementally without estimating the covariance matrix. Consider a data matrix S ∈ R^(m×n) with m observations and n variables collected during normal operation of the monitored phenomenon. This matrix is normalized to zero mean to obtain the standardized matrix, which avoids a biased estimate of the principal weights relative to the center of the observation set.
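The incremental estimation can be sketched in a few lines of Python. The sketch follows the update rule of the original CCIPCA algorithm [10] with the amnesic parameter set to zero, which is an assumption made here for simplicity; the function names, the `eps` guard, and the synthetic `samples` are hypothetical and are not taken from the paper.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def center(samples):
    """Subtract the per-variable mean so that the data has zero mean."""
    m, n = len(samples), len(samples[0])
    mean = [sum(r[j] for r in samples) / m for j in range(n)]
    return [[r[j] - mean[j] for j in range(n)] for r in samples]


def ccipca(centered, k, eps=1e-12):
    """Estimate the first k principal components one sample at a time.

    No covariance matrix is formed. Each v[i] is an unnormalized
    eigenvector estimate whose direction is the i-th component.
    """
    v = []
    for t, sample in enumerate(centered, start=1):
        u = list(sample)
        for i in range(k):
            if i == len(v):
                # Seed a new component with the current residual.
                if norm(u) > eps:
                    v.append(u[:])
                break
            # Blend the old estimate with the new sample (amnesic parameter 0).
            scale = dot(u, v[i]) / norm(v[i])
            v[i] = [((t - 1) / t) * a + (1 / t) * scale * b
                    for a, b in zip(v[i], u)]
            # Remove this component from the sample before the next one.
            unit = [a / norm(v[i]) for a in v[i]]
            proj = dot(u, unit)
            u = [b - proj * a for a, b in zip(unit, u)]
    return [[a / norm(x) for a in x] for x in v]


# Synthetic three-variable stream dominated by the direction (1, 2, 0).
samples = []
for j in range(200):
    a = 10.0 * math.sin(0.1 * j)
    samples.append([a + 0.1 * math.cos(1.7 * j),
                    2.0 * a + 0.1 * math.sin(2.3 * j),
                    0.5 * math.cos(0.9 * j)])

pcs = ccipca(center(samples), k=2)
```

Each new sample costs only a handful of vector operations per component, so the estimate can follow a changing phenomenon without the matrix recomputation that a batch approach would need.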
The proposed APCADR model
In this section, the network and the proposed adaptive PCA-based dimensionality reduction (APCADR) model are presented. Fig. 2 shows the general structure of the proposed model, which is composed of two phases: an initialization phase and an implementation phase. The initialization phase can be implemented offline prior to the real deployment of the APCADR model. The implementation phase, on the other hand, is applied in real time during normal operation of WSN applications. Details on each phase are
Datasets
The proposed APCADR model is evaluated using three benchmark datasets: the Intel Berkeley Research Lab (IBRL) dataset [35], the Grand-St-Bernard (GSB) dataset [36], and the Lausanne Urban Canopy Experiments (LUCE) dataset [37]. The IBRL dataset has commonly been used to evaluate the performance of existing models in WSNs [38], [39], [40], [41], [42]. This dataset was collected using a WSN deployed in the Intel Research Laboratory at the University of Berkeley. The network consists of 54 Mica2Dot sensor nodes which
Conclusion
The importance of WSNs in many application areas creates a need for efficient data transmission models that guarantee an acceptable level of data quality while minimizing the power consumption of sensor nodes. Dimensionality reduction of transmitted data is one way to achieve efficient transmission, as it reduces the power consumed by radio transmission. PCA has proven to be a robust dimensionality reduction method in multivariate data analysis. However, its computational complexity prevents it from being a
References (51)
- et al. Wireless sensor networks: a survey. Computer Networks (2002)
- et al. Wireless sensor network survey. Computer Networks (2008)
- et al. On stochastic approximation of the eigenvectors and eigenvalues of the expectation of a random matrix. Journal of Mathematical Analysis and Applications (1985)
- et al. CCIPCA-OPCSC: an online method for detecting shared congestion paths. Computer Networks (2012)
- et al. Group-based intrusion detection system in wireless sensor networks. Computer Communications (2008)
- et al. Clustering ellipses for anomaly detection. Pattern Recognition (2011)
- et al. A survey of applications of wireless sensors and wireless sensor networks
- et al. Routing techniques in wireless sensor networks: a survey. IEEE Wireless Communications (2004)
- et al. System architecture directions for networked sensors. SIGPLAN Notices (2000)
- et al. Multivariate reduction in wireless sensor networks
- Telos: Enabling Ultra-low Power Wireless Research
- A Survey of Dimension Reduction Techniques
- Principal Component Analysis
- Candid covariance-free incremental principal component analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence
- Unsupervised and supervised compression with principal component analysis in wireless sensor networks
- Distributed principal component analysis for wireless sensor networks. Sensors
- Templates for the Solution of Algebraic Eigenvalue Problems: A Practical Guide
- Algorithm of data compression based on multiple principal component analysis over the WSN
- Reducing the data transmission in wireless sensor networks using the principal component analysis
- A Framework for sensor stream reduction in wireless sensor networks, presented at the SENSORCOMM 2011
- Diagnosing anomalies and identifying faulty nodes in sensor networks. IEEE Sensors Journal
- Outlier aware data aggregation in distributed wireless sensor network using robust principal component analysis
- Distributed PCA-based anomaly detection in wireless sensor networks
- A PCA-based distributed approach for intrusion detection in wireless sensor networks
- Data fault detection for wireless sensor networks using multi-scale PCA method