An unsupervised anomaly-based detection approach for integrity attacks on SCADA systems

doi:10.1016/j.cose.2014.07.005

Computers & Security

Volume 46, October 2014, Pages 94-110

https://doi.org/10.1016/j.cose.2014.07.005

Abstract

Supervisory Control and Data Acquisition (SCADA) systems are a core part of industrial systems, such as smart grid power and water distribution systems. In recent years, such systems have become highly vulnerable to cyber attacks. The design of efficient and accurate data-driven anomaly detection models has become an important topic of interest in the development of SCADA-specific Intrusion Detection Systems (IDSs) to counter cyber attacks. This paper proposes two novel techniques: (i) automatic identification of consistent and inconsistent states of SCADA data for any given system, and (ii) automatic extraction of proximity detection rules from the identified states. During the identification phase, the density factor for the k-nearest neighbours of an observation is adapted to compute its inconsistency score. Then, an optimal inconsistency threshold is calculated to separate inconsistent from consistent observations. During the extraction phase, the well-known fixed-width clustering technique is extended to extract proximity-detection rules, which form a small and most-representative data set for both inconsistent and consistent behaviours in the training data set. Extensive experiments were carried out on both real and simulated data sets, and we show that the proposed techniques provide significant accuracy and efficiency in detecting cyber attacks, compared to three well-known anomaly detection approaches.

Introduction

SCADA systems control and monitor industrial and infrastructure processes such as transportation, oil and gas refining, and energy and water distribution networks (Yu et al., 2011, Fahad et al., 2013). In recent years, Commercial-Off-The-Shelf (COTS) products such as standard hardware and software platforms have begun to be used in SCADA systems. This incorporation has allowed various products from different vendors to be integrated with each other to build a SCADA system at low cost. In addition, the integration of standard protocols (e.g. TCP/IP) into COTS products has increased their connectivity, thereby increasing productivity and profitability. However, this shift from proprietary and customized products to standard ones exposes these systems to cyber threats (Oman et al., 2000). Undoubtedly, any attack targeting SCADA systems could lead to high financial losses and serious impacts on public safety and the environment. The attack on the sewage treatment system in Maroochy Shire (Australia) is an example of such an attack on critical infrastructure (Slay and Miller, 2007), where the attacker took over the control devices of a SCADA system. The Stuxnet worm (Falliere et al., 2011), which was designed to damage nuclear power plants in Iran, is a recent example of threats targeting control systems. Both of the aforementioned attacks are classified as man-in-the-middle (MITM) attacks, where control devices are compromised to perform malicious actions while false information is sent to the Master Terminal Unit (MTU) to avoid detection. Such attacks allow adversaries to perform high-level control actions (Wei et al., 2011, Queiroz et al., 2011, Nicholson et al., 2012), and pose potential threats to SCADA systems.

Awareness of the potential threats to SCADA systems, as well as the need to reduce their various vulnerabilities, has recently become an important research focus in the area of security. A number of security measures have been used in traditional IT systems, including management, filtering, encryption and intrusion detection. However, such measures cannot be directly applied to SCADA systems without considering their specific characteristics. Additionally, none of these traditional IT security solutions can completely protect SCADA systems from potential cyber attacks. Properly adapting or extending such IT solutions can nevertheless provide robust protection of SCADA systems against cyber attacks. An Intrusion Detection System (IDS) is one of the security solutions that has shown promising results in detecting malicious activities in traditional IT systems, and this is one of the reasons for using and adapting it to SCADA environments.

To illustrate the intrusion detection problem, two well-known scenarios (Verba and Milvich, 2008) are considered. Fig. 1 illustrates an attacker compromising the front end processor (FEP) by carrying out three actions: (i) initialising a connection with a remote terminal unit (RTU_1.1) and sending a command without receiving a corresponding command from the application server; (ii) dropping the command sent from the application server to RTU_1.1, and forging the feedback information sent back to the application server to match the attack; and (iii) forging the command sent from the application server to RTU_1.1, as well as forging the feedback information sent back from RTU_1.1 to the application server. All commands sent to RTU_1.1 will be trusted, as they are syntactically valid and sent from an FEP.

Two kinds of inconsistent data can be identified in this scenario: (i) an inconsistent network traffic pattern and (ii) inconsistent SCADA data. The former relates to the following: (i) an FEP is not an intelligent device that can make a decision and send a command to RTU_1.1 without receiving a corresponding command; and (ii) the dropped command at the FEP will show up in the network stream from the application server to the FEP, but not in the network stream from the FEP to RTU_1.1. The forged commands between the application server and RTU_1.1, in turn, can be identified by the inconsistent SCADA data. For example, the command in the network stream from the application server to the FEP shows that the status of pump₁ is ON, while in the network stream from the FEP to RTU_1.1, it is OFF. Clearly, the inconsistencies in this scenario show that the aforementioned MITM attacks are performed by the FEP. In what follows, however, we show a scenario where the monitoring of such inconsistencies fails to detect MITM attacks.
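The stream comparison described for this scenario can be sketched in a few lines of Python. This is only an illustration, not part of the cited scenario: it assumes each captured command is a dictionary with hypothetical `id`, `target` and `value` fields, and that the same `id` identifies a command on both links (application server to FEP, and FEP to RTU_1.1).

```python
def compare_streams(upstream, downstream):
    """Compare commands seen on the server->FEP link with those on the FEP->RTU link.

    Both arguments are lists of dicts with hypothetical keys 'id', 'target', 'value'.
    Returns a list of (command id, finding) pairs, sorted by command id.
    """
    up = {cmd["id"]: cmd for cmd in upstream}
    down = {cmd["id"]: cmd for cmd in downstream}
    findings = []
    for cid in sorted(up.keys() | down.keys()):
        if cid not in down:
            # Sent by the application server but never forwarded to the RTU.
            findings.append((cid, "dropped"))
        elif cid not in up:
            # Sent to the RTU without a corresponding server command.
            findings.append((cid, "injected"))
        elif up[cid]["value"] != down[cid]["value"]:
            # Same command on both links, but the payload was altered.
            findings.append((cid, "forged"))
    return findings
```

A check of this kind covers only the first scenario; it reports nothing when the compromised node is itself the legitimate origin of the command, which is the situation discussed next.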

Consider the example shown in Fig. 2. This example illustrates an attacker compromising an intelligent application server that can initiate independent actions. The compromised server drops commands sent from the operator, and an unsafe situation could therefore be created. An attacker initialises a command from the application server to turn off pump₁, and it can be seen that both the network traffic stream and the SCADA data between RTU_2.1 and the application server are consistent for this command. However, the SCADA data, such as the speed and the status of pump₁, could be inconsistent with the sensory node of the water level in RTU_2.2, as they are set to values that violate the specifications of the system from the operational perspective.

The evolution of SCADA data can reflect the system's state: consistent or inconsistent. Therefore, the monitoring of the SCADA data has been proposed as an efficient tailored IDS for SCADA environments. The detection methods are broadly categorized into two types: signature-based and anomaly-based. The former can detect only an attack whose signature is already known, while the latter can detect unknown attacks by looking for activities that deviate from expected patterns (or behaviours). Learning the anomaly-based detection models can be performed via three modes, namely supervised, semi-supervised and unsupervised. The class labels must be available for the first mode; however, this type of learning is costly and time-consuming because domain experts are required to label hundreds of thousands of data observations. The second mode is based on the assumption that the training data set represents only one behaviour, either normal or abnormal. There are a number of issues pertaining to this mode. The system has to operate for a long time under normal conditions in order to obtain purely normal data that comprehensively represent normal behaviours. However, there is no guarantee that no anomalous activity will occur during the data collection period. On the other hand, it is difficult to obtain a training data set that covers all possible anomalous behaviours that could occur in the future. Alternatively, the unsupervised mode can be an appropriate solution to address the aforementioned issues, where the anomaly detection models can be learned from unlabelled data without prior knowledge about normal/abnormal behaviours. However, the poor efficiency and low accuracy of this type of learning remain challenging.

This paper proposes a novel unsupervised SCADA data-driven anomaly detection approach intended to be used as a passive SCADA IDS. That is, it only raises alarms when suspicious activities are detected, and the appropriate responses will be left for a system administrator. The SCADA data, which are generated by sensors/actuators, are used as valuable information in the proposed approach. Fig. 3 shows the two main steps of the proposed approach: the identification of consistent/inconsistent states from unlabelled SCADA data, and the extraction of proximity-based detection rules for each behaviour.

The use of control data has attracted the attention of many researchers studying SCADA data-driven anomaly detection models that are able to learn the mechanistic behaviour of SCADA systems without knowledge of the physical behaviour of such systems (Rrushi, April 2009, Marton et al., 2013, Gao et al., 2010, Zaher et al., 2009). Such models, however, operate only in two learning modes: supervised and semi-supervised. Despite the promising results of these learning modes, there are a number of issues that restrict their use (see the previous Section 1.1). This paper proposes an unsupervised learning approach, which consists of two novel techniques. The first one is used to identify consistent/inconsistent states from unlabelled data. This is performed by giving an inconsistency score to each observation using the density factor for the k-nearest neighbours of the observation. An optimal inconsistency threshold is later computed to separate inconsistent from consistent observations. The second proposed technique extracts proximity-based detection rules for each behaviour, whether inconsistent or consistent. During this phase, the fixed-width clustering technique (Eskin et al., 2002) is used to cluster each behaviour individually into micro-clusters with a constant fixed width, which is statistically determined. The centroids of all the created micro-clusters are used as the proximity-detection rules that are assumed to form a small and most representative data set for both inconsistent and consistent behaviours in the training data set.
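To make the identification step more concrete, the following Python sketch scores each observation by its neighbourhood and then applies a cut-off. It is a simplified stand-in rather than the paper's method: the score here is the plain mean distance to the k nearest neighbours (the paper adapts a density factor whose exact form is not given in this excerpt), and the cut-off of mean plus `n_std` standard deviations is an assumption used in place of the optimal inconsistency threshold. The function names and parameters are hypothetical.

```python
import math
import statistics


def knn_inconsistency_scores(points, k):
    """Score each observation by its mean distance to its k nearest neighbours.

    Assumption: a simple k-NN distance score stands in for the adapted
    density factor. Larger scores indicate sparser neighbourhoods.
    """
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other observation, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


def flag_inconsistent(scores, n_std=2.0):
    """Flag scores above mean + n_std * std (an assumed, non-optimal threshold)."""
    threshold = statistics.mean(scores) + n_std * statistics.pstdev(scores)
    return [s > threshold for s in scores]
```

The brute-force neighbour search is quadratic in the number of observations, which is acceptable for a sketch but would need an index structure on data sets of realistic size.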

The proposed approach is evaluated on both real and simulated data sets; two are generated by a simulation of a SCADA system that uses well-known models as discussed in Section 4.1, while the third is real and consists of consistent/inconsistent observations. In particular, we compared the effectiveness of our unsupervised approach with existing unsupervised and semi-supervised anomaly detection approaches.

This paper is organised as follows. Section 3 provides a characterisation of consistent/inconsistent observation states for SCADA data, as well as the details of the proposed approach. Section 4 presents the experimental setup, followed by results and analysis in Section 5. Finally, we conclude the work in Section 6.

Section snippets

Related work

In the design of an IDS, two main processes are often considered. First is the selection of the information source (e.g. network-based, application-based) to be used, through which anomalies can be detected. Second is the development of a learning (or analysis) method that is used to efficiently build the detection model using the specified information source. SCADA-specific IDSs can be broadly grouped into three categories in terms of the latter process: misuse (signature-based) detection (

The proposed intrusion detection approach

This section describes consistent/inconsistent states of SCADA data, as well as the techniques that contribute to the development of an unsupervised intrusion detection method to detect SCADA-based integrity attacks. Specifically, the proposed approach consists of (i) a technique that identifies consistent and inconsistent multivariate SCADA data, and (ii) a technique that extracts proximity-based detection rules used to perform a near-real-time monitoring of integrity attacks. Fig. 3
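The rule-extraction idea can be illustrated with a short Python sketch of single-pass fixed-width clustering, followed by a nearest-rule check for new observations. Several points are assumptions rather than details taken from the paper: the width is passed in as a plain parameter instead of being statistically determined, centroids are updated as running means, and a new observation simply takes the label of the closest centroid. All names are hypothetical.

```python
import math


def fixed_width_clusters(points, width):
    """Single-pass fixed-width clustering; returns the micro-cluster centroids."""
    centroids, counts = [], []
    for p in points:
        if centroids:
            # Find the closest existing centroid.
            idx = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            if math.dist(p, centroids[idx]) <= width:
                # Absorb p and update the centroid as a running mean.
                counts[idx] += 1
                n = counts[idx]
                centroids[idx] = tuple(
                    c + (x - c) / n for c, x in zip(centroids[idx], p)
                )
                continue
        # No centroid within the width: p starts a new micro-cluster.
        centroids.append(tuple(float(x) for x in p))
        counts.append(1)
    return centroids


def classify(observation, consistent_rules, inconsistent_rules):
    """Label an observation by its nearest rule (assumed decision scheme)."""
    d_ok = min(math.dist(observation, c) for c in consistent_rules)
    d_bad = min(math.dist(observation, c) for c in inconsistent_rules)
    return "inconsistent" if d_bad < d_ok else "consistent"
```

Because only the centroids are kept, monitoring a new observation costs one distance computation per rule rather than one per training observation, which is what makes near-real-time use plausible.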

Experimental setup

The main focus of this section is to set up an experimental environment to evaluate the robustness of the proposed approach. In what follows, we describe the simulation system used and two integrity attacks. We also describe the data sets used and the experimental parameters chosen for this evaluation.

Results and analysis

This section evaluates the accuracy of anomaly detection of the proposed unsupervised approach, and in addition, a comparison between this approach and two existing unsupervised and semi-supervised anomaly detection approaches is carried out. The detection accuracy for each approach is separately evaluated because the existing approaches that have been chosen as a basis for comparison with the proposed approach are inherently different in terms of the required parameters for learning anomaly

Conclusion

In this paper, we proposed an innovative unsupervised SCADA data-driven anomaly detection approach to detect integrity attacks tailored to SCADA systems. This has been done by initially identifying the consistent and inconsistent states of SCADA data automatically, and then also automatically extracting proximity-based detection rules from the identified states to detect inconsistent states. Experimental results show the ability of the proposed approach to automatically identify consistent and

Abdulmohsen Almalawi received his B.S. degree in Computer Science from King Abdul Aziz University, Jeddah, Saudi Arabia, in 2003. He received his M.S. degree in 2008 from RMIT University, Melbourne, Australia, and he is currently a Ph.D. candidate in the Department of Computer Science and Information Technology at the University of RMIT. His research interests are in the areas of machine learning, and SCADA security.

References (56)

C. Alcaraz et al. WASAM: a dynamic wide-area situational awareness model for critical domains in smart grids. Future Gener Comput Syst (2014)
C. Alcaraz et al. Diagnosis mechanism for accurate monitoring in critical infrastructure protection. Comput Stand Interfaces (2014)
A. Fahad et al. Ppfscada: privacy preserving framework for scada data publishing. Future Gener Comput Syst (2014)
A. Nicholson et al. Scada security in the light of cyber-warfare. Comput Secur (2012)
A. Almalawi et al. SCADAVT – a framework for SCADA security testbed based on virtualization technology
F. Angiulli et al. Outlier mining in large high-dimensional data sets. IEEE Trans Knowl Data Eng (2005)
M. Ankerst et al. Optics: ordering points to identify the clustering structure
A. Arning et al. A linear method for deviation detection in large databases
A. Beygelzimer et al. Cover trees for nearest neighbor
M.M. Breunig et al. Optics-of: identifying local outliers
M.M. Breunig et al. Lof: identifying density-based local outliers
A. Carcano et al. A multidimensional critical state analysis for detecting intrusions in SCADA systems. IEEE Trans Ind Inform (2011)
S. Cheung et al. Using model-based intrusion detection for SCADA networks
Digitalbond. IDS-signatures of Modbus/TCP (2013)
E. Eskin et al. A geometric framework for unsupervised anomaly detection
A. Fahad et al. Toward an efficient and scalable feature selection approach for internet traffic classification. Comput Netw (2013)
N. Falliere et al. W32. stuxnet dossier: version 1.4 (2011)
E.B. Fernandez et al. Designing secure scada systems using security patterns
I.N. Fovino et al. Modbus/DNP3 state-based intrusion detection system
I.N. Fovino et al. Critical state-based filtering system for securing SCADA network protocols. IEEE Trans Ind Electron (2012)
A. Frank et al. UCI machine learning repository (2013)
K. Fukunaga et al. A branch and bound algorithm for computing k-nearest neighbors. IEEE Trans Comput (1975)
W. Gao et al. On scada control system command and response injection and intrusion detection
P. Gross et al. Secure selecticast for collaborative intrusion detection systems
M. Hall et al. The weka data mining software: an update. ACM SIGKDD Explor Newsl (2009)
K. Hempstalk et al. One-class classification by combining density and class probability estimation
M. IDA. Modbus messaging on TCP/IP implementation guide v1.0a (2013)
M. Jianliang et al. The application on intrusion detection based on k-means cluster algorithm

Xinghuo Yu is currently with RMIT University, Melbourne, Australia, where he is the Director of the RMIT Platform Technologies Research Institute. He has published over 350 refereed papers in technical journals, books, and conference proceedings. His research interests include variable structure and nonlinear control, complex and intelligent systems, and industrial applications. Prof. Yu is a Fellow of the Institution of Engineers Australia and the Australian Computer Society. He is currently serving as an Associate Editor of the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS PART I, IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, and IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS.

Zahir Tari is a full professor at RMIT University. He is also the director of the DSN (Distributed Systems and Networking) discipline at the School of Computer Science and IT, RMIT (Australia). His main research areas are in performance and security in various areas of application (e.g. Web servers, Content Delivery Networks, SCADA systems etc). Prof Tari regularly publishes in reputable journals and conferences. He has acted as program committee chair as well as general chair of over fifteen international conferences (e.g. DOA, CoopIS, ODBASE, GADA, IFIP DS 11.3 on Database Security). He is also the co-author of a few books. He has also been General Chair of more than 12 conferences. He is the recipient of 14 Australian Research Council (ARC) grants. More details about Zahir and his team can be found at http://www.cs.rmit.edu.au/dsn.

Adil Fahad received his B.S. degree in Computer Science from King Abdul Aziz University, Jeddah, Saudi Arabia, in 2003. He received his M.S. degree (with high distinction) in 2008 from RMIT University, Melbourne, Australia, and he is currently a Ph.D. candidate in the Department of Computer Science and Information Technology at the University of RMIT. He joined the University of Albaha as a lecturer in 2009 and took a leave of absence in 2010 for his Ph.D. studies. His research interests are in the areas of wireless sensor networks, mobile networks, SCADA security and ad-hoc networks with emphasis on data mining, statistical analysis/modelling and machine learning.

Ibrahim Khalil received the Ph.D. degree from the University of Berne, Berne, Switzerland, in 2003. He is a Senior Lecturer in the School of Computer Science and IT, RMIT University, Melbourne, Australia. He has several years of experience in Silicon Valley-based companies working on Large Network Provisioning and Management software. He also worked as an academic in several research universities. Before joining RMIT, he worked for EPFL and University of Berne in Switzerland and Osaka University in Japan. His research interests are quality of service, wireless sensor networks, and remote healthcare.
