ARTINALI#: An Efficient Intrusion Detection Technique for Resource-Constrained Cyber-Physical Systems
Introduction
A cyber-physical system (CPS) is the key element of the Internet of Things (IoT). It is composed of a cyber system, a physical system, sensors, actuators, and networking components, through which it integrates computation with the physical environment. The cyber system (also known as the control program) controls the physical environment via actuators and receives feedback from it via sensors in real time. As the interaction between the physical domain and the cyber domain increases, the physical system becomes more susceptible to security vulnerabilities that might exist in the control program [1], [2]. Therefore, the security of the entire system strongly depends on the security of the CPS’s control program. Recently, CPSes have been widely deployed in critical infrastructure such as smart medical devices [3], robots [4], smart grids [5], and autonomous vehicles [6], [7], [8]. These systems perform sensitive tasks and are therefore potential targets for cyberattacks. However, the rapid growth of IoT has led to the deployment of CPSes without support for enforcing important security properties.
Intrusion Detection Systems (IDSes) are used to monitor computer systems and detect security attacks. Typical IDSes fall into two major categories: signature-based and behavior-based. Signature-based IDSes compare the real-time behavior of the system against known security attacks. Because they rely on known attack models (signatures), they cannot detect unknown attacks [9]. This limitation is particularly important for CPSes, since they work autonomously for long periods of time and are therefore difficult to interrupt for frequent patching or upgrading in the field. In contrast, behavior-based systems detect intrusions by watching a system’s dynamic execution to identify suspect behavior, and are able to detect both known and unknown attacks. We can further divide behavior-based systems into anomaly-based and specification-based, according to how suspect behavior is defined. In an anomaly-based IDS, we build a model of normal behavior and flag deviations from that model as intrusions; in a specification-based IDS, we assume that we have the correct specifications and we look for violations of those specifications.
Specification-based IDSes have been proposed as the best fit for CPS security [10], [11], [12], [13]. A specification-based IDS implements two core functions: data monitoring and data analysis. Data monitoring is the process by which an IDS observes system behaviour and accumulates data logs. Data analysis is the process by which an IDS periodically analyzes the collected logs and checks them against the specifications derived from the CPS’s correct behavior. Data monitoring can be performed at the host level (host-based IDS) or at the network level (network-based IDS). A host-based IDS is tailored to the CPS and monitors the operations of the CPS application and the operating system. A network-based IDS, however, is attached to the network and monitors all incoming and outgoing traffic. A host-based IDS provides more visibility into the individual CPS applications than a network-based IDS, and is thus able to quickly detect CPS misbehavior. Another important advantage of a host-based IDS is distributed control over attack detection; this is especially the case for high-volume configurations like smart grids.
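The two core functions described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `LogEntry` and `Specification` types, the monitor name `deliver_insulin`, and the dose limit of 10.0 are all hypothetical and chosen only to show how collected logs are checked against specifications.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class LogEntry:
    """Hypothetical record produced by data monitoring at one program point."""
    monitor: str
    values: Dict[str, float]


@dataclass
class Specification:
    """A named predicate that must hold for values observed at one monitor."""
    name: str
    monitor: str
    holds: Callable[[Dict[str, float]], bool]


def analyze(log: List[LogEntry], specs: List[Specification]) -> List[str]:
    """Data analysis: return the names of specifications violated by the log."""
    violations = []
    for entry in log:
        for spec in specs:
            if spec.monitor == entry.monitor and not spec.holds(entry.values):
                violations.append(spec.name)
    return violations


# Assumed example specification: a delivered dose stays within a fixed range.
specs = [
    Specification("dose_within_limit", "deliver_insulin",
                  lambda v: 0.0 <= v["dose"] <= 10.0),
]
log = [
    LogEntry("deliver_insulin", {"dose": 2.5}),
    LogEntry("deliver_insulin", {"dose": 42.0}),  # out-of-range value
]
print(analyze(log, specs))  # ['dose_within_limit']
```

In this sketch every additional monitor adds log entries and every additional specification adds a check per entry, which is why the number of monitors and specifications drives both memory use and analysis time.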
The locations in the system where data monitoring happens are called security monitors. IDSes depend on the information collected by the security monitors, so it is important that the monitors capture adequate information about the run-time behaviour of the system. However, smart security solutions for CPSes need to support light-weight intelligence. On one hand, deploying security monitors that capture complete information maximizes the chance of attack detection at the cost of memory usage and performance overhead, which may limit scalability. On the other hand, CPSes have specific constraints that make current IDSes challenging to deploy on them. These constraints are:
- Limited memory: An IDS that is tailored to a CPS should satisfy its resource constraints. For instance, an essential module of an IDS is a pre-trained model (i.e., a set of mined specifications) that represents the correct behavior of the CPS. In some cases, the available memory is not even sufficient to hold this model; further, a large number of security monitors creates large log files that, in turn, may make the system run out of memory. These scenarios make many existing intrusion detection techniques inapplicable to CPSes [5].
- Real-time requirements: Real-time CPS applications place strict constraints on processing and reaction time. For example, self-driving cars need to quickly detect objects, including pedestrians, and make decisions on lane or speed changes. From a security point of view, taking the real-time requirements into account is vital, as the IDS performance overhead must not delay the expected response time of the system, particularly in decision-making scenarios. Hence, the performance overhead must be small enough to satisfy the CPS real-time constraints.
Due to these constraints, existing IDSes are not a good fit for CPS platforms. There has been a lot of research on intrusion detection techniques using static analysis [14], [15], [16], [17], dynamic analysis [18], [19], [20], [21], artificial intelligence [22], [23], [24], and provenance [25], [26], [27]. Static analysis-based specification mining techniques build a model of a system based on code analysis. These techniques are inherently conservative and produce few false positives. However, static analysis alone does not provide enough information about the run-time behavior of the system, which in turn produces many false negatives. Furthermore, these techniques generate large models, leading to high overheads that often exceed the resource constraints of a CPS. Dynamic analysis-based specification mining techniques, in contrast, observe the run-time behavior. They log the key points of the program to infer a set of likely invariants (also known as specifications). Dynamic analysis-based techniques follow the assumption that common behavior is correct behavior, and hence their mined specifications reflect the common behavior rather than the potential behavior identified by static analysis. As most software systems are not provided with adequate test cases, there is a chance that some execution paths are not seen when mining specifications. The result is a high false positive rate [28], [29], which makes these techniques challenging for mission-critical CPSes. Provenance-based approaches [27] are a particular instance of a dynamic analysis-based approach. However, collecting data provenance in a fine-grained setting imposes excessive run-time overhead [26]. Deep Learning (DL) algorithms are accurate for modeling the behavior of complex systems and detecting unknown attacks, but they consume a lot of memory, posing a problem for resource-constrained environments.
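To make the false-positive argument about dynamic analysis concrete, the following is a minimal sketch of mining likely invariants from observed traces. It assumes a deliberately simple invariant form (a per-variable value range); the function names and the `speed` variable are hypothetical and do not correspond to any specific tool cited here.

```python
from typing import Dict, Iterable, Tuple

Bounds = Dict[str, Tuple[float, float]]


def mine_range_invariants(traces: Iterable[Dict[str, float]]) -> Bounds:
    """Infer a [min, max] likely invariant for every variable seen in the traces."""
    bounds: Bounds = {}
    for sample in traces:
        for var, value in sample.items():
            lo, hi = bounds.get(var, (value, value))
            bounds[var] = (min(lo, value), max(hi, value))
    return bounds


def violates(sample: Dict[str, float], bounds: Bounds) -> bool:
    """Return True if any observed value falls outside its mined range."""
    return any(
        var in bounds and not (bounds[var][0] <= value <= bounds[var][1])
        for var, value in sample.items()
    )


# Training runs only exercise part of the legitimate behavior.
training = [{"speed": 30.0}, {"speed": 45.0}, {"speed": 50.0}]
invariants = mine_range_invariants(training)

# A legitimate execution that was simply never seen during training.
benign_but_unseen = {"speed": 65.0}
print(invariants)                               # {'speed': (30.0, 50.0)}
print(violates(benign_but_unseen, invariants))  # True
```

The second result is a false positive: the mined range reflects common behavior in the training runs, not all behavior the program can legitimately exhibit.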
While there has been a significant amount of work on CPS security [10], [30], [31], [32], these techniques offer no systematic way to find the best trade-off between accuracy and efficiency. They reduce the size of the model through coarse data monitoring, or they detect only certain categories of attacks at run-time. As a result, their models do not guarantee full attack coverage.
We formulate the problem of constructing an intrusion detection technique for CPSes as an optimization problem. Given a set of specifications defining the correct behaviour of the CPS, a set of fine-grained security monitors that observe the run-time behavior, and statistics gathered from fault injection, we discover a sparse subset of security monitors that achieves full attack coverage. We present ARTINALI#: a greedy technique based on a feature selection algorithm that uses a Bayesian network to predict the probability of full attack coverage given information from the partial attack coverage of security monitors. The use of Bayesian networks along with feature selection has been shown to be effective in reducing dimensionality in a variety of applications, including data set creation [33], post-silicon validation [34], and fault diagnosis [35]. We use Bayesian-based feature selection to build an efficient IDS for CPSes. Specifically, we use Bayesian inference as the scoring function in a feature selection algorithm to select a small subset of security monitors whose attack coverage exceeds a user-provided lower bound. Then, we capture data from only these monitors to evaluate whether the IDS achieves high detection accuracy at run-time. To the best of our knowledge, we are the first to design an intrusion detection technique for CPS systems that preserves detection accuracy under resource constraints. We make the following contributions:
- We present ARTINALI#, which discovers the core set of security monitors and a rich set of specifications that yield high accuracy with low overhead.
- We deploy ARTINALI# in the context of ARTINALI, which is an intrusion detection technique designed for CPS systems.
- We build an IDS prototype for two CPS systems: an advanced metering infrastructure and a smart artificial pancreas.
- We evaluate our IDS on the two systems using arbitrary attacks emulated by fault injection. We find that ARTINALI# exhibits a 64% and 23% reduction in the number of security monitors and specifications, respectively, which in turn leads to a 69% and 52% decrease in IDS run-time and space consumption, respectively, while preserving 98% detection accuracy against arbitrary attacks.
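The monitor-selection idea described above (greedy feature selection that stops once attack coverage reaches a user-provided lower bound) can be sketched as follows. This is a simplified illustration and not the ARTINALI# algorithm: the Bayesian-network scoring function is replaced by an assumed empirical score, namely the fraction of injected attacks detected by at least one selected monitor, and the detection table `detects` is made-up example data.

```python
from typing import Dict, FrozenSet, Iterable, List, Set


def coverage(selected: Iterable[str],
             detects: Dict[str, FrozenSet[int]],
             n_attacks: int) -> float:
    """Assumed score: fraction of injected attacks caught by the selected monitors."""
    covered: Set[int] = set()
    for monitor in selected:
        covered |= detects[monitor]
    return len(covered) / n_attacks


def greedy_select(detects: Dict[str, FrozenSet[int]],
                  n_attacks: int,
                  lower_bound: float) -> List[str]:
    """Add the monitor with the best score until the coverage bound is met."""
    selected: List[str] = []
    remaining = sorted(detects)  # sorted for deterministic tie-breaking
    current = 0.0
    while remaining and current < lower_bound:
        best = max(remaining,
                   key=lambda m: coverage(selected + [m], detects, n_attacks))
        best_score = coverage(selected + [best], detects, n_attacks)
        if best_score <= current:
            break  # no remaining monitor improves coverage
        selected.append(best)
        remaining.remove(best)
        current = best_score
    return selected


# Hypothetical fault-injection results: attack ids detected by each monitor.
detects = {
    "m1": frozenset({0, 1, 2}),
    "m2": frozenset({2, 3}),
    "m3": frozenset({3, 4, 5}),
    "m4": frozenset({1}),
}
print(greedy_select(detects, n_attacks=6, lower_bound=1.0))  # ['m1', 'm3']
```

In this example two of the four monitors already cover all six injected attacks, so the other two can be dropped without losing coverage. A lower bound below 1.0 trades coverage for an even smaller monitor set.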
The rest of the paper is organized as follows: Section 2 presents existing intrusion detection techniques and their current shortcomings for CPS systems. Section 3 presents background material, including an overview of ARTINALI and Bayesian networks. In Section 4, we present ARTINALI#. Section 5 introduces our case studies, explains how to build an IDS using ARTINALI#, and outlines our experimental procedure. Finally, we present an evaluation of our technique in the face of arbitrary attacks in Section 6.
Section snippets
Previous work
Prior research has explored different intrusion detection techniques. However, these approaches have some of the following limitations:
- They often detect only certain categories of known attacks at run-time.
- They analyze only coarse-grained information, which affects detection accuracy, especially for forensic analysis.
- They do not consider intrinsic code properties for attack detection.
- They are not designed to take into account the resource constraints of the underlying platform.
We explore related work in
ARTINALI
ARTINALI is a dynamic specification mining technique that generates models of CPS correct behavior for specification-based IDSes [10]. ARTINALI models the security policy of a system by defining the set of specifications that must hold true during run-time. A specification, or interchangeably an invariant, is a logical condition that is preserved at a particular set of program points, e.g., the insulin dosage taken for a diabetic patient by a
Methodology
We present a systematic way to find a good trade-off between IDS attack coverage and CPS resource constraints. First, we present our problem formulation. Second, we introduce the ARTINALI# algorithm as our solution.
Experimental setup
As a proof of concept, we build an IDS based on ARTINALI# and deploy it in the context of two CPSes: an advanced metering infrastructure and a smart artificial pancreas. We begin by stating the research questions (RQs) we address and introducing the two CPS platforms. Then, we detail the procedure that we follow to build the IDS. Finally, we introduce the metrics we use to evaluate the IDS.
Evaluation
We evaluate the IDS on our CPS platforms before and after optimization using the evaluation metrics presented in Section 5.4, addressing each research question in its own sub-section.
Discussion
We next examine the threats to the validity of our experiments and reflect on ARTINALI#’s generalizability.
Conclusion
The resource constraints of cyber-physical systems make tailoring security solutions to them challenging. We formulated the problem of constructing an intrusion detection system for cyber-physical systems as an optimization problem. We developed ARTINALI#: a greedy technique based on a hybrid feature selection algorithm that uses a Bayesian network to approximate the probability of full attack detection given information from the partial detection of security monitors. Given a
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgment
This work was supported in part by the Natural Sciences and Engineering Research Council of Canada (NSERC).
References (79)
- et al.
Security framework for industrial collaborative robotic cyber-physical systems
Comput. Ind.
(2018)
- et al.
An efficient feature selection based Bayesian and rough set approach for intrusion detection
Appl. Soft Comput.
(2020)
- et al.
Feature selection based on Bayesian network for chiller fault diagnosis from the perspective of field applications
Appl. Thermal Eng.
(2018)
- et al.
The daikon system for dynamic detection of likely invariants
Sci. Comput. Programm.
(2007)
- et al.
Orthogonal floating search algorithms: from the perspective of nonlinear system identification
Neurocomputing
(2019)
- et al.
Challenges for securing cyber physical systems
Workshop on Future Directions in Cyber-Physical Systems Security
(2009)
- et al.
Software control and intellectual property protection in cyber-physical systems
EURASIP J. Inf. Secur.
(2016)
Researchers fight to keep implanted medical devices safe from hackers
Computer
(2010)
- et al.
Design-level and code-level security analysis of IoT devices
ACM Trans. Embedded Comput. Syst. (TECS)
(2019)
- et al.
Out of control: stealthy attacks against robotic vehicles protected by control-based techniques
Proceedings of the 35th Annual Computer Security Applications Conference
(2019)
Comprehensive experimental analyses of automotive attack surfaces.
USENIX Security Symposium
Experimental security analysis of a modern automobile
2010 IEEE Symposium on Security and Privacy
A survey of intrusion detection techniques for cyber-physical systems
ACM Comput. Surv. (CSUR)
Artinali: dynamic invariant detection for cyber-physical system security
Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering
Intrusion detection for advanced metering infrastructures: requirements and architectural directions
2010 First IEEE International Conference on Smart Grid Communications (SmartGridComm)
Anomaly detection in cyber physical systems using recurrent neural networks
2017 IEEE 18th International Symposium on High Assurance Systems Engineering (HASE)
Specification-based monitoring of cyber-physical systems: a survey on theory, tools and applications
Lectures on Runtime Verification
Context-, flow-, and field-sensitive data-flow analysis using synchronized pushdown systems
Proc. ACM Program. Lang.
Static specification mining using automata-based abstractions
IEEE Trans. Softw. Eng.
Symbolic mining of temporal specifications
Proceedings of the 30th International Conference on Software Engineering
Efficient context-sensitive intrusion detection.
NDSS
Nar-miner: discovering negative association rules from code for bug detection
Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
Detecting bugs by discovering expectations and their violations
IEEE Trans. Softw. Eng.
Finding what’s not there: a new approach to revealing neglected conditions in software
Proceedings of the 2007 International Symposium on Software Testing and Analysis
Antminer: mining more bugs by reducing noise interference
Proceedings of the 38th International Conference on Software Engineering
LSTM-based system-call language modeling and robust ensemble method for designing host-based intrusion detection systems
Host based intrusion detection system with combined CNN/RNN model
Joint European Conference on Machine Learning and Knowledge Discovery in Databases
Henet: A deep learning approach on intel® processor trace for effective exploit detection
2018 IEEE Security and Privacy Workshops (SPW)
Unicorn: runtime provenance-based detector for advanced persistent threats
Genealog: fine-grained data streaming provenance at the edge
Proceedings of the 19th International Middleware Conference
Runtime analysis of whole-system provenance
Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security
Dynamically discovering likely program invariants to support program evolution
IEEE Trans. Softw. Eng.
General ltl specification mining (t)
2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE)
Mobile network intrusion detection for IoT system based on transfer learning algorithm
Cluster Comput.
Window-based statistical analysis of timing subcomponents for efficient detection of malware in life-critical systems
2019 Spring Simulation Conference (SpringSim)
Time-based intrusion detection in cyber-physical systems
Proceedings of the 1st ACM/IEEE International Conference on Cyber-Physical Systems
Reducing post-silicon coverage monitoring overhead with emulation and bayesian feature selection
2015 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)
A quantitative methodology for security monitor deployment
2016 46th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN)
Inferring and asserting distributed system invariants
Proceedings of the 40th International Conference on Software Engineering