ARTINALI#: An Efficient Intrusion Detection Technique for Resource-Constrained Cyber-Physical Systems

https://doi.org/10.1016/j.ijcip.2021.100430

Abstract

Cyber-Physical Systems (CPSes) are integrated into security-critical infrastructures such as medical devices, autonomous vehicles and smart grids. Unfortunately, the pervasiveness and network accessibility of these systems, together with their relative lack of security measures, make them attractive targets for attacks. This makes building Intrusion Detection Systems (IDSes) for CPSes a necessity. However, detecting intrusions requires collecting information about a system’s internal workings, which can be expensive in both runtime and memory consumption. According to prior research, fine-grained monitoring of a CPS maximizes the chance of intrusion detection but incurs overhead that can exceed the resource constraints of these systems. The objective of this study is to propose a solution for adapting IDSes for deployment on resource-limited CPSes without losing detection accuracy.

We propose ARTINALI#, a Bayesian-based search-and-score technique that identifies the critical points at which to instrument a CPS. Given a set of security monitors that observe the run-time behavior of the system, a set of specifications that verify the correct behavior of the system, and statistics gathered from fault injection, ARTINALI# discovers a small set of locations and a rich set of specifications that yield full attack coverage with low memory and time overhead. We deploy ARTINALI# to construct an IDS for two CPSes: a smart meter and a smart artificial pancreas. We demonstrate that our technique reduces the number of security monitors by 64% on average, leading to 52% and 69% reductions in memory and runtime overhead respectively, while still detecting over 98% of emulated attacks on average. ARTINALI# makes IDSes applicable to a wide range of CPSes with different resource capacities. In addition, it accelerates attack detection, which is essential for safety-critical systems.

Introduction

A CPS is the key element of the Internet of Things (IoT). It is composed of a cyber system, a physical system, sensors, actuators, and networking components, through which it integrates computation with the physical environment. The cyber system (aka control program) controls the physical environment via actuators and receives feedback from it via sensors in real time. As the interaction between the physical domain and the cyber domain increases, the physical system becomes more susceptible to the security vulnerabilities that might exist in the control program [1], [2]. Therefore, the security of the entire system strongly depends on the security of the CPS’s control program. Recently, CPSes have been widely deployed in critical infrastructure such as smart medical devices [3], robots [4], smart grids [5], and autonomous vehicles [6], [7], [8]. These systems perform sensitive tasks and are therefore potential targets for cyberattacks. However, the rapid growth of IoT has led to the deployment of CPSes without support for enforcing important security properties.

Intrusion Detection Systems (IDSes) are used to monitor computer systems and detect security attacks. Typical IDSes fall into two major categories: signature-based and behavior-based. Signature-based IDSes compare the real-time behavior of the system against known security attacks. Because they rely on known attack models (signatures), they cannot detect unknown attacks [9]. This limitation is particularly important for CPSes, since they work autonomously for long periods of time and are therefore difficult to interrupt for frequent patching or upgrading in the field. In contrast, behavior-based systems detect intrusions by watching a system’s dynamic execution to identify suspect behavior, and are able to detect both known and unknown attacks. We can further divide behavior-based systems into anomaly-based and specification-based according to how suspect behavior is defined. In an anomaly-based IDS, we build a model of normal behavior and flag deviations from that model as intrusions; in a specification-based IDS, we assume that we have the correct specifications and we look for violations of those specifications.

Specification-based IDSes have been proposed as the best fit for CPS security [10], [11], [12], [13]. A specification-based IDS implements two core functions: data monitoring and data analysis. Data monitoring is the process by which an IDS observes system behavior and accumulates data logs. Data analysis is the process by which an IDS periodically analyzes the collected logs and checks them against the specifications derived from the CPS’s correct behavior. Data monitoring can be performed at the host level (host-based IDS) or the network level (network-based IDS). A host-based IDS is tailored to the CPS and monitors the operations of the CPS application and the operating system. A network-based IDS, however, is attached to the network and monitors all incoming and outgoing traffic. A host-based IDS provides more visibility than a network-based IDS into the individual CPS applications, and is thus able to quickly detect CPS misbehavior. Another important advantage of using a host-based IDS is distributed control over attack detection; this is especially the case for high-volume configurations like smart grids.
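The split between data monitoring and data analysis can be illustrated with a minimal Python sketch. It is not ARTINALI's implementation: the class names, the monitor names (`deliver_insulin`, `read_glucose`), the single invariant, and the `MAX_DOSE` bound are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class LogEntry:
    """One observation: which monitor fired, when, and the values it saw."""
    monitor: str
    time: float
    values: Dict[str, float]


@dataclass
class Specification:
    """A named predicate (invariant) bound to the monitor it constrains."""
    name: str
    monitor: str
    holds: Callable[[LogEntry], bool]


@dataclass
class SpecificationIDS:
    specs: List[Specification]
    log: List[LogEntry] = field(default_factory=list)

    def record(self, monitor: str, time: float, **values: float) -> None:
        """Data monitoring: accumulate an observation in the log."""
        self.log.append(LogEntry(monitor, time, dict(values)))

    def analyze(self) -> List[Tuple[str, float]]:
        """Data analysis: check the log against the specs, then clear it."""
        violations = [
            (spec.name, entry.time)
            for entry in self.log
            for spec in self.specs
            if spec.monitor == entry.monitor and not spec.holds(entry)
        ]
        self.log.clear()
        return violations


# Hypothetical invariant: a delivered dose never exceeds an assumed bound.
MAX_DOSE = 5.0
ids = SpecificationIDS([
    Specification("dose_within_bound", "deliver_insulin",
                  lambda e: e.values["dose"] <= MAX_DOSE),
])
ids.record("deliver_insulin", 0.0, dose=2.5)
ids.record("deliver_insulin", 1.0, dose=9.0)   # violates the invariant
ids.record("read_glucose", 1.5, level=110.0)   # no spec bound to this monitor
alerts = ids.analyze()
print(alerts)  # [('dose_within_bound', 1.0)]
```

In this sketch, every monitor adds entries to the log and every specification adds work to `analyze`, so both memory and analysis time grow with the number of monitors and specifications.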

The locations in the system where data monitoring happens are called security monitors. IDSes depend on the information collected by the security monitors, so it is important that they capture adequate information about the run-time behavior of the system. However, smart security solutions for CPSes need to support light-weight intelligence. On one hand, deploying security monitors that collect complete information maximizes the chance of attack detection at the cost of memory usage and performance overhead, which may limit scalability. On the other hand, CPSes have specific constraints that make current IDSes challenging to deploy on them. These constraints are:

  • Limited memory: An IDS that is tailored to a CPS should satisfy its resource constraints. For instance, an essential module of an IDS is a pre-trained model (i.e., a set of mined specifications) that represents the correct behavior of the CPS. In some cases, the available memory is not even sufficient to hold this model; further, a large number of security monitors create large log files that, in turn, may make the system run out of memory. These scenarios make many existing intrusion detection techniques inapplicable to CPSes [5].

  • Real-time requirements: Real-time CPS applications place strict constraints on processing and reaction time. For example, self-driving cars need to quickly detect objects and pedestrians and make decisions on lane or speed changes. From a security point of view, taking the real-time requirements into account is vital, as the IDS performance overhead must not delay the expected response time of the system, particularly in decision-making scenarios. Hence, the performance overhead must be small enough to meet the CPS real-time constraints.

Due to these constraints, existing IDSes are not a good fit for CPS platforms. There has been a lot of research on intrusion detection techniques using static analysis [14], [15], [16], [17], dynamic analysis [18], [19], [20], [21], artificial intelligence [22], [23], [24], and provenance [25], [26], [27]. Static analysis-based specification mining techniques build a model of a system based on code analysis. These techniques are inherently conservative and produce few false positives. However, static analysis alone does not provide enough information about the run-time behavior of the system, which in turn produces a lot of false negatives. Furthermore, these techniques generate large models, leading to high overheads that often exceed the resource constraints of a CPS. Dynamic analysis-based specification mining techniques, in contrast, observe the run-time behavior. They log the key points of the program to infer a set of likely invariants (aka specifications). Dynamic analysis-based techniques follow the assumption that common behavior is correct behavior, and hence their mined specifications reflect the common behavior rather than the potential behavior identified by static analysis. As most software systems are not provided with adequate test cases, there is a chance that some execution paths are not seen when mining specifications. The result is a high false positive rate [28], [29], which makes these techniques challenging for mission-critical CPSes. Provenance-based approaches [27] are a particular instance of a dynamic analysis-based approach. However, collecting data provenance in a fine-grained setting imposes excessive runtime overhead [26]. Deep Learning (DL) algorithms are accurate for modeling the behavior of complex systems and detecting unknown attacks, but they consume a lot of memory, posing a problem for resource-constrained environments.

While there has been a significant amount of work on CPS security [10], [30], [31], [32], these techniques offer no systematic way to find the best trade-off between accuracy and efficiency. They reduce the size of the model through coarse data monitoring, or they detect only certain categories of attacks at run-time. As a result, their models do not guarantee full attack coverage.

We formulate the problem of constructing an intrusion detection technique for CPSes as an optimization problem. Given a set of specifications defining the correct behavior of the CPS, a set of fine-grained security monitors that observe the run-time behavior, and statistics gathered from fault injection, we discover a sparse subset of security monitors that achieves full attack coverage. We present ARTINALI#: a greedy technique based on a feature selection algorithm that uses a Bayesian network to predict the probability of full attack coverage given information from the partial attack coverage of security monitors. The use of Bayesian networks along with feature selection has been shown to be effective in reducing dimensionality in a variety of applications, including data set creation [33], post-silicon validation [34], and fault diagnosis [35]. We use Bayesian-based feature selection to build an efficient IDS for CPSes. Specifically, we use Bayesian inference as a scoring function in a feature selection algorithm to select a small subset of security monitors whose attack coverage exceeds a user-provided lower bound. Then, we capture data from only these monitors to evaluate whether the IDS achieves high detection accuracy at run-time. To the best of our knowledge, we are the first to design an intrusion detection technique for CPSes that preserves detection accuracy under resource constraints. We make the following contributions:

  • We present ARTINALI#, which discovers the core set of security monitors and a rich set of specifications that yield high accuracy with low overhead.

  • We deploy ARTINALI# in the context of ARTINALI, which is an intrusion detection technique designed for CPSes.

  • We build an IDS prototype for two CPSes, an advanced metering infrastructure and a smart artificial pancreas.

  • We evaluate our IDS on the two systems using arbitrary attacks emulated by fault injection. We find that ARTINALI# exhibits 64% and 23% reduction in the number of security monitors and specifications, respectively, which in turn, leads to 69% and 52% decrease in IDS runtime and space consumption, respectively, while preserving 98% detection accuracy against arbitrary attacks.

The rest of the paper is organized as follows: Section 2 presents existing intrusion detection techniques and their current shortcomings for CPSes. Section 3 presents background material, including an overview of ARTINALI and Bayesian networks. In Section 4, we present ARTINALI#. Section 5 introduces our case studies, explains how to build an IDS using ARTINALI#, and outlines our experimental procedure. Finally, we present an evaluation of our technique in the face of arbitrary attacks in Section 6.

Previous work

Prior research has explored different intrusion detection techniques. However, these approaches have some of the following limitations:

  • They often detect only certain categories of known attacks at run-time.

  • They analyze only coarse-grained information, which affects detection accuracy, especially for forensic analysis.

  • They do not consider intrinsic code properties for attack detection.

  • They are not designed to take into account the resource constraints of the underlying platform.

We explore related work in

ARTINALI

ARTINALI is a dynamic specification mining technique that generates models of CPS correct behavior for specification-based IDSes [10]. ARTINALI models the security policy of a system by defining the set of specifications that must hold true during run time. A specification, or interchangeably an invariant, is a logical condition that is preserved at a particular set of program points, e.g., the insulin dosage taken for a diabetic patient by a

Methodology

We present a systematic way to find a good trade-off between IDS attack coverage and CPS resource constraints. First, we present our problem formulation. Second, we introduce the ARTINALI# algorithm as our solution.
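The search-and-score idea can be sketched in a few lines of Python. This is a simplified illustration, not the ARTINALI# algorithm: the score used here is the empirical fraction of injected attacks covered by the selected monitors, substituted for the Bayesian-network inference that ARTINALI# uses. The function names, the `detects` table, and the monitor names `m1` to `m5` are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def coverage(selected: List[str], detects: Dict[str, Set[int]],
             n_attacks: int) -> float:
    """Fraction of injected attacks detected by at least one selected monitor."""
    covered: Set[int] = set()
    for monitor in selected:
        covered |= detects[monitor]
    return len(covered) / n_attacks


def greedy_select(detects: Dict[str, Set[int]], n_attacks: int,
                  lower_bound: float) -> Tuple[List[str], float]:
    """Add the monitor with the best score until coverage >= lower_bound."""
    selected: List[str] = []
    remaining = sorted(detects)  # sorted for deterministic tie-breaking
    score = 0.0
    while remaining and score < lower_bound:
        best = max(remaining,
                   key=lambda m: coverage(selected + [m], detects, n_attacks))
        best_score = coverage(selected + [best], detects, n_attacks)
        if best_score <= score:  # no remaining monitor improves coverage
            break
        selected.append(best)
        remaining.remove(best)
        score = best_score
    return selected, score


# Hypothetical fault-injection statistics: monitor -> ids of attacks it detected.
detects = {
    "m1": {0, 1, 2},
    "m2": {2, 3},
    "m3": {1, 2},
    "m4": {3, 4, 5},
    "m5": {0},
}
selected, score = greedy_select(detects, n_attacks=6, lower_bound=1.0)
print(selected, score)  # ['m1', 'm4'] 1.0
```

In this toy example, two of the five monitors already cover all six injected attacks, so the other three could be dropped with no loss of coverage. Lowering `lower_bound` makes the search stop earlier and keep fewer monitors.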

Experimental setup

As a proof of concept, we built an IDS based on ARTINALI# and deployed it in the context of two CPSes: an advanced metering infrastructure and a smart artificial pancreas. We begin by stating the research questions (RQs) we address and introducing the two CPS platforms. Then, we detail the procedure that we follow to build the IDS. Finally, we introduce the metrics we use to evaluate the IDS.

Evaluation

We evaluate the IDS on our CPS platforms before and after optimization using the evaluation metrics presented in Section 5.4, addressing each research question in its own sub-section.

Discussion

We next examine the threats to the validity of our experiments and reflect on ARTINALI#’s generalizability.

Conclusion

Resource constraints of cyber-physical systems make tailoring security solutions to them challenging. We formulated the problem of constructing an intrusion detection system for cyber-physical systems as an optimization problem. We developed ARTINALI#: a greedy technique based on a hybrid feature selection algorithm that uses a Bayesian network to approximate the probability of full attack detection given information from the partial detection of security monitors. Given a

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

This work was supported in part by the Natural Sciences and Engineering Research Council of Canada (NSERC).

References (79)

  • S. Checkoway et al.

    Comprehensive experimental analyses of automotive attack surfaces

    USENIX Security Symposium

    (2011)
  • K. Koscher et al.

    Experimental security analysis of a modern automobile

    2010 IEEE Symposium on Security and Privacy

    (2010)
  • R. Mitchell et al.

    A survey of intrusion detection techniques for cyber-physical systems

    ACM Comput. Surv. (CSUR)

    (2014)
  • M.R. Aliabadi et al.

    ARTINALI: dynamic invariant detection for cyber-physical system security

    Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering

    (2017)
  • R. Berthier et al.

    Intrusion detection for advanced metering infrastructures: requirements and architectural directions

    2010 First IEEE International Conference on Smart Grid Communications (SmartGridComm)

    (2010)
  • J. Goh et al.

    Anomaly detection in cyber physical systems using recurrent neural networks

    2017 IEEE 18th International Symposium on High Assurance Systems Engineering (HASE)

    (2017)
  • E. Bartocci et al.

    Specification-based monitoring of cyber-physical systems: a survey on theory, tools and applications

    Lectures on Runtime Verification

    (2018)
  • J. Späth et al.

    Context-, flow-, and field-sensitive data-flow analysis using synchronized pushdown systems

    Proc. ACM Program. Lang.

    (2019)
  • S. Shoham et al.

    Static specification mining using automata-based abstractions

    IEEE Trans. Softw. Eng.

    (2008)
  • M. Gabel et al.

    Symbolic mining of temporal specifications

    Proceedings of the 30th International Conference on Software Engineering

    (2008)
  • J.T. Giffin et al.

    Efficient context-sensitive intrusion detection

    NDSS

    (2004)
  • P. Bian et al.

    NAR-Miner: discovering negative association rules from code for bug detection

    Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering

    (2018)
  • P. Bian et al.

    Detecting bugs by discovering expectations and their violations

    IEEE Trans. Softw. Eng.

    (2018)
  • R.-Y. Chang et al.

    Finding what’s not there: a new approach to revealing neglected conditions in software

    Proceedings of the 2007 International Symposium on Software Testing and Analysis

    (2007)
  • B. Liang et al.

    AntMiner: mining more bugs by reducing noise interference

    Proceedings of the 38th International Conference on Software Engineering

    (2016)
  • G. Kim et al.

    LSTM-based system-call language modeling and robust ensemble method for designing host-based intrusion detection systems

  • A. Chawla et al.

    Host based intrusion detection system with combined CNN/RNN model

    Joint European Conference on Machine Learning and Knowledge Discovery in Databases

    (2018)
  • L. Chen et al.

    HeNet: a deep learning approach on Intel® processor trace for effective exploit detection

    2018 IEEE Security and Privacy Workshops (SPW)

    (2018)
  • X. Han et al.

    Unicorn: runtime provenance-based detector for advanced persistent threats

  • D. Palyvos-Giannas et al.

    GeneaLog: fine-grained data streaming provenance at the edge

    Proceedings of the 19th International Middleware Conference

    (2018)
  • T. Pasquier et al.

    Runtime analysis of whole-system provenance

    Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security

    (2018)
  • M.D. Ernst et al.

    Dynamically discovering likely program invariants to support program evolution

    IEEE Trans. Softw. Eng.

    (2001)
  • C. Lemieux et al.

    General LTL specification mining (T)

    2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE)

    (2015)
  • L. Deng et al.

    Mobile network intrusion detection for IoT system based on transfer learning algorithm

    Cluster Comput.

    (2019)
  • N. Carreon et al.

    Window-based statistical analysis of timing subcomponents for efficient detection of malware in life-critical systems

    2019 Spring Simulation Conference (SpringSim)

    (2019)
  • C. Zimmer et al.

    Time-based intrusion detection in cyber-physical systems

    Proceedings of the 1st ACM/IEEE International Conference on Cyber-Physical Systems

    (2010)
  • R.O. Gallardo et al.

    Reducing post-silicon coverage monitoring overhead with emulation and Bayesian feature selection

    2015 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)

    (2015)
  • U. Thakore et al.

    A quantitative methodology for security monitor deployment

    2016 46th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN)

    (2016)
  • S. Grant et al.

    Inferring and asserting distributed system invariants

    Proceedings of the 40th International Conference on Software Engineering

    (2018)