FL-SERENADE: Federated Learning for SEmi-supeRvisEd Network Anomaly DEtection. A Case Study | IEEE Conference Publication | IEEE Xplore

FL-SERENADE: Federated Learning for SEmi-supeRvisEd Network Anomaly DEtection. A Case Study


Abstract:

The use of connected devices in industry is both a necessity and a challenge. Building a network of interconnected industrial assets can improve performance and scale, but it also exposes those assets to serious security risks and attacks. Moreover, the amount of data shared and the widespread distribution of devices make the protection of industrial resources cumbersome. One problem is knowing what type of information is flowing and checking it for anomalies, which makes the job of anomaly-based Intrusion Detection Systems (IDSs) arduous. In this direction, we explore a semi-supervised approach, “Deep-SAD,” to cope with partial knowledge of the data. Because of the size of the data and the need for privacy measures, we combine this model with the federated learning (FL) framework “Flower,” distributing the learning phase across five industrial areas. We evaluate our implementation on the WUSTL-IIoT-2021 dataset, a testbed simulation of an actual plant into which threats have been injected. This work presents and evaluates a framework for semi-supervised anomaly detection, starting with feature engineering. The results reveal that the difference in performance between the federated and centralized settings is minimal, which indicates that the federated approach is promising. Combined with the security and privacy-preserving characteristics of FL, this demonstrates the value of the federated approach for semi-supervised anomaly-based IDSs in IIoT networks.
Date of Conference: 14-17 November 2023
Date Added to IEEE Xplore: 25 December 2023
Conference Location: Abu Dhabi, United Arab Emirates

I. Introduction

The ongoing adoption of information technology in industrial infrastructure is the primary driver of the emerging fourth industrial revolution, also known as Industry 4.0. It aims to radically change industrial sites by interconnecting smart devices, workers, suppliers, and customers to create intelligent systems capable of advanced analytics and decision-making. Industrial IoT (IIoT) is an essential component of Industry 4.0. In a broad sense, it refers to enhancing industrial systems with “smart” devices such as sensors, actuators, and RFID tags. However, the hardware and software used in IoT devices are very heterogeneous, and current technology still lacks consistent security standards and risk assessment norms [1], [2]. At the same time, industrial networks usually rely on legacy systems from the so-called operational technology (OT) sector, which were designed under different requirements because they were traditionally detached from the Internet. As a result, attacks against industrial systems are on the rise due to the larger, more complex attack surface [3]. According to the 2021 Kaspersky ICS CERT report [4], 39.6% of all industrial computers that used Kaspersky tools were attacked during the year. Many industrial sites belong to critical infrastructure, and an interruption to their service might result in enormous costs and even lead to catastrophic events. For example, cyber-attacks on the Ukrainian power grid in 2015 resulted in a massive power outage that lasted for multiple hours [5]. Denial of Service (DoS) attacks therefore represent a real threat, yet tackling them in such a scenario is cumbersome [6]–[11]. In this context, anomaly-based Intrusion Detection Systems (IDSs) can model the general “regular” traffic and detect unusual activity. Modern IDSs usually require developing and training machine learning (ML) models, which demands large datasets and extensive computing resources.
Hence, the deployment of ML applications typically occurs in a cloud environment, often outsourced to third-party vendors. In this setup, the data flows through the network to a central cloud location for ML training. However, this approach has shortcomings in the case of industrial IDSs [12]. First, sharing network traffic with a third-party cloud can breach privacy regulations, particularly if the provider resides in a different country. Furthermore, connecting all end nodes to the Internet and transferring confidential and sensitive data creates additional vulnerabilities. Finally, uploading massive volumes of industrial network data to a distant cloud location results in high bandwidth costs and latency issues [13] in critical industry scenarios with real-time constraints, such as power grids and nuclear plants.
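
To contrast the centralized cloud setup with the federated setting named in the abstract, the following is a minimal sketch of how a server might combine model parameters from five sites without ever receiving their raw traffic. It assumes weighted federated averaging (FedAvg) as the aggregation rule, which this excerpt does not specify. The function name `federated_average`, the flat weight vectors, and the sample counts are hypothetical and chosen only for illustration; this is not the Flower API and not the authors' implementation.

```python
from typing import List, Sequence, Tuple

Weights = List[float]  # a flattened model parameter vector (illustrative)


def federated_average(updates: Sequence[Tuple[Weights, int]]) -> Weights:
    """Average client weight vectors, weighted by local sample count."""
    if not updates:
        raise ValueError("need at least one client update")
    total = sum(n for _, n in updates)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(updates[0][0])
    if any(len(w) != dim for w, _ in updates):
        raise ValueError("all clients must share one model shape")
    # Each coordinate is the sample-weighted mean across clients.
    return [
        sum(w[i] * n for w, n in updates) / total
        for i in range(dim)
    ]


# Five hypothetical industrial areas, each reporting (weights, n_samples).
# The values are made up; only parameters, not traffic records, are shared.
client_updates = [
    ([1.0, 0.0], 100),
    ([0.0, 1.0], 100),
    ([1.0, 1.0], 200),
    ([0.0, 0.0], 100),
    ([2.0, 2.0], 500),
]
global_weights = federated_average(client_updates)
```

In this sketch the area with the most local samples contributes most to the global model, and the server only ever sees parameter vectors and sample counts.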
