Open Access, CC BY 4.0 license. Published by De Gruyter, September 8, 2023

Device discovery and identification in industrial networks

Geräteerkennung und -identifizierung in industriellen Netzen

  • Klaus Biß, Jörg Kippe and Markus Karch

Abstract

The Act on the Federal Office for Information Security (BSI Act) explicitly mandates the use of attack detection systems. The BSI works together with operators of process plants as well as discrete manufacturing facilities in order to test sensors, which may be part of such systems, in their networks. This gives the BSI the opportunity to record a collection of network traffic from those plants. One goal is to improve the detection and characterization of devices in industrial networks by implementing new or enhanced features for the open source network monitoring tool suite Malcolm. In this context, the recording of the network traffic represents the starting point for further investigations. This paper highlights what needs to be considered in these recordings to serve as a basis for device identification and characterization.

Zusammenfassung

Das Gesetz über das Bundesamt für Sicherheit in der Informationstechnik (BSI-Gesetz) fordert ausdrücklich den Einsatz von Systemen zur Angriffserkennung. Das BSI arbeitet mit Betreibern von verfahrenstechnischen Anlagen sowie diskreten Fertigungsanlagen zusammen, um mögliche Sensorik hierfür in Anlagen zu testen. Dadurch hat das BSI die Möglichkeit, eine möglichst heterogene Sammlung von Netzwerkmitschnitten aus Anlagen aufzuzeichnen. Ein Ziel ist dabei, die Identifikation und Charakterisierung von Geräten in industriellen Netzwerken zu verbessern. Dadurch sollen neue oder bessere Anwendungen für die Open-Source-Netzwerküberwachungstoolsuite Malcolm bereitgestellt werden. In diesem Zusammenhang bildet die Aufzeichnung des Netzwerkverkehrs den Ausgangspunkt für weitere Untersuchungen. Diese Veröffentlichung zeigt auf, was bei diesen Aufzeichnungen zu beachten ist, um als Grundlage für die Geräteidentifikation und -charakterisierung zu dienen.

1 Introduction

1.1 Motivation

The Act on the Federal Office for Information Security (BSI Act – BSIG) explicitly mentions the use of attack detection systems in Section 8a (1a) BSIG.[1]

Therefore, operators of Critical Infrastructures are interested in testing such systems in their networks, as they are obliged to take appropriate organizational and technical precautions to prevent disruptions to the availability, integrity, authenticity and confidentiality of their information technology systems, components or processes. An attack detection system represents an effective measure for (early) detection of cyber-attacks and, in particular, supports damage reduction and damage prevention.[2]

The German Federal Office for Information Security (BSI) works together with operators of process plants as well as discrete manufacturing facilities in order to test intrusion detection systems (IDS), which can be part of an attack detection system, at the operator’s site. This gives the BSI the opportunity to record a collection of network traffic from production plants that is as heterogeneous as possible concerning the protocols, device types, and manufacturers used.

Interested parties (operators) were recruited through a newsletter of the Alliance for Cybersecurity[3] and numerous technical presentations.

While operators gain insight into several IDS, the BSI gains a broad overview of industrial networks. One goal is to improve the methods to discover and identify devices in industrial networks in order to support the open source tool suite Malcolm [4] for network monitoring in industrial control systems. In this context, the recording of the network traffic represents the starting point for further investigations. This paper highlights what needs to be considered in these recordings to serve as a basis for device discovery and identification. It also describes where and how such network data can be recorded.

1.2 Architecture and structure

To achieve the basis for device discovery and identification, corresponding network data is necessary. A prototypical implementation (see Section 6) has been done and is subject to current work. A generic architecture of the implementation is depicted in Figure 1 and shows the following main components:

Figure 1: 
Generic architecture for device detection and identification.

Observation Domain

  • Refers to the specific locations within a network where packets are monitored and captured, forming a set of observation points from which packet data are collected for further analysis.

Metering and Feature Extraction

  • Aggregates the packet flows delivered by the observation points and extracts key features suitable for further processing.

Device Discovery

  • Takes the extracted features and generates a device model from this information, describing the discovered device objects along with some topological information.

Device Identification

  • Provides additional information helping to determine the identity of the device objects.

Device Inventory

  • Database collecting and holding the device information.
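The interplay of these components can be sketched in a few lines of Python; all class and function names, as well as the OUI entry, are illustrative and not taken from Malcolm:

```python
from dataclasses import dataclass, field

@dataclass
class FeatureRecord:
    """Produced by the metering and feature extraction stage."""
    src_mac: str
    src_ip: str
    protocol: str

@dataclass
class Device:
    """Entry in the device inventory."""
    mac: str
    ips: set = field(default_factory=set)
    identity: dict = field(default_factory=dict)  # filled by identification

def discover(records, inventory):
    """Device discovery: create/update device objects from feature records."""
    for rec in records:
        dev = inventory.setdefault(rec.src_mac, Device(mac=rec.src_mac))
        dev.ips.add(rec.src_ip)
    return inventory

def identify(inventory, oui_db):
    """Device identification: enrich device objects, here via an OUI lookup."""
    for dev in inventory.values():
        prefix = dev.mac.upper()[:8]  # first three octets ("AA:BB:CC")
        if prefix in oui_db:
            dev.identity["manufacturer"] = oui_db[prefix]
    return inventory

# Minimal end-to-end run over two feature records from one interface
records = [FeatureRecord("00:1B:1B:11:22:33", "192.168.0.10", "tcp"),
           FeatureRecord("00:1B:1B:11:22:33", "192.168.0.10", "udp")]
inv = identify(discover(records, {}), {"00:1B:1B": "Siemens AG"})
print(inv["00:1B:1B:11:22:33"].identity["manufacturer"])  # Siemens AG
```

The inventory is a plain dictionary here; in the prototype described in Section 6 this role is played by a PostgreSQL database.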

This paper discusses the individual building blocks of the architecture shown in Figure 1. Section 2 provides an overview of the various aspects that must be considered while capturing network data. Section 3 focuses on the topic of feature extraction. Device discovery and identification are discussed in Sections 4 and 5, respectively. Finally, this paper presents the testing and evaluation environment established for device discovery and identification in this project. The experiences encountered during testing are also summarized in Section 6.

2 Packet data capture

Full packet data provide the most complete information about what is going on in the network. The disadvantages of full packet data are the volume of data and the large file sizes it produces.

2.1 Capture of packet data

Device Discovery and Device Identification in the context of this paper deal with the passive observation of packet data as the raw data source. This section describes various approaches to packet data capture, i.e. the collection of packet data from the network. This topic is well elaborated because it is an important part of network monitoring.

Multiple methods exist for packet capture, with three major approaches:

Hubs

  • A hub is a networking device that repeats a packet on every interface except the one on which it was received. All hosts connected to a hub see each other’s traffic. Hubs are technologically outdated due to speed and collision constraints. Nevertheless, a sensor plugged into an open hub port will see every packet transmitted between the other hosts sharing the same hub.

SPAN Ports

  • A SPAN (Switched Port ANalyzer) port is a designated switch port (it may also be found on routers, firewalls or access points) that duplicates network traffic to one or more specified ports, usually to export packets to a dedicated monitoring solution. SPAN is an efficient, high-performance traffic monitoring mechanism; the bandwidth of the output ports limits the amount of captured data. Three types of SPAN have to be distinguished:

    1. Local SPAN: mirrors traffic from one or more interfaces on a switch to one or more interfaces on the same switch.

    2. RSPAN: Remote SPAN is an extension of SPAN that allows monitoring traffic from source ports distributed over multiple switches.

    3. ERSPAN: Encapsulated Remote SPAN encapsulates the mirrored traffic in generic routing encapsulation (GRE), allowing it to be carried across layer 3 domains.

Taps

  • A tap (test access point) is a networking device specifically designed for monitoring applications. Taps are used to create permanent access ports for passive monitoring on a single line. They are the preferred packet capture solution especially on large or busy networks; while dedicated tap hardware is the most expensive option, it introduces no performance penalty. In contrast to a SPAN port, a tap works on layer 1 (the signal level) and therefore also forwards faulty packets.
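Regardless of whether the frames arrive via hub, SPAN port, or tap, the sensor ultimately reads raw layer 2 frames from its capture interface. A minimal sketch for Linux follows; the capture itself requires root privileges and a real interface name, while the parsing works on any raw frame:

```python
import socket
import struct

def parse_ethernet(frame: bytes):
    """Split a raw Ethernet frame into (dst MAC, src MAC, EtherType, payload)."""
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    fmt = lambda b: ":".join(f"{x:02x}" for x in b)
    return fmt(dst), fmt(src), ethertype, frame[14:]

def capture(interface: str, count: int = 10):
    """Read raw frames from a mirror/tap interface (Linux only, needs root)."""
    # ETH_P_ALL (0x0003) delivers frames of every protocol, including layer-2-only OT traffic
    s = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.htons(0x0003))
    s.bind((interface, 0))
    for _ in range(count):
        frame, _addr = s.recvfrom(65535)
        yield parse_ethernet(frame)

# The parser can be exercised without an interface, e.g. on a hand-crafted
# broadcast ARP frame (EtherType 0x0806):
frame = bytes.fromhex("ffffffffffff" "001b1b112233" "0806") + b"\x00" * 28
print(parse_ethernet(frame)[:3])  # ('ff:ff:ff:ff:ff:ff', '00:1b:1b:11:22:33', 2054)
```

Note that a frame read this way still depends on what the observation point forwards: a SPAN port will not deliver faulty frames, whereas a tap will.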

2.2 Observation domain

The network structure, as well as the location and type of observation points within the network determine which part of the network traffic can be captured:

A SPAN port may, depending on the configuration of the switch, observe all the traffic (or all the traffic belonging to a certain VLAN) passing through the switch. If a VLAN extends over multiple switches, an observation point per switch and per VLAN is needed for full packet capture. Such a solution requires compatible SPAN capability on all switches which are running the VLAN. Mixtures of star topology and line topology, as found in many OT networks, may require special handling (see also Subsection 2.3).

A tap observes the traffic on a single line (one or both directions). For more comprehensive coverage, multiple taps or combinations of taps and SPAN ports are used.

2.3 The position of observation points

The optimal placement of observation points within a network depends on the specific objectives and goals of the analysis. If the aim is to provide input data for cyber attack detection, it may be sufficient to observe the traffic at central points such as gateways where traffic is ingressing/egressing certain network areas. However, if the aim is to discover and identify as many network devices as possible, the observation of the complete network traffic is needed (full packet capture, FPC). This may require mirror ports on all respective switches and even additional taps. Multiple observation points also require an infrastructure to collect and merge the different mirror streams in order to draw a complete picture and distribute the data to the corresponding applications. This can be achieved by a packet broker (PB).

2.4 Storage and replay of packet data

The processing of packet data may take place in real-time or the packets may be stored for later processing. The most common data storage format is the PCAP format,[5] which is compatible with many analysis tools.

The processing of stored packet data requires the ability to handle the file formats used and, in particular, to process a set of separate files if the complete capture is spread across several files.
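Handling a capture split across several files can be sketched with the standard library alone; the writer is included only to make the example self-contained (real captures would come from the observation points):

```python
import heapq
import os
import struct
import tempfile

PCAP_MAGIC = b"\xd4\xc3\xb2\xa1"  # classic libpcap, little-endian, microseconds

def write_pcap(path, packets):
    """Write (timestamp, bytes) pairs as a minimal classic libpcap file."""
    with open(path, "wb") as f:
        # global header: version 2.4, zone 0, sigfigs 0, snaplen 65535, linktype 1 (Ethernet)
        f.write(PCAP_MAGIC + struct.pack("<HHiIII", 2, 4, 0, 0, 65535, 1))
        for ts, data in packets:
            sec, usec = int(ts), int((ts - int(ts)) * 1e6)
            f.write(struct.pack("<IIII", sec, usec, len(data), len(data)) + data)

def read_pcap(path):
    """Yield (timestamp, packet_bytes) records from a classic libpcap file."""
    with open(path, "rb") as f:
        f.read(24)  # skip the global header (format assumed known here)
        while True:
            hdr = f.read(16)
            if len(hdr) < 16:
                break
            sec, usec, incl_len, _orig_len = struct.unpack("<IIII", hdr)
            yield sec + usec / 1e6, f.read(incl_len)

def merge_captures(paths):
    """Merge several (individually ordered) capture files into one time-ordered stream."""
    return heapq.merge(*(read_pcap(p) for p in paths), key=lambda rec: rec[0])

# Two rotated capture chunks, merged back into a single ordered packet stream
d = tempfile.mkdtemp()
write_pcap(os.path.join(d, "a.pcap"), [(1.0, b"p1"), (3.0, b"p3")])
write_pcap(os.path.join(d, "b.pcap"), [(2.0, b"p2")])
merged = [pkt for _ts, pkt in merge_captures(
    [os.path.join(d, "a.pcap"), os.path.join(d, "b.pcap")])]
print(merged)  # [b'p1', b'p2', b'p3']
```

The sketch assumes classic little-endian PCAP; production tools additionally have to handle the big-endian and nanosecond variants as well as the newer pcapng format.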

2.5 Preservation of temporal relations

The analysis of temporal relations between network packets requires special considerations. OT protocols providing for real-time operations (e.g. PROFINET [1]) expect prioritised processing on network switches. This priority handling has to be observed in the packet broker infrastructure as well.

If such traffic is stored to PCAP files, the timestamps in the PCAP file originate from the PCAP writing entity, which may be far away from the network where the temporal behavior is of interest. This leads to a distinction between the observation point (switch, tap) and the storage point. Therefore, the PCAP file and its timestamps do not always reflect the temporal relations in the original traffic.

In situations where the capturing and analyzing of network traffic are separated, additional difficulties may arise. Specifically, when traffic is first stored and then replayed for analysis later (e.g. using a tool like tcpreplay), the replay may reflect the traffic at the storage point rather than at the observation point.
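Whether stored timestamps still preserve the temporal relations can be checked, for cyclic real-time traffic, by comparing inter-arrival gaps against the expected cycle time. A minimal sketch (cycle and tolerance values are illustrative):

```python
def inter_arrival_gaps(timestamps):
    """Inter-arrival times between consecutive packets of one stream."""
    return [b - a for a, b in zip(timestamps, timestamps[1:])]

def check_cycle(timestamps, cycle, tolerance):
    """Return packet indices whose gap deviates from the expected cyclic interval.

    Deviations may stem from the capture path (switch queues, packet broker,
    PCAP writing entity) rather than from the devices themselves.
    """
    return [i for i, gap in enumerate(inter_arrival_gaps(timestamps), start=1)
            if abs(gap - cycle) > tolerance]

# A nominal 10 ms send cycle with one delayed packet at index 3
ts = [0.000, 0.010, 0.020, 0.045, 0.055]
print(check_cycle(ts, cycle=0.010, tolerance=0.002))  # [3]
```

Such a check only indicates that the recorded timing deviates; it cannot distinguish between distortion in the capture infrastructure and genuine jitter on the wire.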

3 Feature extraction

This section provides a general description of features which can be extracted from a packet stream and may be used to derive identifying attributes of the communicating devices. Those features are either found directly in the data packets or built by abstraction over a set of packets.

3.1 Packet features/header parameters

Packet features are derived directly from single packets and typically represent choices made in the implementation of the communication stack. The reason for this is that the respective protocol standards (e.g. the RFCs describing TCP, UDP, ARP, ICMP) are not as precise as the reader may expect. There are some degrees of freedom for the programmer which make implementations distinguishable. As the communication stack is part of the operating system, this type of information is suitable for the detection of the type and version of the operating system (OS fingerprinting).

Two examples of tools that can be used for OS fingerprinting are PRADS[6] and p0f.[7]

3.2 Flow features

The most common abstraction on top of the packet stream is the flow model, from which further abstractions and features are derived. The flow model originates from Netflow, developed by Cisco to monitor network traffic passing through switches or routers (today Netflow is standardized as an RFC [2]). This flow model is based on the following 5-tuple of parameters identifying a flow:

  1. source and destination IP address

  2. source and destination port number

  3. protocol (TCP, UDP, ICMP).

When a packet has the same 5-tuple attribute values as an existing flow record, additional data are appended to that record, e.g.

  1. Accumulated number of bytes and packets

  2. TCP flags.

Such an abstraction concept may also be applied to layer 2 communication using e.g.

  1. source and destination MAC address

  2. network protocol (IPv4, IPv6, PN-RT, etc.).

A more abstract view considers only flows which are TCP connections. In this case a connection starts with a three-way handshake (SYN, SYN-ACK, ACK) and terminates when a RST or FIN has been observed.
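The 5-tuple aggregation described above can be sketched as follows; the packet records are illustrative dictionaries, and flows are kept unidirectional for brevity:

```python
from collections import defaultdict

def flow_key(pkt):
    """5-tuple identifying a (here unidirectional) flow."""
    return (pkt["src_ip"], pkt["dst_ip"], pkt["src_port"],
            pkt["dst_port"], pkt["proto"])

def aggregate_flows(packets):
    """Fold a packet stream into flow records keyed by the 5-tuple."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0, "tcp_flags": set()})
    for pkt in packets:
        rec = flows[flow_key(pkt)]          # new key starts a new flow record
        rec["packets"] += 1                  # accumulated packet count
        rec["bytes"] += pkt["length"]        # accumulated byte count
        rec["tcp_flags"] |= set(pkt.get("flags", ""))  # observed TCP flags
    return dict(flows)

# Two packets of one Modbus/TCP flow (port 502), same 5-tuple
packets = [
    {"src_ip": "10.0.0.5", "dst_ip": "10.0.0.1", "src_port": 49152,
     "dst_port": 502, "proto": "tcp", "length": 60, "flags": "S"},
    {"src_ip": "10.0.0.5", "dst_ip": "10.0.0.1", "src_port": 49152,
     "dst_port": 502, "proto": "tcp", "length": 120, "flags": "A"},
]
flows = aggregate_flows(packets)
rec = flows[("10.0.0.5", "10.0.0.1", 49152, 502, "tcp")]
print(rec["packets"], rec["bytes"], sorted(rec["tcp_flags"]))  # 2 180 ['A', 'S']
```

The idle and active timeouts used by Netflow-style tools would additionally close and re-open such records over time; this is omitted here.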

An examination of concrete implementations is now warranted to provide practical examples that illustrate the theoretical concepts discussed earlier:

Netflow uses the flow approach for TCP, UDP and ICMP flows. A flow starts when a new (so far unseen) 5-tuple is observed. For connection-oriented protocols a flow ends when flags like RST or FIN are observed, when no data have been received within a timeout interval (idle timeout, e.g. 30 s, usually configurable), or when a flow has been open for a certain time (active timeout, e.g. 15 min, usually configurable). In Netflow, flows may be handled as unidirectional or bidirectional flows (depending on the tool and possibly on the configuration).

Zeek [8] flows follow a more connection oriented model. A connection is started when a three-way-handshake has been observed and closed when RST or FIN has been observed. Zeek’s connection records represent bidirectional flows and cover connectionless protocols as well (e.g. UDP, ICMP).

A differentiation can be drawn between protocol and statistical parameters based on flow features. The subsequent subsections enumerate these parameters.

3.2.1 Protocol parameters

Examples of protocol parameters are:

Port Number:

  • The port number is address information on the transport layer (with respect to TCP and UDP) identifying a communicating entity.

Network Service:

  • The name of the network service belonging to the port number.

Protocol Role:

  • Assigns the client role and the server role to the communicating entities.

Payload Data:

  • Specific payload data depending on the protocol (e.g. DNS queries).

TCP Communication Relation:

  • For each TCP connection the following parameters may be used:

    1. source and destination address

    2. source and destination port

    3. number of packets per direction

    4. number of bytes per direction

    5. observed TCP flags

    6. duration of the connection.

  • Through the process of aggregation, it is possible to derive specific subsets, e.g.

    1. same tuple of destination IP and destination port (all clients of a server)

    2. same tuple of source IP, destination IP and destination port (all communication of a client with a dedicated server).

UDP Communication Relation:

  • For each UDP connection the following parameters may be used:

    1. source and destination address

    2. source and destination port

    3. number of packets per direction

    4. number of bytes per direction.

  • Through the process of aggregation, it is possible to derive specific subsets, e.g.

    1. same tuple of destination IP and destination port (all clients of a server)

    2. same tuple of source IP, destination IP and destination port (all communication of a client with a dedicated server).

ICMP:

  • ICMP packet type and packet code.

3.2.2 Statistical parameters

Statistical parameters apply to series of values (e.g. number of packets, number of bytes, packet length with respect to a certain connection or a certain communication relation). Most common and well-known statistics are the following:

  1. Minimum value

  2. Maximum value

  3. Standard deviation

  4. Mean value.
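These statistics can be computed directly with Python’s standard library; the packet-length series below is illustrative:

```python
import statistics

def flow_statistics(values):
    """Minimum, maximum, standard deviation and mean of a per-flow value series."""
    return {"min": min(values),
            "max": max(values),
            "stdev": statistics.pstdev(values),  # population standard deviation
            "mean": statistics.mean(values)}

# Packet lengths observed on one connection
stats = flow_statistics([60, 60, 120, 160])
print(stats)
```

Whether the population or the sample standard deviation (`statistics.stdev`) is appropriate depends on whether the flow is observed completely.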

4 Discovery

In this paper, Device Discovery and Device Identification are distinguished. In the context of this work, Device Discovery refers to the mere detection of the existence of a device, whereas Device Identification covers the determination of information describing the device identity (product role, product type, manufacturer, version number, etc., as far as possible).

4.1 Interface discovery

Since this work is based on packet data observed in the network, the subject of discovery is an interface transmitting one or more packets. A packet contains both layer 3 and layer 2 information:

For layer 3:

  1. Source and destination IP address and port number

  2. Transport protocol

  3. Application protocol.

For layer 2:

  1. Source and destination MAC address

  2. Network protocol.

4.2 Device discovery

The observed layer 3 and layer 2 interfaces may belong to the same device (the same communication stack), or the layer 2 interface may belong to a gateway while the layer 3 interface belongs to a system located in a remote subnet communicating via this gateway. The solution to this problem requires knowledge about the network structure. If it is assumed that the observation point is connected to a VLAN, the observed/discovered layer 2 interfaces are connected to this VLAN. Concerning the layer 3 interfaces, the local IP address ranges running on the VLAN are required. This allows distinguishing between local and remote layer 3 interfaces. In this manner, it becomes possible to establish a relationship between layer 3 and layer 2 interfaces that belong to the same communication stack. Unfortunately, this information is not sufficient to collect all interfaces belonging to a device. On the other hand, it becomes possible to identify the layer 2 interfaces that belong to a gateway device.
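The distinction between local stacks and gateway interfaces can be sketched as follows; the VLAN’s local IP range and all addresses are illustrative:

```python
import ipaddress

LOCAL_NETS = [ipaddress.ip_network("192.168.10.0/24")]  # IP ranges local to the VLAN

def classify_interfaces(observations):
    """Relate layer 2 and layer 3 interfaces using the local IP ranges.

    observations: iterable of (src_mac, src_ip) pairs seen on the VLAN.
    A MAC paired with a local IP is treated as one communication stack;
    a MAC seen with remote IPs is a candidate gateway interface.
    """
    devices, gateways = {}, set()
    for mac, ip in observations:
        if any(ipaddress.ip_address(ip) in net for net in LOCAL_NETS):
            devices.setdefault(mac, set()).add(ip)   # same stack
        else:
            gateways.add(mac)                        # remote subnet behind this MAC
    return devices, gateways

obs = [("aa:aa:aa:00:00:01", "192.168.10.5"),
       ("aa:aa:aa:00:00:02", "10.1.7.20"),   # remote subnet -> gateway candidate
       ("aa:aa:aa:00:00:02", "10.1.7.21")]
devices, gateways = classify_interfaces(obs)
print(devices)   # {'aa:aa:aa:00:00:01': {'192.168.10.5'}}
print(gateways)  # {'aa:aa:aa:00:00:02'}
```

As noted above, this heuristic cannot collect all interfaces of a multi-homed device; it only separates local stacks from gateway interfaces.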

5 Device identification

This section presents the Device Identification methods identified in this project. These methods are classified into data-driven approaches and concrete database lookups.

5.1 Database lookup

Static information in passively gathered network traffic can be used to identify device characteristics by a simple lookup in a dataset:

Media Access Control (MAC) address: The MAC address is a unique hardware address assigned to a network adapter that can be obtained from passively captured network traffic. EUI-48 [3] is the standardized MAC address format and consists of two parts: the first part identifies the hardware manufacturer (OUI – Organizationally Unique Identifier), while the remaining bits are assigned individually by the respective manufacturer for each interface. The IEEE Registration Authority (RA) is responsible for assigning and monitoring MAC address blocks. Information regarding current allocations and the respective vendor names can be obtained from IEEE. These datasets contain the contact information of the address block holders, enabling the mapping of an unknown MAC address prefix of a network device to a manufacturer. Numerous analysis tools, such as Wireshark[9] or Malcolm, leverage this knowledge to display the manufacturer of the associated network adapter. Nevertheless, as highlighted by the developers of macDetec[10] in [4], this approach does not facilitate the mapping to device specifics, such as specific products, due to the vendors’ freedom to allocate addresses within the assigned space.

The tool macDetec [4] attempts to address the challenge of identifying devices based on their MAC addresses when the vendor assignment scheme is unknown. It is based on the underlying assumption that companies typically allocate available MAC address blocks to distinct product families and assign the addresses within each block in a sequential manner. To associate unknown MAC addresses with their respective devices, a pre-existing database containing information on known devices is crucial. Figure 2 from [4] outlines the basic procedure: the figure shows two known MAC addresses that are stored in a database and compares them to an unknown MAC address. The distance between the unknown address and the two known addresses is calculated, with a smaller distance indicating a higher likelihood of correct identification. Based on this analysis, the unidentified device is determined to be more likely to belong to the same product family as Product X, which can be used for passive identification. Moreover, it is important to acknowledge that, beyond the fundamental assumption of a sequential assignment of MAC addresses to product families, not all product manufacturers produce the corresponding network adapters in-house; some procure them from external companies and subsequently integrate them into their products. This could pose a potential issue in the process of assigning MAC addresses to their respective device manufacturers.

Figure 2: 
Device identification based on MAC addresses in macDetec (Figure taken from [4]).
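The distance heuristic of macDetec can be illustrated with a few lines of Python; the MAC addresses and product names are invented, and the real tool’s scheme is more elaborate:

```python
def mac_to_int(mac: str) -> int:
    """Interpret a MAC address as a 48-bit integer."""
    return int(mac.replace(":", ""), 16)

def nearest_known(unknown: str, known: dict) -> str:
    """Guess the product family of an unknown MAC from the nearest known MAC.

    known maps MAC addresses to product names; the smallest numeric distance
    is taken as the most likely family (the core idea of macDetec).
    """
    u = mac_to_int(unknown)
    return min(known.items(), key=lambda kv: abs(mac_to_int(kv[0]) - u))[1]

known = {"00:1b:1b:00:10:05": "Product X",
         "00:1b:1b:00:80:a0": "Product Y"}
print(nearest_known("00:1b:1b:00:10:44", known))  # Product X
```

The caveat from the text applies directly to this sketch: if the adapter was procured from an external NIC vendor, the numeric neighbourhood says nothing about the product family.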

Fingerprinting using Packet features/Header parameters: As discussed in Section 3.1, header parameters of single network packets reflect implementation choices made by developers regarding the communication stack. As a result, different implementations may exhibit distinguishable variations in their packet features, which can be used for fingerprinting purposes to identify hosts on a network and determine the operating system they are running. For example, a SYN/ACK packet-based identification approach uses, among others, initial Time to Live (TTL) values for IP and varying TCP window sizes to identify hosts on a network and determine their operating system. Tools like p0f[11] utilize this approach for passive network fingerprinting. Additionally, p0f can identify application software (e.g. Firefox or Safari) through fingerprints extracted from packet headers of HTTP requests and responses. Besides HTTP, other approaches involve extracting fingerprints from application layer protocol packet headers to identify the operating system. For example, the Satori[12] tool uses fingerprints from protocols like DHCPv4, DHCPv6 or SMB. Both p0f and Satori offer a database mapping fingerprints to the corresponding operating system or application software.
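A minimal sketch of this kind of fingerprinting follows; the signature table is invented for illustration (real p0f signatures combine far more header fields):

```python
def initial_ttl(observed_ttl: int) -> int:
    """Round an observed TTL up to the nearest common initial value.

    The TTL is decremented once per hop, so the observed value lies at or
    below the initial value chosen by the sender's stack.
    """
    for candidate in (32, 64, 128, 255):
        if observed_ttl <= candidate:
            return candidate
    return 255

def guess_os(observed_ttl: int, tcp_window: int) -> str:
    """Very coarse OS guess from a SYN packet (toy signature table)."""
    signatures = {(64, 64240): "Linux (recent kernels)",
                  (128, 65535): "Windows",
                  (255, 65535): "network device"}
    return signatures.get((initial_ttl(observed_ttl), tcp_window), "unknown")

# A SYN observed 13 hops away from a host that started with TTL 64
print(guess_os(observed_ttl=51, tcp_window=64240))  # Linux (recent kernels)
```

In practice such tables are maintained as external databases (as p0f and Satori do) rather than hard-coded, and fingerprints include options ordering, MSS, and other degrees of freedom.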

5.2 Data-driven analysis

The prevalent method employed in research for device identification involves the utilization of machine learning techniques. This approach involves the extraction of relevant features from recorded network traffic, followed by a classification step using supervised machine learning methods. The classification process relies on the availability of ground truth data, which serve as the labels for the classification task. An inquiry into current research has indicated that the topic of device identification is most frequently investigated within the context of Internet of Things (IoT) applications [5]–[9]. As noted by Sivanathan et al. in [7], within the domain of IoT device manufacturing, it is customary for manufacturers to incorporate network interface cards (NICs) provided by external vendors. Therefore, the Organizationally Unique Identifier (OUI) prefix of the device’s MAC address may not contain any meaningful information about the IoT device itself.

In the case of IOT SENTINEL [5], device fingerprints were established using 23 features extracted from the first 12 transmitted packets, with the MAC address serving as the primary identifier. Packet (e.g. IP options), protocol (e.g. binary feature indicating if one of the following application protocols is used: HTTP, DHCP, NTP) and statistical features (e.g. destination IP counter) were used to perform the classification using Random Forest.

IoTSense [9] employs packet and protocol features similar to IoT SENTINEL, but removes the statistical features, which were considered irrelevant to the device behaviour [9]. Furthermore, the TCP payload length, TCP window size, and payload entropy features were added. Based on this feature set, a variety of classifiers was evaluated, including k-nearest-neighbours, decision trees, and gradient boosting.

The work of Hamad et al. [8] differs significantly from that of IoTSense and IoT SENTINEL due to their utilization of a wider range of statistical features in conjunction with traditional packet (e.g. TTL) and protocol (e.g. port numbers) attributes. They incorporate features such as inter-arrival time (IAT) and frequency values that are obtained via the application of Fast Fourier Transformation (FFT). Based on the high usage of statistical features, different classifiers such as Support Vector Machines, k-nearest neighbor, and Random Forest were employed.

Sivanathan et al. [7] use naïve Bayes and Random Forest classifiers. To synthesize the attributes from the trace data, the raw PCAP files were initially transformed into flows on an hourly basis utilizing the Joy[13] tool, which employs a flow-oriented model akin to IPFIX or Netflow. Subsequently, the following features were extracted to create the fingerprint: packet features encompassing the employed cipher suites, protocol features encompassing DNS and NTP queries and port numbers, and statistical features such as flow volume, duration, rate, and sleep time.

In IoTDevID [6], the authors initially extracted all feasible packet header information from the PCAP file, resulting in 111 features. Additionally, they incorporated features such as the entropy of the packets or the source and destination port class (well-known (0–1023), registered (1024–49,151) and dynamic ports (49,152–65,535)). Following this, a feature selection process was undertaken to decrease the dimensionality of the feature space. Subsequently, various classifiers were evaluated to ascertain the effectiveness of the approach.

The analysed publications highlight the importance of extracting features from network information, where Layer 2 frames or Layer 3 packets often involve information from two devices. Preprocessing the network data to transform it into a device-based representation is an often-used step before applying machine learning algorithms. This step often involves filtering the network data based on the outgoing network data of a device, typically identified by its MAC or IP address.
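A toy end-to-end sketch of this preprocessing and classification step follows; the feature names, packet records, and the nearest-neighbour “classifier” are illustrative stand-ins for the cited approaches, not their actual feature sets or models:

```python
PORT_CLASSES = [(1023, "well-known"), (49151, "registered"), (65535, "dynamic")]

def port_class(port: int) -> str:
    """Map a port number to its IANA range class."""
    return next(name for limit, name in PORT_CLASSES if port <= limit)

def device_features(packets, mac):
    """Device-based representation: features over one device's outgoing packets."""
    own = [p for p in packets if p["src_mac"] == mac]  # filter by source MAC
    return {
        "mean_len": sum(p["length"] for p in own) / len(own),
        "dst_ip_count": len({p["dst_ip"] for p in own}),
        "uses_wellknown_dst": any(
            port_class(p["dst_port"]) == "well-known" for p in own),
    }

def classify_device(features, labelled):
    """Toy supervised step: the nearest labelled feature vector wins."""
    def dist(a, b):
        return sum((float(a[k]) - float(b[k])) ** 2 for k in a)
    return min(labelled, key=lambda item: dist(features, item[0]))[1]

packets = [{"src_mac": "aa", "dst_ip": "10.0.0.1", "dst_port": 502, "length": 64},
           {"src_mac": "aa", "dst_ip": "10.0.0.1", "dst_port": 502, "length": 64}]
fp = device_features(packets, "aa")
labelled = [({"mean_len": 60.0, "dst_ip_count": 1, "uses_wellknown_dst": True}, "PLC"),
            ({"mean_len": 900.0, "dst_ip_count": 40, "uses_wellknown_dst": False}, "HMI")]
print(classify_device(fp, labelled))  # PLC
```

The cited works replace the nearest-neighbour step with trained models (Random Forest, gradient boosting, etc.) and use much larger labelled ground-truth datasets; the unscaled squared distance here is only a placeholder.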

6 Experiences

Figure 3 depicts the environment that was established for testing and evaluating device discovery and identification. Two modes of operation are distinguished:

Figure 3: 
Architecture of the test environment for device discovery and device identification.

Online-mode

  • In online-mode, the network traffic is observed in three subnets using Malcolm’s sensor appliance, called Hedgehog. These sensors run Zeek, Arkime, and Suricata, which provide the feature extraction. The extracted features (log messages) are forwarded to and stored in Malcolm’s OpenSearch database.

Offline-mode

  • In offline-mode, recorded network traffic in the form of PCAP files is injected into a sensor appliance. The data are processed in the same way as in online-mode.

Data from the OpenSearch database are read periodically and processed by the discovery logic. Within this module, interface and device discovery are performed, and the results are stored in a PostgreSQL database. Feature information is also forwarded to identifier modules, which provide additional information (manufacturer, product type, product role) related to the device objects; this information is used to enrich the device data records. Throughout the testing process, numerous challenges were encountered that hindered the ability to utilize certain methods, such as machine learning techniques based on the log messages available in OpenSearch. The ensuing sections highlight these experiences.

(E1) Lack of Layer 2 visibility: Both Zeek and Arkime employ a flow model that operates at the network layer (layer 3) of the OSI model. This poses a significant challenge for visibility in OT networks, since some network protocols operate solely on layer 2 to meet real-time requirements, like IEC 61850 GOOSE or PROFINET RT. To address this issue, Zeek has introduced an extension as presented in [10], which enables the integration of lower-layer protocol dissectors. Arkime has also recognized the problem of visibility of lower-layer protocols.[14] However, the Zeek and Arkime instances available in Malcolm v23.04.0 do not capture information below the network layer.

(E2) Limited information when the initial TCP handshake is missing: In a default Malcolm installation without additionally configured log sources (such as commercial NIDS solutions), Zeek is used not only to collect flow-based information, but also to generate application layer transcripts, including e.g. details about the services used in an OPC UA TCP connection. In this study, it was observed that access to application layer information is contingent upon the availability of the initial TCP handshake. If the handshake was not observed, no application logs were generated. Long-lived connections are frequently encountered in OT-based systems, resulting in potentially significant delays before the initial TCP handshake can be observed. Similarly, loss of packets in the mirror stream can halt TCP-based Zeek log generation, significantly reducing visibility in OT networks. Zeek has mitigated this issue by implementing a relatively new error recovery mechanism that can resynchronize the TCP stream. However, this feature has to be implemented for each individual dissector plugin prior to its use.

(E3) Contextual information loss due to buffering live traffic in multiple PCAP files: The analysis of live network traffic through a monitoring interface in Malcolm can be carried out by either (1) a standalone installation or (2) network sensor appliances (Hedgehog) that forward logs to an OpenSearch instance. Throughout the analysis in this study, it was noted that there were discernible discrepancies between the logs generated in (1) and (2), even upon replaying identical network traffic to the monitoring interfaces: when using the standalone installation (1), live network traffic is first cached in PCAP chunk files (usually 500 MB) before being analyzed individually with a tool such as Zeek (zeek -r <file>) or Suricata. In comparison, a Hedgehog sensor (2) runs Zeek directly on the network interface (e.g. zeek -i <interface>) for real-time analysis. The utilization of the standalone version (1) for real-time traffic analysis through buffered PCAP files presents notable drawbacks: long-standing connections are stored in Zeek’s conn.log in a fragmented manner due to the incapacity to preserve previous context when new PCAP files are read. Moreover, as highlighted in (E2), the majority of Zeek dissectors operating on TCP-based protocols usually do not produce logs for TCP connections starting midstream. Upon contacting the Malcolm maintainers regarding the aforementioned issues, it was stated that the variance in implementation between the standalone (1) and network sensor (2) versions is due to technical limitations present within Arkime, Docker, and OpenSearch. Further information can be found in the related issue.[15] Nevertheless, the maintainers of Malcolm have acknowledged the necessity of dispensing with the practice of analysing live network traffic with Zeek and Suricata through buffered PCAP files.

(E4) Missing logs due to log rotation: While processing log data, inconsistencies were observed in the Zeek logs: certain logs present on the Hedgehog sensor were not successfully ingested into the OpenSearch database. Further examination indicated that the absent log entries were consistently those written to disk immediately before the hourly log rotation process on the Hedgehog. During log rotation, the log files are renamed and then archived. These findings were communicated to the maintainers of Malcolm, who subsequently identified an issue in the handling of renamed files.[16] This issue has since been rectified.
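A common way to make log ingestion robust against such renames is to track files by inode rather than by path, so that a rotated file is still recognized as the same, possibly not fully ingested, file. The following is a minimal illustrative sketch of that idea, not Malcolm's actual implementation:

```python
import os

def scan_by_inode(directory, offsets):
    """Read new data from each log file, keyed by inode so renames are
    harmless. offsets: {inode: bytes_already_ingested}, updated in place."""
    new_data = {}
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        st = os.stat(path)
        done = offsets.get(st.st_ino, 0)
        if st.st_size > done:
            with open(path, "rb") as f:
                f.seek(done)  # resume where the last scan stopped
                new_data[st.st_ino] = f.read()
            offsets[st.st_ino] = st.st_size
    return new_data
```

Because the offset bookkeeping is attached to the inode, data appended between the rename and the archival step is still picked up on the next scan.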

(E5) Identification of logical connections: Quantitative features, e.g. the number of incoming connections per IP address, are frequently evaluated in the domain of cybersecurity analysis. These features are useful not only for security analysts but also for data-driven methods like machine learning. For example, such features are helpful for detecting sudden spikes in connection attempts to an IP address, which could indicate an anomaly that requires further investigation. However, the analysis of real-world traffic has shown that even the seemingly straightforward task of counting connections per IP address can present challenges: Table 1 illustrates a subset of the OpenSearch database in Malcolm, which has ingested Zeek conn.log entries, revealing related connections from the analysis data. The excerpt shows a UDP-based SNMP communication between a client at IP 172.16.5.60 and a server at 172.16.2.1, generating 7 individual log entries within 12 s. A simplistic enumeration of connections per IP address would indicate that the server had 7 incoming connections within this 12-s timeframe, assuming no further network traffic; the same calculation applies to the client regarding outgoing connections. Counted this way, the number of connections keeps rising in this scenario: over a span of 10 min, the number of connections between the two devices reached 182, with the pattern remaining constant, i.e. the source port changed while the destination port remained constant. In the given scenario, the network communication took place between a single client software running on a device with the IP address 172.16.5.60 and a Hirschmann-manufactured network device whose management interface is accessible via IP 172.16.2.1.

Table 1:

An excerpt from the conn.log capturing UDP-based SNMP communication.

Time src.ip src.port dst.ip dst.port
8:53:55.05 172.16.5.60 52803 172.16.2.1 161
8:53:54.99 172.16.5.60 52801 172.16.2.1 161
8:53:50.58 172.16.5.60 52786 172.16.2.1 161
8:53:44.47 172.16.5.60 52752 172.16.2.1 161
8:53:37.53 172.16.5.60 52718 172.16.2.1 161
8:53:35.22 172.16.5.60 52707 172.16.2.1 161
8:53:33.49 172.16.5.60 52693 172.16.2.1 161

A similar scenario is shown in Table 2: Four connections within hundredths of a second can be observed between a PROFINET IO-Controller with IP address 192.1.1.1 and an IO-Device at 192.1.1.2. The use of PROFINET-CM over DCE/RPC can be inferred from the presence of port 34964. The DCE/RPC protocol works by deploying a server-side endpoint mapper that listens for incoming calls. Upon receiving a request from the client, the endpoint mapper responds by providing access to a specific interface via a separate connection. Although the PROFINET start-up originates from one device, namely the IO-Controller with IP 192.1.1.1, extracting quantitative features from Zeek's conn.log, such as the number of incoming or outgoing connections, would suggest two outgoing connections from the IO-Controller and two outgoing connections from the IO-Device.

Table 2:

An excerpt from the conn.log capturing the PROFINET start-up between an IO-Controller 192.1.1.1 and an IO-Device 192.1.1.2.

Time src.ip src.port dst.ip dst.port
15:04:47.347 192.1.1.1 49153 192.1.1.2 34964
15:04:47.351 192.1.1.2 49152 192.1.1.1 49153
15:04:47.367 192.1.1.2 49153 192.1.1.1 34964
15:04:47.369 192.1.1.1 49156 192.1.1.2 49153

After analyzing the information presented in Tables 1 and 2 in their relevant context, it can be argued that the connections observed in both cases are each part of a single logical connection between the devices. When deriving quantitative features from flow-based data obtained from Zeek or Arkime, it is essential to bear in mind that connections identified in this manner are not necessarily logical connections.
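One pragmatic heuristic following from the SNMP case in Table 1 is to treat flow records that differ only in their ephemeral source port as a single logical connection, keyed by (src.ip, dst.ip, dst.port, protocol). A sketch, with illustrative field names rather than a fixed Malcolm/Zeek schema:

```python
from collections import Counter

def logical_connections(flows):
    """Count flow records per logical key, ignoring the ephemeral source port."""
    return Counter(
        (f["src_ip"], f["dst_ip"], f["dst_port"], f["proto"]) for f in flows
    )

# The SNMP excerpt from Table 1: seven flow records, one logical connection.
flows = [
    {"src_ip": "172.16.5.60", "src_port": p, "dst_ip": "172.16.2.1",
     "dst_port": 161, "proto": "udp"}
    for p in (52803, 52801, 52786, 52752, 52718, 52707, 52693)
]
counts = logical_connections(flows)
assert len(counts) == 1                  # one client/server relationship...
assert counts.most_common(1)[0][1] == 7  # ...carried by seven flow records
```

Note that this key does not cover the DCE/RPC case of Table 2, where the follow-up connection targets a dynamically assigned port; merging such flows requires protocol-aware correlation.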

(E6) Insufficient log data: Throughout the pre-processing of log data for machine learning-based approaches, a recurring challenge is the insufficiency of information provided in the log files: Table 3 displays an excerpt of the iso_cotp.log generated by the Amazon dissector,[17] based on the input data of a publicly available PCAP file.[18] The entries in the log concern the COTP protocol, which is utilized by S7comm for data transfer. Each packet transferred via COTP generates a corresponding log entry. The initial log entry is recorded when data are transmitted from IP address 134.249.62.206 to IP address 134.249.61.163, while the second log entry is generated by the response transmitted from IP address 134.249.61.163. However, based on the information presented in the iso_cotp.log excerpt in Table 3, it is not possible to determine the direction of communication. Hence, it is infeasible to obtain device-centric features, such as the count of transmitted COTP messages per IP address. An approach proposed in an issue on GitHub[19] involves augmenting the log with an additional binary attribute, such as is_orig, which denotes whether the data originates from the initiator of the connection.

Table 3:

An excerpt from the iso_cotp.log from the S7comm Zeek dissector by Amazon. The timestamp has been omitted to avoid redundancy. Both log entries correspond to the time of 4:33:19.

orig_h orig_p resp_h resp_p pdu_type
134.249.62.206 52446 134.249.61.163 102 Data
134.249.62.206 52446 134.249.61.163 102 Data
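To illustrate the GitHub proposal: if each iso_cotp.log entry carried a boolean is_orig flag (hypothetical here, since the Amazon dissector does not currently emit it), the per-device transmit count becomes trivial to derive:

```python
from collections import Counter

def cotp_tx_per_ip(entries):
    """Count transmitted COTP messages per IP, using a per-entry is_orig flag
    to resolve the sender of each message."""
    tx = Counter()
    for e in entries:
        sender = e["orig_h"] if e["is_orig"] else e["resp_h"]
        tx[sender] += 1
    return tx

# The two Table 3 entries, augmented with the proposed (hypothetical) flag:
entries = [
    {"orig_h": "134.249.62.206", "resp_h": "134.249.61.163", "is_orig": True},
    {"orig_h": "134.249.62.206", "resp_h": "134.249.61.163", "is_orig": False},
]
print(cotp_tx_per_ip(entries))  # one message sent by each device
```

Without the flag, both entries would be attributed to the originator, skewing any device-centric feature derived from this log.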

7 Conclusions

This paper serves as a comprehensive guideline for identifying devices based on passive network traffic by examining individual aspects involved in this process. The presented topics cover the key steps in the device identification process, including packet capturing, feature extraction, device discovery, and various methodologies for device identification. It is evident that each of these steps plays a crucial role in achieving the ultimate goal of identifying devices based on network traffic.

Moreover, this paper highlights a common limitation in publications that often overlook practical aspects that can arise when implementing research approaches in real-world scenarios. By sharing concrete experiences and problems encountered during the implementation of state-of-the-art approaches to device identification in Section 6, the aim is to contribute to the research community’s understanding of the practical challenges associated with device identification based on passive network traffic.

In order to transfer network recordings into a device database automatically and with as little manual maintenance as possible, machine learning methods are being investigated in addition to established methods. However, initial applications show that a larger amount of data is needed. At the same time, these data have to be well-documented; otherwise, overfitting to a few features is to be expected. The extent to which this method can be applied in practice is still an open question.

The goal of the work is to reduce the effort for operators maintaining their device database. With a sophisticated database, unknown devices can be detected easily. Furthermore, newly reported vulnerabilities described via a security advisory can be compared with the device database. A high degree of automation for device management similar to the security advisories by CSAF[20] would be desirable.


Corresponding author: Klaus Biß, Bundesamt für Sicherheit in der Informationstechnik (BSI), Referat OC25, Bonn, Germany, E-mail:


Received: 2023-07-20
Accepted: 2023-07-21
Published Online: 2023-09-08
Published in Print: 2023-09-26

© 2023 the author(s), published by De Gruyter, Berlin/Boston

This work is licensed under the Creative Commons Attribution 4.0 International License.
