research-article
Open access

OptiClass: An Optimized Classifier for Application Layer Protocols Using Bit Level Signatures

Published: 10 January 2024

Abstract

Network traffic classification has many applications, such as security monitoring, quality of service, and traffic engineering. For these applications, Deep Packet Inspection (DPI) is a widely used classification technique because it scrutinizes the payload and provides comprehensive information for accurate analysis of network traffic. However, DPI-based methods reduce network performance because they are computationally expensive, and they compromise end-user privacy because they analyze the payload. To overcome these challenges, bit-level signatures are increasingly used to perform network traffic classification. However, most of these methods still have limited performance because they match an unknown payload against the application signatures one by one. Moreover, these methods slow down as the number of application signatures increases. To fill this gap, we propose OptiClass, an optimized classifier for application protocols using bit-level signatures. OptiClass matches all application signatures against an unknown flow in parallel, which results in faster, more accurate, and more efficient network traffic classification. OptiClass achieves its performance gains over state-of-the-art methods in two ways. First, OptiClass generates bit-level signatures of just 32 bits for all the applications. This keeps OptiClass swift and privacy-preserving. Second, OptiClass uses a novel data structure called BiTSPLITTER for fast and accurate signature matching. We evaluated the performance of OptiClass on three datasets consisting of twenty application protocols. Experimental results show that OptiClass has an average recall, precision, and F1-score of 97.36%, 97.38%, and 97.37%, respectively, and classifies on average 9.08 times faster than five closely related state-of-the-art methods.

1 Introduction

Network traffic classification is an essential task for Internet Service Providers (ISPs), network administrators, researchers, and security architects to identify the type of applications flowing in the network and perform analysis on protocols of different network packets for the efficient management of a network [31]. Network traffic classification serves various applications for network management, such as providing quality of service, network security, trend analysis, fault diagnosis, and anomaly detection. Traditionally, the port-based method was used to identify network traffic applications. This method was useful for identifying applications whose port numbers are registered with the Internet Assigned Numbers Authority (IANA) [9]. The port-based method works well with fixed/standard port numbers [24] and can detect application protocols running on standard ports, such as HTTP, FTP, SMTP, DNS, POP, NTP, SSH, and Telnet. However, this technique fails for peer-to-peer (P2P) application protocols and proprietary protocols that use dynamic port numbers, such as Skype [12], eDonkey [1], eMule [2], and BitTorrent [3], which are not assigned, controlled, or registered by IANA. Moreover, P2P-based applications like Tor [4] use tunneling to disguise their network traffic, making the port-based method unreliable and obsolete for network traffic classification. To address these issues, statistics-based methods [11, 25, 30, 36, 45], correlation-based methods [13, 14, 34, 43, 44], behavior-based methods [22, 29], and DPI-based methods [16, 17, 18, 41, 42] are used for traffic classification. Statistics-based methods rely on statistical features, such as the number of packets and the minimum, maximum, and mean packet size. This type of method protects the user’s privacy but involves many redundant features. 
Correlation-based methods classify network traffic by finding the correlation between network flows of the same packets using five-tuple information: source IP address, destination IP address, source port address, destination port address, and transport layer protocol. This type of method avoids feature redundancy but still has high computational overhead. Behavior-based methods offer another perspective: they perform traffic classification by looking at the behavior of end devices, such as the IP addresses, protocols, and ports they use for communication. This type of method has high classification accuracy, but classification results are not fine-grained [46], and the methods are not scalable for many end devices. To overcome these challenges, DPI-based methods have been proposed that analyze the payload for detailed and accurate information. DPI-based methods are popular in traffic classification because of their high classification accuracy. DPI-based methods identify applications by generating signatures from the payload. A major challenge faced by DPI-based methods is to capture a unique signature for each application with minimal signature length. This is a primary requirement to reduce computational overhead. Generating signatures using byte-level information is a traditional and mostly manual effort that is error-prone. A few works in the literature propose automatic signature generation [20, 27, 33, 37, 39]. However, they focus on the extraction of byte-level information from the payload content of a packet. Byte-level information is less suitable nowadays because many application protocols use data formats that operate on the bit level [18, 40, 41]. Moreover, the available network traffic classification methods that use bit-level signatures perform linear signature matching, whose performance degrades as the number of application signatures increases. 
To fill this gap, we propose OptiClass, which generates bit-level signatures to identify application protocols with parallel and accurate signature matches. We summarize our contribution in this article as follows:
We propose OptiClass, an optimized classifier that uses an application-specific DPI-based method for classifying application layer protocols using bit-level signatures. The classifier performs parallel matching of all available application signatures simultaneously with an unknown application flow. This makes OptiClass highly scalable.
OptiClass uses a novel data structure called BiTSPLITTER for accurate and parallel signature matching. BiTSPLITTER is created by inheriting the properties of the well-known CROWN graph and LADDER graph.
OptiClass is computationally inexpensive and less intrusive to users’ privacy because it generates signatures from only the first ‘n’ bits of bidirectional flows. Our experiments showed that application signatures of only 32 bits are sufficient for accurate classification.
We performed extensive experiments on three different datasets consisting of text, binary, and proprietary application protocols. OptiClass has achieved average recall, precision, and F1-score of 97.36%, 97.38%, and 97.37%, respectively, and an average classification speed of 9.08 times faster than five closely related state-of-the-art methods.
We organize the remainder of this article as follows. In Section 2, we describe related work. In Section 3, we explain our proposed method OptiClass. We then perform the complexity analysis of each module of OptiClass in Section 4. We show the experiments performed on OptiClass and we give the results in Section 5. We end this article with the conclusion and future direction in Section 6.

2 Related Work

In this section, we discuss state-of-the-art network traffic classification methods. These methods mainly fall into two categories: Shallow Packet Inspection (SPI) based and Deep Packet Inspection (DPI) based methods. These methods are discussed as follows:

2.1 Shallow Packet Inspection-based Methods

SPI-based methods are lightweight inspection methods that screen the headers of network packets to classify network traffic. Karagiannis et al. [21] developed a method to classify peer-to-peer network traffic using two packet-level features, the IP-TCP/IP-UDP pair and the IP-Port pair. However, this method is limited to peer-to-peer applications only. Lin et al. [23] utilized the packet size distribution associated with each transport layer connection. Each application has a distinct distribution, which is combined with the port to accelerate application classification. However, only a limited set of applications was tested in the experiments performed on this method. Hence, its scalability is unclear. Grimaudo et al. [15] proposed a method that trains a hierarchical classifier on 40 transport layer features for application classification. However, the experimental evaluation only considers applications with TCP flows and no applications with UDP flows. Zhang et al. [43] proposed an unsupervised classifier that utilizes flow-based statistical features, such as the number of packets per flow and the volume of a flow, along with correlation information between flows, such as source-destination IP pairs and port pairs, for traffic classification. This method has high time complexity because finding correlated flows takes a long prediction time. Zhang et al. [44] proposed a supervised classification method that uses statistical correlation information between network flows for traffic classification. However, the method takes significant classification time because it combines different variants of Nearest Neighbor algorithms (AVG-NN, MIN-NN, and MVT-NN) for classification. Divakaran et al. [14] implemented semi-supervised classification based on correlation information between the network packets of dynamic network traffic. 
However, significant classification accuracy is attained only when the classifier is trained with a large amount of training data. This may not be possible with dynamic network traffic, which is prevalent nowadays. Wang et al. [36] utilized the distributed Spark platform to parallelize Convolutional Neural Networks for real-time network traffic classification. Experimental results show high classification accuracy on the datasets. However, the datasets are balanced, which is not representative of real-time network traffic.

2.2 Deep Packet Inspection-based Methods

Deep Packet Inspection-based methods examine a packet’s content (payload) for more detailed information to make more accurate network traffic classifications than SPI-based methods. However, DPI-based methods are slower than SPI-based methods. Wang et al. [38] proposed ProDecoder for the automatic generation of protocol message formats without prior knowledge of the protocol specifications, based on n-grams with the same semantics (such as the relationship among multiple common byte sequences) in protocol network traces. However, ProDecoder will not work on binary application protocols as these protocols operate on a bit level. Yun et al. [42] proposed SECURITAS, which uses the packet’s payload to generate n-grams that are then combined into keywords based on the latent relationship between n-grams to identify each application protocol from mixed network traffic. However, this method employs Gibbs sampling to identify keywords from n-grams, which is computationally expensive. Tongaonkar et al. [33] presented SANTaCLASS, which automatically extracts keywords from the packets’ payload based on their occurrence frequency in payloads. These keywords are arranged in occurrence order to generate the application signatures, which are then used in classification. However, the method is suitable only for text-based application protocols. Swarnkar et al. [32] proposed RDClass, which is based on Relative Distance Constraint Counting Automata (RDCCA). It accepts a set of keywords and their relative distances extracted from the payload in an encoded format to identify unknown applications and classify network flows. However, this method also classifies only text-based protocols. Yuan et al. [41] proposed BitMiner, a bit-level classifier that utilizes the correlation between bit values and bit positions in each network flow for the automatic generation of signatures. Its experimental evaluation covers only UDP traffic of six protocols. 
Hence, the granularity of the method on other types of traffic is unknown. Yuan et al. [40] proposed BitsLearning, which extended the same idea by utilizing machine learning algorithms. This method uses features such as packet sizes (PS) and flow ports along with bit values and their respective positions for traffic classification. However, experiments on TCP-based network traffic were not reported in their article. Hubballi et al. [18] proposed BitCoding, which is a bit-level classification method. The method uses Transition Constraint Counting Automata (TCCA) as an application signature that takes the encoded format of n-bit signatures as input for bit-level matching of signatures with unknown traffic flows. As BitCoding only uses invariant bits as part of a signature, the remaining bits have no role in the classification. Hence, if a signature contains a large portion of variant bits, then ignoring that complete portion causes information loss. Hubballi et al. [19] proposed BitProb, which constructs a space-efficient state transition machine that uses probabilistic bit-level signatures for classifying applications. BitProb uses a threshold value to decide whether a new test flow belongs to a particular application. If the threshold is set too high, the new test flow will not cross it and will remain unclassified, and if the threshold is set too low, the new test flow may be classified into another class. Hence, choosing an appropriate threshold is critical. Hubballi et al. [16] proposed KeyClass, which searches for keywords in the payload quickly by skipping certain parts of the payload for the quick identification of applications. The authors designed a new finite state machine based on the Aho-Corasick algorithm [10] for this task. KeyClass works with bytes. Hence, it is incompatible with binary application protocols. 
Recent applications, including intelligent grid networks [28], industrial applications [26], and intrusion detection systems [35], indicate the continuing importance of DPI and underline the need for designing robust methods for network traffic classification.

3 Proposed Method

In this section, we describe our proposed method OptiClass, which is an optimized framework for accurately classifying application protocols through bit-level signatures. The proposed method is a lightweight DPI-based method that uses signatures extracted from bit sequences of the payload of network flows for classifying applications. The architecture diagram of OptiClass is shown in Figure 1. As Figure 1 shows, OptiClass has two phases: the training phase and the testing phase. These two phases are explained in the following two subsections.
Fig. 1. OptiClass Architecture

3.1 Training Phase

In this phase, OptiClass generates the bit-level signatures of each application protocol from the network traces. After that, all the generated bit signatures are inserted in a novel data structure called BiTSPLITTER, which is then used for efficient, accurate, and parallel signature matching of test flows against the signatures. This phase consists of three modules: Traffic Preprocessing, Bit Signature Generation, and BiTSPLITTER Creation. Each of the modules is explained below.

3.1.1 Traffic Preprocessing.

The input to this module is the network traffic traces of applications, and the output is the bidirectional binary flows reconstructed from the packets of the network traces. A network flow is a series of packets exchanged between two hosts. A flow is identified by a unique five-tuple: source IP address, destination IP address, source port address, destination port address, and transport layer protocol. The two transport layer protocols considered are TCP and UDP. Therefore, flows are also of two types: TCP flow and UDP flow. A TCP flow is reconstructed from all the packets exchanged between two hosts, say Host A and Host B, in a bidirectional flow sequence. This sequence starts with TCP connection establishment via a three-way handshake and closes using a two-way handshake. All packets exchanged between the establishment of a TCP connection and its disconnection become part of that flow. The following shows TCP connection and disconnection between two hosts, Host A and Host B:
TCP Connection:
(1) A → [SYN] → B
(2) A ← [SYN/ACK] ← B
(3) A → [ACK] → B
TCP Disconnection:
(1) A → [FIN] → B or A ← [FIN] ← B
(2) A ← [ACK] ← B or A → [ACK] → B
On the other hand, UDP is a connectionless protocol, and UDP flows are constructed using timing information. A UDP flow consists of all the packets exchanged between two hosts with an inter-packet time of at most a user-defined threshold \(\delta\). Packets belong to the same flow when they share the same tuple (source IP address, destination IP address, source port number, and destination port number) and satisfy this timing constraint.
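The flow reconstruction described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Packet` record, the function names, and the default value of `delta` are hypothetical, and TCP flags (SYN/FIN) are not tracked, so a TCP flow here is simply every packet that shares the same direction-independent five-tuple.

```python
from typing import NamedTuple

class Packet(NamedTuple):
    # Hypothetical, simplified packet record (not defined in the article).
    ts: float        # capture timestamp in seconds
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: str       # "TCP" or "UDP"
    payload: bytes

def flow_key(p: Packet):
    """Direction-independent five-tuple, so both directions share one key."""
    a, b = (p.src_ip, p.src_port), (p.dst_ip, p.dst_port)
    return (min(a, b), max(a, b), p.proto)

def reconstruct_flows(packets, delta=60.0):
    """Group packets into bidirectional flows; split UDP flows on gaps > delta.

    The default delta is an arbitrary placeholder for the user-defined threshold.
    """
    flows = []        # each flow is a list of packets in arrival order
    open_flow = {}    # flow key -> index of the currently open flow
    for p in sorted(packets, key=lambda q: q.ts):
        k = flow_key(p)
        idx = open_flow.get(k)
        if idx is not None and p.proto == "UDP" and p.ts - flows[idx][-1].ts > delta:
            idx = None    # inter-packet gap exceeded delta: start a new UDP flow
        if idx is None:
            flows.append([])
            idx = open_flow[k] = len(flows) - 1
        flows[idx].append(p)
    return flows

def first_n_bits(flow, n=32):
    """Concatenate payloads in arrival order and return the first n bits as a string."""
    data = b"".join(p.payload for p in flow)
    bits = "".join(f"{byte:08b}" for byte in data)
    return bits[:n]
```

Canonicalizing the key with `min`/`max` is one simple way to make the flow bidirectional; a fuller version would also close TCP flows on the FIN/ACK exchange shown above.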

3.1.2 Bit Signature Generation.

The input to this module is the reconstructed flows of an application protocol, and the output is the generated binary signature for the same application. This makes OptiClass a supervised method that takes network traces of known applications to generate its bit-level signature. To keep OptiClass lightweight and privacy-preserving, the method generates the signature of an application by using only the first “n” bits from the payload of each of the “f” bidirectional binary flows extracted from the network trace of application “A”. Therefore, the signature of the application is also of length “n”. Application signature generation happens under the following conditions:
(1) The \(i^{th}\) bit of the signature is “1” only if the bit at the \(i^{th}\) position of every flow is “1”.
(2) The \(i^{th}\) bit of the signature is “0” only if the bit at the \(i^{th}\) position of every flow is “0”.
(3) The \(i^{th}\) bit of the signature is an asterisk “\(*\)” if the bits at the \(i^{th}\) position vary across flows.
An example of an application signature of length “10” generated from three bidirectional binary flows of the same application with each flow of length “10” is shown in Figure 2.
Fig. 2. Bit Signature Generation
We can see from Figure 2 that the bits at positions \(B_2\), \(B_3\), and \(B_6\) in each flow are “0” and therefore, the bit values at the corresponding positions of the signature are also “0”. Similarly, the bits at positions \(B_5\), \(B_7\), \(B_8\), and \(B_9\) in all the application flows are always “1” and hence the bit values at the corresponding positions of the signature are also “1”. However, the bits at the remaining positions, i.e., \(B_1\), \(B_4\), and \(B_{10}\), vary across flows; therefore, these bits are inconsistent and represented by “\(*\)”. The network trace of an application is likely to contain a small number of broken flows due to packet re-transmission, network congestion, bit errors, and so on, which may result in incorrect signature generation. To overcome this problem, we use a threshold of 0.9. In other words, if the bit at position “i” has the same value in at least 90 out of 100 flows, then the signature takes that bit value at the “\(i^{th}\)” position.
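The three signature rules and the 0.9 threshold can be sketched in Python as follows. This is an illustrative sketch, not the authors' code: the function name, the representation of flows as strings of “0”/“1” characters, and the choice to skip flows shorter than “n” bits are assumptions. The three example flows are made up to reproduce the signature pattern described for Figure 2; they are not the flows shown in the figure.

```python
def generate_signature(flows, n=32, threshold=0.9):
    """Build an n-symbol signature over {'0', '1', '*'} from bit-string flows."""
    usable = [f for f in flows if len(f) >= n]   # assumption: skip short flows
    if not usable:
        raise ValueError("no flow has at least n bits")
    total = len(usable)
    signature = []
    for i in range(n):
        ones = sum(1 for f in usable if f[i] == "1")
        zeros = total - ones
        if ones / total >= threshold:
            signature.append("1")     # position is (almost) always 1
        elif zeros / total >= threshold:
            signature.append("0")     # position is (almost) always 0
        else:
            signature.append("*")     # varying position
    return "".join(signature)

# Made-up flows of length 10: B2, B3, B6 are always 0; B5, B7, B8, B9 are always 1.
example_flows = ["1000101110", "0001101111", "1001101110"]
print(generate_signature(example_flows, n=10))   # *00*10111*
```

With three flows, a single disagreeing bit already falls below the 0.9 threshold, so the threshold only tolerates noise once a position is observed in ten or more flows.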

3.1.3 BiTSPLITTER Creation.

The input to this module is the generated signatures of each application, and the output is the generated BiTSPLITTER. Unlike most methods that perform linear signature matching with a test flow, BiTSPLITTER performs parallel signature matching. BiTSPLITTER is a novel data structure inspired by the CROWN graph [5] and the LADDER graph [6], whose properties it inherits. BiTSPLITTER is formally defined as a weighted, directed, and binary-valued acyclic graph G(\(V, E, W\)) such that:
V represents the two sets of vertices {\(x,y\)} such that \(x=\){\(x_{1}, x_{2}, \dots , x_{n}\)} and \(y=\) {\(y_{1}, y_{2}, \dots , y_{n}\)} where value at vertex \(x_i \in x\) = 0 and value at vertex \(y_i \in y\) = 1.
Edge \(e_{ij} \in E\) is a directed edge either from \(x_i \rightarrow x_j\) or \(x_i \rightarrow y_j\) or \(y_i \rightarrow x_j\) or \(y_i \rightarrow y_j\) where \(j=i\)+1.
Root vertex is the start vertex of the BiTSPLITTER such that set x is present to the left and set y is present to the right of the root, respectively.
\(w_{ij} \in W\) is the weight on each directed edge \(e_{ij}\), which is initially an empty set {\(\phi\)}. \(w_{ij}\) is updated on each edge based on the application signatures generated in the previous step. For an application signature, which is a sequence of n symbols from {0,1,*}, the weights on the edges are updated as follows:
The BiTSPLITTER is traversed from the root, and the weight \(w_{ij}\) on edge \(e_{ij}\) is updated with the signature label based on the value of that signature at each position. For the first signature \(S_1\), if the value at the first position is “1”, then \(w_{root,y_1}=\lbrace S_1\rbrace\). Similarly, if the value at the first position is “0”, then \(w_{root,x_1}=\lbrace S_1\rbrace\). However, if the value at the first position is “*”, then \(w_{root,x_1}=\lbrace S_1\rbrace\) as well as \(w_{root,y_1}=\lbrace S_1\rbrace\). In the same way, the weights on the other edges are updated based on the values at the respective positions.
When another application signature \(S_2\) is introduced, the BiTSPLITTER is traversed, and the weights are updated in the same way as for \(S_1\). However, if \(S_2\) needs to update an edge \(e_{ij}\) whose weight is already \(w_{ij}=\lbrace S_1\rbrace\) rather than \(w_{ij}=\phi\), the weight becomes \(w_{ij}=\lbrace S_1, S_2\rbrace\).
The algorithm to generate BiTSPLITTER is shown in Algorithm 1. The BiTSPLITTER generation process takes the application signatures {\(S_{1}, S_{2}, \dots , S_{k}\)} as input and outputs a BiTSPLITTER. Initially, an empty BiTSPLITTER is generated with a root, “n” vertices on the left, and “n” vertices on the right, where “n” is the length of the application signatures. All directed edges are connected as per the constraints discussed above, and the weight on each edge is initialized to {\(\phi\)}. Subsequently, each bit of a signature is mapped to vertices of the BiTSPLITTER such that the weight on the corresponding edge is updated with the label of the signature to which the bit belongs. If a bit of a signature is “0”, then the directed edge is from the root to \(x_{1}\), from \(x_{i}\) to \(x_{i+1}\), or from \(y_{i}\) to \(x_{i+1}\). Similarly, if a bit of a signature is “1”, then the directed edge is from the root to \(y_{1}\), from \(y_{i}\) to \(y_{i+1}\), or from \(x_{i}\) to \(y_{i+1}\). Finally, if a bit is “*”, then the directed edges are from the root to both \(x_{1}\) and \(y_{1}\), or from \(x_{i}\) or \(y_{i}\) to both \(x_{i+1}\) and \(y_{i+1}\). For each bit of signature \(S_{k}\), the weight on the corresponding directed edge is updated with \(S_{k}\).
Let us take an example to understand BiTSPLITTER generation from three sample signatures, each of length three: \({S_{1}=101}\), \({S_{2}=*11}\), and \({S_{3}=111}\). This example is shown in Figure 3. Since the signature length is 3, a BiTSPLITTER is first generated with all weights empty, denoted by {\(\phi\)}, which is shown first in Figure 3(a). We then insert the first signature \(S_1\)=“101”, as shown in Figure 3(a). The first bit of \(S_1\) is “1”, so it is inserted on the edge between the root and \(y_1\) (the right vertex at level 1 of the BiTSPLITTER), and the weight \(w_{root,y_1}\) is updated to {\(S_1\)}. The second bit of \(S_1\) is “0”, so it is inserted on the edge between \(y_1\) (the current vertex) and \(x_2\) (the left vertex at level 2 of the BiTSPLITTER), and the weight \(w_{y_1,x_2}\) is updated to {\(S_1\)}. The third bit of \(S_1\), which is “1”, is inserted in the same way on the edge between \(x_2\) and \(y_3\). Next, we insert the signature \(S_2\)=“*11”, as shown in Figure 3(b). The first bit of \(S_2\) is “*”, which is a varying bit, and hence both weights \(w_{root,x_1}\) and \(w_{root,y_1}\) are updated with \(S_2\), becoming {\(S_2\)} and {\(S_1\), \(S_2\)}, respectively. Similarly, the weights for the other two bits, “1” and “1”, are updated with \(S_2\). Finally, the third application signature \(S_3\) is inserted in the same way as \(S_1\) and \(S_2\), as shown in Figure 3(c).
Fig. 3. BiTSPLITTER Generation Example
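The insertion procedure of this subsection can be sketched in Python as follows. This is a minimal sketch under stated assumptions and not a transcription of Algorithm 1: the edge weights are held in a dictionary keyed by (source vertex, target vertex), the vertex \(x_i\) is encoded as the tuple `(i, "0")` and \(y_i\) as `(i, "1")`, and the names `ROOT` and `build_bitsplitter` are hypothetical.

```python
from itertools import product

ROOT = "root"

def build_bitsplitter(signatures):
    """signatures: dict label -> string over {'0', '1', '*'}, all of equal length n.

    Returns a dict mapping each directed edge (u, v) to its weight,
    i.e. the set of signature labels inserted on that edge.
    """
    lengths = {len(s) for s in signatures.values()}
    if len(lengths) != 1:
        raise ValueError("all signatures must have the same length")
    n = lengths.pop()

    # Empty structure: root -> level 1, then every vertex at level i -> level i+1.
    weights = {(ROOT, (1, b)): set() for b in "01"}
    for i in range(1, n):
        for a, b in product("01", repeat=2):
            weights[((i, a), (i + 1, b))] = set()

    # Insert each signature; a '*' symbol fans out to both vertices of a level.
    for label, sig in signatures.items():
        prev_nodes = [ROOT]
        for i, sym in enumerate(sig, start=1):
            cur_nodes = [(i, b) for b in ("01" if sym == "*" else sym)]
            for u in prev_nodes:
                for v in cur_nodes:
                    weights[(u, v)].add(label)
            prev_nodes = cur_nodes
    return weights

w = build_bitsplitter({"S1": "101", "S2": "*11", "S3": "111"})
print(sorted(w[(ROOT, (1, "1"))]))   # ['S1', 'S2', 'S3']
print(sorted(w[(ROOT, (1, "0"))]))   # ['S2']
```

For the three sample signatures, the resulting weights agree with the values used in the worked example, for instance {\(S_2\)} on the edge from the root to \(x_1\).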

3.2 Testing Phase

The testing phase of OptiClass consists of two modules: the Traffic Preprocessing Module and the Flow Classification Module. The traffic preprocessing module of the testing phase is the same as that of the training phase. The flow classification module is explained as follows:

3.2.1 Flow Classification Module.

The input to this module is the BiTSPLITTER obtained from the training phase and the test flows received from the traffic preprocessing module. A test flow is matched against the BiTSPLITTER to identify the application to which the flow belongs. The efficient, parallel matching of one test flow against all signatures using BiTSPLITTER \(G(V, E, W)\) proceeds as follows:
Each test flow “f” is passed into the BiTSPLITTER at the root “R.” When a bit of the test flow is “0” or “1,” we traverse to the left or right vertex at the next level, respectively.
Initially, we have a universal set “U” that consists of all signature labels {\(S_{1}, S_{2}, \dots , S_{k}\)}.
With each traversed bit, we intersect the current result set with the weight \(w_{ij}\) on the traversed edge \(e_{ij}\).
For the first bit of a test flow “f,” the first vertex of the BiTSPLITTER is visited, and the result is the intersection of “U” and the weight \(w_{ij}\) of the traversed edge \(e_{ij}\).
For each subsequent bit of “f,” the result is the intersection of the weight \(w_{ij}\) of the traversed edge and the result computed in the previous step.
These steps are repeated until we reach the last vertex of the BiTSPLITTER. We are left with a final set of signature labels at the end vertex.
If the final set is empty ({\(\phi\)}), then “f” is unclassified. If the set contains exactly one signature label, then “f” is classified when the label is the correct application and misclassified otherwise. If the set contains more than one signature label, then “f” is undecided.
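The steps above can be summarized as a recurrence. The symbols \(b_i\) (the \(i^{th}\) bit of the test flow), \(v_i\) (the vertex visited at level \(i\)), and \(R_i\) (the result set after \(i\) bits) are introduced here only for illustration and do not appear in the original description:

\[
v_0 = root, \qquad
v_i =
\begin{cases}
x_i & \text{if } b_i = 0 \\
y_i & \text{if } b_i = 1
\end{cases}
\qquad
R_0 = U, \qquad
R_i = R_{i-1} \cap w_{v_{i-1}, v_i}, \quad i = 1, \dots, n.
\]

The final set \(R_n\) contains exactly the labels of the signatures that agree with the flow at every position, since a “*” in a signature places its label on the edges to both \(x_i\) and \(y_i\). Because intersection can only shrink a set, \(R_i \subseteq R_{i-1}\) for every \(i\), so once some \(R_i\) is empty the flow is already known to be unclassified.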
Let us take an example to understand flow classification using the BiTSPLITTER we have already created in Figure 3. We take three test flows, “011”, “111”, and “000”, in Figure 4. Each test flow is passed into the BiTSPLITTER and labeled as classified, undecided, misclassified, or unclassified. Initially, we have a universal set “U” that consists of all three signature labels such that U = {\(S_{1}\), \(S_{2}\), \(S_{3}\)}. The first test flow “011” is passed into the BiTSPLITTER from the root, as shown in Figure 4(a). The first bit of this test flow is “0” and \(w_{root,x_1}\) is {\(S_{2}\)} such that U \(\cap\) {\(S_{2}\)} = {\(S_{2}\)}, the second bit is “1” and \(w_{x_1,y_2}\) is {\(S_{2}\)} such that {\(S_{2}\)} \(\cap\) {\(S_{2}\)} = {\(S_{2}\)}, and the third bit is “1” and \(w_{y_2,y_3}\) is {\(S_{2}\), \(S_{3}\)} such that {\(S_{2}\)} \(\cap\) {\(S_{2}\), \(S_{3}\)} = {\(S_{2}\)}. Since the intersection set consists of the single signature label {\(S_{2}\)}, test flow “011” belongs to signature \(S_{2}\), and hence this is a case of classification. Similarly, the second test flow “111” is passed into the BiTSPLITTER, as shown in Figure 4(b). The first bit of this test flow is “1” and \(w_{root,y_1}\) is {\(S_{1}\), \(S_{2}\), \(S_{3}\)} such that U \(\cap\) {\(S_{1}\), \(S_{2}\), \(S_{3}\)} = {\(S_{1}\), \(S_{2}\), \(S_{3}\)}, the second bit is “1” and \(w_{y_1,y_2}\) is {\(S_{2}\), \(S_{3}\)} such that {\(S_{1}\), \(S_{2}\), \(S_{3}\)} \(\cap\) {\(S_{2}\), \(S_{3}\)} = {\(S_{2}\), \(S_{3}\)}, and the third bit is “1” and \(w_{y_2,y_3}\) is {\(S_{2}\), \(S_{3}\)} such that {\(S_{2}\), \(S_{3}\)} \(\cap\) {\(S_{2}\), \(S_{3}\)} = {\(S_{2}\), \(S_{3}\)}. Since the intersection set consists of two signature labels, {\(S_{2}\), \(S_{3}\)}, test flow “111” matches both \(S_{2}\) and \(S_{3}\), and hence this is a case of indecision or ambiguity. If, instead, the final set consisted of a single signature label such as {\(S_{3}\)} that is not the flow’s true application, this would be a case of misclassification. 
Finally, the third test flow “000” is passed into the BiTSPLITTER, as shown in Figure 4(c). The first bit of this test flow is “0,” and \(w_{root,x_1}\) is {\(S_{2}\)} such that U \(\cap\) {\(S_{2}\)} = {\(S_{2}\)}, the second bit is “0” and \(w_{x_1,x_2}\) is {\(\phi\)} such that {\(S_{2}\)} \(\cap\) {\(\phi\)} = {\(\phi\)}, and the third bit is “0” and \(w_{x_2,x_3}\) is {\(\phi\)} such that {\(\phi\)} \(\cap\) {\(\phi\)} = {\(\phi\)}. Since the intersection set is {\(\phi\)}, test flow “000” does not belong to any signature; hence, it remains unclassified.
Fig. 4. Flow Classification Example
The flow classification process takes as input the BiTSPLITTER, the universal set U of known signature labels {\(S_{1}\), \(S_{2}\), \(S_{3}\)}, an empty SignatureSet, and the test flow “f” that needs to be classified. If a bit of the test flow is “0” and the current vertex is the root, then U is intersected with the weight \(w_{ij}\) of the directed edge from the current vertex to its left vertex, and U is updated with the intersection result. Similarly, for the next vertex, the updated U is intersected with the weight \(w_{ij}\) of the next traversed edge. If a bit of the test flow is “1”, then the same process is repeated toward the current vertex’s right. The final result is stored in SignatureSet{}. If this set is {\(\phi\)}, then “f” remains unclassified. If SignatureSet{} consists of exactly one signature label, then “f” is classified. However, if SignatureSet{} consists of more than one signature label, then “f” is undecided, as in the ambiguous case of the example above. The step-by-step process is shown in Algorithm 2.
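The traversal and intersection steps can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than a transcription of Algorithm 2: the function names are hypothetical, the vertex \(x_i\) is encoded as `(i, "0")` and \(y_i\) as `(i, "1")`, edges that were never updated are treated as carrying the empty set, and the early exit on an empty result is an optimization that the description above does not mention.

```python
from collections import defaultdict

ROOT = "root"

def build_weights(signatures):
    """Edge weights of a BiTSPLITTER-like graph from label -> signature string."""
    weights = defaultdict(set)
    for label, sig in signatures.items():
        prev_nodes = [ROOT]
        for i, sym in enumerate(sig, start=1):
            cur_nodes = [(i, b) for b in ("01" if sym == "*" else sym)]
            for u in prev_nodes:
                for v in cur_nodes:
                    weights[(u, v)].add(label)
            prev_nodes = cur_nodes
    return weights

def classify(weights, universe, flow_bits):
    """Return the set of signature labels that survive the traversal.

    flow_bits is the first n bits of the test flow as a string of '0'/'1'.
    """
    candidates = set(universe)            # starts as the universal set U
    node = ROOT
    for i, bit in enumerate(flow_bits, start=1):
        nxt = (i, bit)                    # left vertex for '0', right vertex for '1'
        candidates &= weights.get((node, nxt), set())
        if not candidates:
            break                         # assumption: stop early once nothing can match
        node = nxt
    return candidates

def verdict(candidates):
    """Map the final set to an outcome; 'misclassified' needs ground truth, so it is omitted."""
    if not candidates:
        return "unclassified"
    return "classified" if len(candidates) == 1 else "undecided"

sigs = {"S1": "101", "S2": "*11", "S3": "111"}
w = build_weights(sigs)
for flow in ("011", "111", "000"):
    result = classify(w, sigs, flow)
    print(flow, sorted(result), verdict(result))
# 011 ['S2'] classified
# 111 ['S2', 'S3'] undecided
# 000 [] unclassified
```

Each test flow visits one edge per bit regardless of how many signatures are stored, which is what allows all signatures to be matched in a single pass.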

4 Complexity Analysis

This section discusses the asymptotic complexity of each module of OptiClass, summarized in Table 1. The first module is the traffic preprocessing module, which examines each packet header and adds the packet to the corresponding flow to reconstruct the flow. If there are “f” flows and each flow is reconstructed from “p” packets, then this module has \(O(f \times p)\) complexity. The next module is the signature generation module, where the application signature is generated from “f” application flows, each of “n” bits. Since the number of bits “n” is a constant, the complexity of this module becomes \(O(f)\). The next module is BiTSPLITTER creation, where the BiTSPLITTER is generated from “m” application signatures, each of length “n.” Since n is a constant, the module’s complexity depends only on m and is \(O(m)\). In the flow classification module, each test flow of “n” bits is matched with the BiTSPLITTER generated from application signatures of length “n.” Since all signatures are compared simultaneously with each of the “n” bits of the “f” test flows, the complexity becomes \(O(f \times n)\).
| Module | Complexity | Explanation |
|---|---|---|
| Traffic Preprocessing Module | \(O(f \times p)\) | f is the total number of flows and p is the number of packets in a flow. |
| Signature Generation | \(O(f)\) | f is the total number of flows of an application A. |
| BiTSPLITTER Creation | \(O(m)\) | m is the number of application signatures to be inserted. |
| Flow Classification | \(O(f \times n)\) | n is the number of bits in each of the f test flows. |
Table 1. Module-Wise Complexity Analysis of OptiClass
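To put the flow classification row in perspective, the following Python sketch counts worst-case bit-level steps for matching test flows against signatures one at a time versus matching all signatures in a single pass per flow. The function names and the figure of 1,000 flows are illustrative assumptions, and each set intersection is counted as a single step, which simplifies the real cost.

```python
def one_by_one_steps(num_flows, num_signatures, num_bits):
    """Worst case when every flow is compared with every signature in turn."""
    return num_flows * num_signatures * num_bits


def single_pass_steps(num_flows, num_bits):
    """One traversal per flow, all signatures handled together: f x n steps."""
    return num_flows * num_bits


# 20 signatures of 32 bits, as in the experiments; 1,000 flows is hypothetical.
print(one_by_one_steps(1000, 20, 32))
print(single_pass_steps(1000, 32))
```

Under these assumptions the gap between the two counts grows linearly with the number of signatures.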

5 Experiments and Results

This section describes the experiments performed to assess the efficiency of OptiClass across various network scenarios. We first discuss the datasets used in the experiments. Next, we run OptiClass on the datasets to evaluate its performance. After that, we perform a sensitivity analysis of OptiClass by varying its key parameters to identify how responsive the method is to them. Finally, the performance of OptiClass is compared with five closely related recent state-of-the-art methods.

5.1 Dataset Description

OptiClass experiments were conducted on three datasets. The first dataset was used in previous articles [17, 18]. The second and third datasets come from two publicly available sources: Digital Corpora [7] and the Swedish Defense Research Agency’s FOI Information Warfare Lab [8]. Together, the datasets cover twenty application protocols, including both text and binary protocols. Among these application protocols, a few are proprietary, while the others are open. Table 2 lists the protocols used in the experiments and their types (text/binary, open/proprietary). In the subsequent sections, we refer to the first dataset as “Private” and to the second and third datasets as “Public-1” and “Public-2”, respectively. We split each dataset into two approximately equal sets, a training set and a testing set, each containing about 50% of the application flows of that dataset. Training datasets are used for generating the signature of each application protocol, and testing datasets are used for identifying traffic flows. The training and testing divisions of Private, Public-1, and Public-2 are shown in Tables 3, 4, and 5, respectively.
| Abbreviation | Protocol | Type | Proprietariness |
| --- | --- | --- | --- |
| BACnet | Building Automation and Control network | Binary | ASHRAE |
| BitTorrent | Bit torrent protocol | Text | No |
| BJNP | Used to communicate with printer | Binary | Canon |
| Bootp | Bootstrap protocol | Binary | No |
| CUPS | Common Unix Printing System | Text | Apple Inc. |
| DNS | Domain Name System | Binary | No |
| Dropbox | Dropbox LAN Sync protocol | Text | Dropbox |
| GsmIp | GSM over Internet protocol | Text | No |
| HTTP | Hyper Text Transfer Protocol | Text | No |
| Kerberos | Kerberos protocol | Binary | No |
| MWBP | Microsoft Windows Browsing Protocol | Text | Microsoft |
| NBNS | NetBIOS Name Service | Binary | No |
| NBSS | NetBIOS Session Service | Binary | No |
| NTP | Network Time Protocol | Binary | No |
| POP | Post Office Protocol | Text | No |
| QUIC | Quick UDP Internet Connections | Binary | No |
| RPC | Remote Procedure Call | Binary | No |
| SIP | Session Initiation Protocol | Text | No |
| SMTP | Simple Mail Transfer Protocol | Text | No |
| SSH | Secure Shell | Binary | No |
Table 2. Application Protocols used in the Experiments
| Protocol | TCP/UDP | Training Flows | Training Size (MB) | Testing Flows | Testing Size (MB) |
| --- | --- | --- | --- | --- | --- |
| BitTorrent | TCP | 789 | 245.8 | 791 | 150.4 |
| DNS | UDP | 32576 | 5.7 | 32762 | 5.7 |
| Dropbox | UDP | 1138 | 98.2 | 1128 | 153.4 |
| HTTP | TCP | 48834 | 220.4 | 48878 | 328.3 |
| SIP | UDP | 609 | 194.1 | 640 | 191.4 |
| SMTP | TCP | 597 | 10.1 | 608 | 22.9 |
| SSH | TCP | 1104 | 6.2 | 1106 | 6.2 |
| Total | - | 85647 | 780.5 | 85913 | 858.3 |
Table 3. Private Dataset Statistics
| Protocol | TCP/UDP | Training Flows | Training Size (MB) | Testing Flows | Testing Size (MB) |
| --- | --- | --- | --- | --- | --- |
| BACnet | UDP | 9 | 0.097 | 11 | 0.074 |
| BJNP | UDP | 34 | 0.026 | 38 | 0.031 |
| Bootp | UDP | 86 | 4.400 | 81 | 4.500 |
| CUPS | UDP | 47 | 0.107 | 45 | 0.218 |
| DNS | UDP | 25469 | 12.900 | 25850 | 11.100 |
| Dropbox | UDP | 26 | 0.109 | 25 | 0.319 |
| HTTP | TCP | 17964 | 151.100 | 17968 | 133.600 |
| MWBP | UDP | 8 | 0.565 | 7 | 0.574 |
| NBNS | UDP | 982 | 7.800 | 982 | 7.500 |
| NTP | UDP | 201 | 0.652 | 201 | 0.141 |
| QUIC | UDP | 127 | 0.110 | 93 | 0.115 |
| SMTP | TCP | 520 | 10.100 | 521 | 9.900 |
| Total | - | 45473 | 187.996 | 45822 | 168.072 |
Table 4. Public-1 Dataset Statistics
| Protocol | TCP/UDP | Training Flows | Training Size (MB) | Testing Flows | Testing Size (MB) |
| --- | --- | --- | --- | --- | --- |
| Bootp | UDP | 91 | 0.080 | 91 | 0.096 |
| DNS | UDP | 963 | 0.865 | 958 | 1.200 |
| GsmIp | TCP | 9 | 0.007 | 9 | 0.015 |
| HTTP | TCP | 257 | 4.800 | 253 | 9.000 |
| Kerberos | UDP | 669 | 1.600 | 672 | 1.900 |
| NBNS | UDP | 290 | 0.853 | 289 | 0.680 |
| NBSS | TCP | 377 | 2.700 | 373 | 3.900 |
| NTP | UDP | 202 | 0.145 | 200 | 0.648 |
| POP | TCP | 56 | 0.035 | 57 | 0.036 |
| RPC | TCP | 7 | 0.020 | 7 | 0.141 |
| Total | - | 2921 | 11.105 | 2909 | 17.616 |
Table 5. Public-2 Dataset Statistics

5.2 Evaluation

OptiClass is implemented in the Java programming language with the JnetPcap packet parsing library and can generate bidirectional flows from network traces. We conducted three types of experiments for an in-depth evaluation of OptiClass: homogeneous, heterogeneous, and grand experiments. In homogeneous experiments, the training and testing parts are taken from the same dataset. These experiments check the efficiency of OptiClass for one site at a time. In heterogeneous experiments, the training part is taken from one dataset, and testing is done on the testing parts of the other two datasets. These experiments evaluate OptiClass for site independence, i.e., whether OptiClass trained at one site can perform at another site. In the grand experiment, all three datasets are combined for training and testing. This experiment checks whether OptiClass becomes more robust and accurate when it is trained with data from multiple locations. We report the performance of OptiClass for all three types of experiments using Recall, Precision, and F1-Score, and we also show the confusion matrix for the grand experiment.

5.2.1 Homogeneous Experiments.

These experiments are done by selecting training and testing sets from the same dataset. Results of all homogeneous experiments are compiled in Table 6. Experiments on the Private dataset are shown in Table 6 a. This table shows that the recall rate of all application protocols except HTTP and SIP is 100%. This means that, in this experiment, all the testing flows are correctly matched with their respective signatures except for HTTP and SIP. The recall rates of HTTP and SIP are 90.95% and 97.96%, respectively, because some testing flows remained unmatched. The precision of HTTP is 100%, except for other protocols. The average F1-score of all protocols is 98.37%. Similarly, experiments done on the Public-1 dataset are shown in Table 6 b. This table shows that the recall rate of most application protocols is 100%, except for BACnet, HTTP, NBNS, and NTP. NTP has a low recall rate because there is a high similarity between the signatures of NTP and QUIC. Moreover, the differentiating bit positions also have “*” values, resulting in misclassification. The average precision for most protocols is 90.52%, except for MWBP and NTP, because other application protocols match the MWBP and NTP signatures. The average F1-score is 97.12%, except for NTP. Finally, experiments on the Public-2 dataset are shown in Table 6 c. This table shows that the recall of all applications is more than 98% except for NBSS. The reason is that the testing flows of NBSS matched the signatures of Kerberos and NBNS because of the relatively high number of “*” in the Kerberos and NBNS signatures. The precision of all application protocols is 100% except for Kerberos, because other protocols match the Kerberos signature. The average F1-score is 98.62%.
Table 6. Recall, Precision, and F1-Score for Homogeneous Experiments

5.2.2 Heterogeneous Experiments.

These experiments are done by selecting a training set from one dataset and testing sets from the other two datasets. The first heterogeneous experiment is performed by using training data from the Private dataset and testing data from the Public-1 and Public-2 datasets. The results for this experiment are shown in Table 7. We can see from Table 7 a that a 100% recall is achieved except for HTTP, because some of these flows remain misclassified. Precision is 100% except for DNS. The reason is that other protocols match the DNS signature because of the relatively higher number of “*” in it. The average F1-score is 93.67%. Similar results can be seen in Table 7 b. The second heterogeneous experiment is performed using training data from the Public-1 dataset and testing data from the Private and Public-2 datasets. The results for this experiment are shown in Table 8. We can see in Table 8 a that more than 90% recall is obtained. Precision is 100% for all application protocols. The average F1-score is 99.16%. However, in Table 8 b, NBNS has a comparatively lower recall rate because the remaining flows of NBNS are misclassified as NTP. Precision is 100% except for DNS. The average F1-score is 94.04%. The third heterogeneous experiment is performed by using training data from the Public-2 dataset and testing data from the Private and Public-1 datasets. The recall, precision, and F1-score for this experiment are shown in Table 9. In both Table 9 a and Table 9 b, every recall value is more than 93%. In Table 9 a, precision is 100% except for DNS. The average F1-score is 99.22%. In Table 9 b, precision is 100% except for DNS and NBNS. The average F1-score is 94.04%.
Table 7. Recall, Precision, and F1-Score for Training with Private Dataset
Table 8. Recall, Precision, and F1-Score for Training with Public-1 Dataset
Table 9. Recall, Precision, and F1-Score for Training with Public-2 Dataset

5.2.3 Grand Experiment.

In this experiment, we combined the training data of all three datasets to create a BiTSPLITTER. Then, we tested the efficiency of the BiTSPLITTER on the testing data of the grand dataset, the Private dataset, the Public-1 dataset, and the Public-2 dataset. The recall, precision, and F1-score for each set of testing data are shown in Table 10. We obtained an overall improved recall rate, precision, and F1-Score compared to those in the homogeneous and heterogeneous experiments. On the grand testing data, we obtained an average recall, precision, and F1-score of 97.36%, 97.38%, and 97.37%, respectively. Thus, it can be deduced that the BiTSPLITTER generated from the training data of the grand dataset is more robust than those of the homogeneous and heterogeneous experiments. The flow-wise classification of the testing data in the grand experiment is shown as a confusion matrix in Table 11. The first column of the confusion matrix lists the application signatures, and the first row gives the number of testing flows of each protocol. The confusion matrix shows the classification of each test flow when tested against all the available application signatures. In Table 11, the misclassified flows (the non-zero off-diagonal entries) are shown in red. For example, BACnet has 11 testing flows, all of which match only the signature of BACnet. This represents no misclassification for BACnet test flows. Similarly, no misclassification is obtained for BitTorrent, BJNP, Bootp, CUPS, Dropbox, GsmIpa, Kerberos, MWBP, POP, QUIC, RPC, SMTP, and SSH. However, DNS has 59,577 testing flows, out of which 34, 2, 1, and 38 are misclassified as Kerberos, MWBP, NTP, and QUIC, respectively. Similarly, HTTP has 67,099 testing flows, out of which 101, 26, 4, 4, 1, 1, 8,864, and 67 are misclassified as CUPS, Kerberos, MWBP, NBNS, NBSS, NTP, QUIC, and SIP, respectively. A similar case appears for NBNS, NBSS, NTP, and SIP.
In our confusion matrix, more than one value appears in red, indicating that testing flows of one application protocol are misclassified with more than one signature of other application protocols.
| Protocol | Grand Recall | Grand Precision | Grand F1-Score | Private Recall | Private Precision | Private F1-Score | Public-1 Recall | Public-1 Precision | Public-1 F1-Score | Public-2 Recall | Public-2 Precision | Public-2 F1-Score |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BACnet | 100.00 | 100.00 | 100.00 | - | - | - | 100.00 | 100.00 | 100.00 | - | - | - |
| BJNP | 100.00 | 100.00 | 100.00 | - | - | - | 100.00 | 100.00 | 100.00 | - | - | - |
| BitTorrent | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | - | - | - | - | - | - |
| Bootp | 100.00 | 100.00 | 100.00 | - | - | - | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| CUPS | 100.00 | 100.00 | 100.00 | - | - | - | 100.00 | 95.74 | 97.82 | - | - | - |
| DNS | 99.87 | 99.93 | 99.89 | 99.77 | 99.96 | 99.86 | 99.99 | 99.53 | 99.75 | 99.89 | 92.37 | 95.98 |
| Dropbox | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | - | - | - |
| GsmIp | 100.00 | 100.00 | 100.00 | - | - | - | - | - | - | 100.00 | 100.00 | 100.00 |
| HTTP | 86.48 | 86.74 | 86.60 | 90.95 | 100.00 | 95.26 | 74.30 | 100.00 | 85.25 | 88.53 | 100.00 | 93.91 |
| Kerberos | 100.00 | 100.00 | 100.00 | - | - | - | - | - | - | 100.00 | 92.68 | 96.20 |
| MWBP | 100.00 | 100.00 | 100.00 | - | - | - | 100.00 | 87.50 | 93.33 | - | - | - |
| NBNS | 84.18 | 84.31 | 84.24 | - | - | - | 87.67 | 100.00 | 93.42 | 72.31 | 100.00 | 83.93 |
| NBSS | 85.79 | 85.79 | 85.79 | - | - | - | - | - | - | 85.79 | 100.00 | 92.35 |
| NTP | 97.25 | 97.25 | 97.25 | - | - | - | 94.52 | 100.00 | 97.18 | 100.00 | 100.00 | 100.00 |
| POP | 100.00 | 100.00 | 100.00 | - | - | - | - | - | - | 100.00 | 100.00 | 100.00 |
| QUIC | 100.00 | 100.00 | 100.00 | - | - | - | 100.00 | 68.88 | 81.57 | - | - | - |
| RPC | 100.00 | 100.00 | 100.00 | - | - | - | - | - | - | 100.00 | 100.00 | 100.00 |
| SIP | 93.75 | 93.75 | 93.75 | 84.37 | 99.44 | 91.28 | - | - | - | - | - | - |
| SMTP | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | - | - | - |
| SSH | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | - | - | - | - | - | - |
Table 10. Recall, Precision, and F1-Score for Grand Dataset
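| Protocols | BACnet (11) | BitTorrent (791) | BJNP (38) | Bootp (172) | CUPS (45) | DNS (59577) | Dropbox (1153) | GsmIpa (9) | HTTP (67099) | Kerberos (672) | MWBP (7) | NBNS (1271) | NBSS (373) | NTP (401) | POP (57) | QUIC (93) | RPC (7) | SIP (640) | SMTP (1129) | SSH (1106) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BACnet | 11 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| BitTorrent | 0 | 791 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| BJNP | 0 | 0 | 38 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Bootp | 0 | 0 | 0 | 172 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| CUPS | 0 | 0 | 0 | 0 | 45 | 0 | 0 | 0 | 101 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| DNS | 0 | 0 | 0 | 0 | 0 | 59502 | 0 | 0 | 0 | 0 | 0 | 199 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Dropbox | 0 | 0 | 0 | 0 | 0 | 0 | 1153 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| GsmIpa | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 9 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| HTTP | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 58031 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Kerberos | 0 | 0 | 0 | 0 | 0 | 34 | 0 | 0 | 26 | 672 | 0 | 0 | 53 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| MWBP | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 4 | 0 | 7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| NBNS | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0 | 0 | 1070 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| NBSS | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 320 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| NTP | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 390 | 0 | 0 | 0 | 0 | 0 | 0 |
| POP | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 57 | 0 | 0 | 0 | 0 | 0 |
| QUIC | 0 | 0 | 0 | 0 | 0 | 38 | 0 | 0 | 8864 | 0 | 0 | 2 | 0 | 11 | 0 | 93 | 0 | 40 | 0 | 0 |
| RPC | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 7 | 0 | 0 | 0 |
| SIP | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 67 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 600 | 0 | 0 |
| SMTP | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1129 | 0 |
| SSH | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1106 |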
Table 11. Confusion Matrix for Evaluation of Grand Dataset with 32 bits
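As a worked check of how the reported figures relate to each other, the Python sketch below recomputes two recall values from the Table 11 counts and one F1-score from a recall and precision pair in Table 10. It assumes the standard definitions (recall as correctly matched flows over all testing flows of a protocol, F1 as the harmonic mean of recall and precision); this section does not spell out the exact formulas the authors used, and the function names are hypothetical.

```python
def recall(correct, total):
    """Recall in percent: correctly matched test flows over all test flows."""
    return 100.0 * correct / total


def f1_score(recall_pct, precision_pct):
    """Harmonic mean of recall and precision, both given in percent."""
    if recall_pct + precision_pct == 0:
        return 0.0
    return 2 * recall_pct * precision_pct / (recall_pct + precision_pct)


# Table 11: 320 of 373 NBSS flows and 600 of 640 SIP flows match their own signature.
print(recall(320, 373))
print(recall(600, 640))
# Table 10, Private dataset, HTTP: recall 90.95 and precision 100.00.
print(f1_score(90.95, 100.0))
```

Under these definitions, the results agree with the reported NBSS and SIP recall (85.79 and 93.75) and the reported HTTP F1-score on the Private dataset (95.26) to within rounding.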

5.3 Sensitivity Analysis

We performed a sensitivity analysis of OptiClass to identify the responsiveness of the method to the number of bits used for signature generation and classification and to the threshold value. We conducted two experiments for the sensitivity analysis: the first varies the signature length, and the second varies the threshold value. Results for both experiments are reported using the Recall, Precision, and F1-Score evaluation metrics.

5.3.1 Signature Length.

Choosing the signature length is highly significant for obtaining good classification results with minimum classification duration. Moreover, a suitably short signature keeps the intrusion on user privacy minimal. Therefore, we conducted experiments on the grand dataset by training and testing with 16, 32, and 48 bits. Table 12 shows the results of these experiments. OptiClass yields a lower average recall, precision, and F1-score with 16 bits. The reason is that the number of bits used for classification is fewer than required, which means that introducing a few more bits into the signature may improve classification. Accordingly, an improvement in recall is observed when testing with 32-bit signatures, and the average recall, precision, and F1-score are all good with 32-bit application signatures. However, OptiClass with 48-bit signatures did not show any significant recall improvement. This happened because the additional bit positions contain more “*” values than fixed values in the signatures.
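| Protocols | 16 Bits Recall | 16 Bits Precision | 16 Bits F1-Score | 32 Bits Recall | 32 Bits Precision | 32 Bits F1-Score | 48 Bits Recall | 48 Bits Precision | 48 Bits F1-Score |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BACnet | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| BJNP | 97.36 | 100.00 | 98.66 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| BitTorrent | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| Bootp | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| CUPS | 88.88 | 100.00 | 94.11 | 100.00 | 100.00 | 100.00 | 100.00 | 49.45 | 66.17 |
| DNS | 100.00 | 100.00 | 100.00 | 99.87 | 99.93 | 99.89 | 99.87 | 99.70 | 99.78 |
| Dropbox | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| GsmIp | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| HTTP | 86.48 | 100.00 | 92.74 | 86.48 | 86.74 | 86.60 | 86.48 | 100.00 | 92.74 |
| Kerberos | 93.15 | 100.00 | 96.45 | 100.00 | 100.00 | 100.00 | 90.62 | 82.85 | 86.56 |
| MWBP | 85.71 | 100.00 | 92.30 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| NBNS | 100.00 | 12.12 | 21.61 | 84.18 | 84.31 | 84.24 | 84.18 | 100.00 | 91.41 |
| NBSS | 99.73 | 100.00 | 99.86 | 85.79 | 85.79 | 85.79 | 78.55 | 100.00 | 87.98 |
| NTP | 91.27 | 100.00 | 95.43 | 97.25 | 97.25 | 97.25 | 97.25 | 100.00 | 98.60 |
| POP | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| QUIC | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| RPC | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| SIP | 91.25 | 100.00 | 95.42 | 93.75 | 93.75 | 93.75 | 91.25 | 91.67 | 91.45 |
| SMTP | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| SSH | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| Average | 96.69 | 95.60 | 94.33 | 97.36 | 97.38 | 97.37 | 96.41 | 96.18 | 95.73 |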
Table 12. Sensitivity Analysis by Varying Signature Length

5.3.2 Threshold Value.

In OptiClass, if a few flows are broken or incomplete, they may influence signature generation through mismatches in bit positions. If we keep the threshold at “1,” a large number of bit positions will have a “*” value because of the broken flows. This results in a weak application signature and reduces the classification accuracy of OptiClass. A reduced threshold value is therefore required to address this problem, and we used a threshold value of 0.9 for OptiClass. Before choosing this value, we conducted an experimental analysis on the grand dataset with threshold values of 0.7, 0.75, 0.8, 0.85, 0.9, and 0.95 and observed the recall, precision, and F1-score for each application protocol. The experiments show satisfactory results at a threshold value of 0.9, as shown in Table 13.
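| Protocol | 0.7 Recall | 0.7 Precision | 0.7 F1-Score | 0.75 Recall | 0.75 Precision | 0.75 F1-Score | 0.8 Recall | 0.8 Precision | 0.8 F1-Score | 0.85 Recall | 0.85 Precision | 0.85 F1-Score | 0.9 Recall | 0.9 Precision | 0.9 F1-Score | 0.95 Recall | 0.95 Precision | 0.95 F1-Score |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BACnet | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| BJNP | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| BitTorrent | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| Bootp | 79.06 | 100.00 | 88.30 | 79.06 | 100.00 | 88.30 | 79.06 | 100.00 | 88.30 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| CUPS | 66.66 | 88.23 | 75.94 | 77.77 | 76.08 | 76.91 | 88.88 | 54.05 | 67.22 | 100.00 | 33.58 | 50.27 | 100.00 | 100.00 | 100.00 | 100.00 | 32.37 | 48.90 |
| DNS | 99.94 | 99.65 | 99.79 | 99.94 | 99.66 | 99.79 | 99.93 | 99.66 | 99.79 | 99.93 | 99.66 | 99.79 | 99.87 | 99.93 | 99.89 | 97.98 | 100.00 | 98.97 |
| Dropbox | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 99.22 | 100.00 | 99.60 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| GsmIp | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 50.00 | 66.66 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| HTTP | 86.48 | 100.00 | 92.74 | 86.48 | 100.00 | 92.74 | 86.48 | 100.00 | 92.74 | 86.48 | 100.00 | 92.74 | 86.48 | 86.74 | 86.60 | 95.32 | 100.00 | 97.60 |
| Kerberos | 23.36 | 65.96 | 34.50 | 75.29 | 84.61 | 79.67 | 82.14 | 85.18 | 83.63 | 82.14 | 85.58 | 83.82 | 100.00 | 100.00 | 100.00 | 100.00 | 95.86 | 97.88 |
| MWBP | 85.71 | 100.00 | 92.30 | 71.42 | 100.00 | 83.32 | 100.00 | 77.77 | 87.49 | 100.00 | 77.77 | 87.49 | 100.00 | 100.00 | 100.00 | 100.00 | 70.00 | 82.35 |
| NBNS | 84.18 | 97.71 | 90.44 | 84.18 | 98.61 | 90.82 | 84.18 | 99.07 | 91.02 | 84.18 | 99.44 | 91.17 | 84.18 | 84.31 | 84.24 | 99.92 | 51.25 | 67.75 |
| NBSS | 77.74 | 98.97 | 87.07 | 77.74 | 99.31 | 87.21 | 77.74 | 99.31 | 87.21 | 78.28 | 99.32 | 87.55 | 85.79 | 85.79 | 85.79 | 98.92 | 99.73 | 99.32 |
| NTP | 56.60 | 100.00 | 72.28 | 71.57 | 100.00 | 83.42 | 87.92 | 99.70 | 93.44 | 95.51 | 99.74 | 97.57 | 97.25 | 97.25 | 97.25 | 99.25 | 12.72 | 22.54 |
| POP | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| QUIC | 100.00 | 93.00 | 96.37 | 100.00 | 97.00 | 98.47 | 100.00 | 98.00 | 98.98 | 100.00 | 99.00 | 99.49 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| RPC | 71.42 | 8.77 | 15.62 | 85.71 | 100.00 | 92.30 | 85.71 | 100.00 | 92.30 | 85.71 | 100.00 | 92.30 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| SIP | 80.00 | 100.00 | 88.88 | 80.00 | 90.78 | 85.04 | 80.00 | 90.78 | 85.04 | 80.00 | 90.78 | 85.04 | 93.75 | 93.75 | 93.75 | 93.75 | 89.82 | 91.74 |
| SMTP | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| SSH | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| Average | 85.55 | 92.61 | 86.71 | 89.45 | 97.30 | 92.90 | 92.56 | 92.67 | 91.67 | 94.61 | 94.24 | 93.36 | 97.36 | 97.38 | 97.37 | 99.25 | 87.58 | 90.35 |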
Table 13. Sensitivity Analysis by Varying Threshold Value
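The role of the threshold can be illustrated with a minimal Python sketch of threshold-based signature generation. It rests on an assumption that is not confirmed by this section: a bit position is fixed to “0” or “1” when at least the threshold fraction of training flows agree on that value, and set to “*” otherwise. The function name `generate_signature`, the equal-length flows, and the toy data are hypothetical.

```python
def generate_signature(flows, threshold=0.9):
    """Derive a signature over '0', '1', '*' from equal-length bit strings.

    Assumed rule (for illustration): fix a position when the share of flows
    agreeing on one value reaches the threshold, otherwise emit '*'.
    """
    total = len(flows)
    signature = []
    for i in range(len(flows[0])):
        ones = sum(1 for flow in flows if flow[i] == "1")
        zeros = total - ones
        if ones / total >= threshold:
            signature.append("1")
        elif zeros / total >= threshold:
            signature.append("0")
        else:
            signature.append("*")
    return "".join(signature)


# 19 well-formed flows plus one broken flow that disagrees in two positions.
TRAINING_FLOWS = ["1010"] * 19 + ["0110"]
print(generate_signature(TRAINING_FLOWS, threshold=1.0))
print(generate_signature(TRAINING_FLOWS, threshold=0.9))
```

With a threshold of 1, the single broken flow turns the two positions where it disagrees into “*”, whereas a threshold of 0.9 tolerates it and keeps all four positions fixed, which mirrors the motivation given above for lowering the threshold.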

5.4 Performance Comparison

We compared the performance of OptiClass with five closely related state-of-the-art methods: BitMiner [41], BitFlow [40], BitPack [40], BitCoding [18], and BitProb [19]. All five are supervised bit-level network traffic classification methods. We compared OptiClass with all five methods by testing them on the grand dataset using recall, precision, and F1-score as evaluation parameters. This comparison is shown in Table 14. We can see that OptiClass performed equally or better for 15 out of 20 application protocols. We then compared the classification time of OptiClass with that of the other five methods to identify the fastest among them. The classification time is measured in the same experiment used for the performance comparison. Table 15 shows the classification time comparison. We deduce from Table 15 that OptiClass is, on average, 3.13, 2.94, 2.06, 15.20, and 22.24 times faster than BitMiner, BitFlow, BitPack, BitCoding, and BitProb, respectively. The reason for the faster classification of OptiClass is BiTSPLITTER, which performs simultaneous signature matching.
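| Protocols | BitMiner Recall | BitMiner Precision | BitMiner F1-Score | BitFlow Recall | BitFlow Precision | BitFlow F1-Score | BitPack Recall | BitPack Precision | BitPack F1-Score | BitCoding Recall | BitCoding Precision | BitCoding F1-Score | BitProb Recall | BitProb Precision | BitProb F1-Score | OptiClass Recall | OptiClass Precision | OptiClass F1-Score |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BACnet | 100.00 | 100.00 | 100.00 | 50.00 | 100.00 | 66.66 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| BJNP | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| BitTorrent | 100.00 | 100.00 | 100.00 | 88.77 | 100.00 | 94.05 | 75.00 | 100.00 | 85.71 | 100.00 | 100.00 | 100.00 | 99.36 | 100.00 | 99.67 | 100.00 | 100.00 | 100.00 |
| Bootp | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 97.13 | 100.00 | 98.54 | 100.00 | 100.00 | 100.00 |
| CUPS | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 93.61 | 73.07 | 82.07 | 82.60 | 73.07 | 77.54 | 100.00 | 100.00 | 100.00 |
| DNS | 43.90 | 100.00 | 61.01 | 99.68 | 100.00 | 99.83 | 100.00 | 100.00 | 100.00 | 99.75 | 99.99 | 99.86 | 97.93 | 98.06 | 97.99 | 99.87 | 99.93 | 99.89 |
| Dropbox | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 99.87 | 99.87 | 99.87 | 100.00 | 100.00 | 100.00 |
| GsmIp | 100.00 | 100.00 | 100.00 | 16.66 | 100.00 | 28.56 | 66.66 | 100.00 | 79.99 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| HTTP | 99.48 | 100.00 | 99.74 | 36.22 | 100.00 | 53.17 | 53.33 | 100.00 | 69.56 | 100.00 | 100.00 | 100.00 | 86.47 | 99.88 | 92.69 | 86.48 | 86.74 | 86.60 |
| Kerberos | 83.03 | 100.00 | 90.73 | 22.86 | 100.00 | 37.21 | 25.00 | 100.00 | 40.00 | 100.00 | 98.60 | 99.29 | 100.00 | 98.60 | 99.29 | 100.00 | 100.00 | 100.00 |
| MWBP | 0.00 | 0.00 | 0.00 | 93.16 | 100.00 | 96.45 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| NBNS | 4.72 | 100.00 | 9.01 | 94.47 | 100.00 | 97.15 | 66.66 | 100.00 | 79.99 | 99.05 | 52.42 | 68.55 | 99.05 | 99.52 | 99.28 | 84.18 | 84.31 | 84.24 |
| NBSS | 98.65 | 100.00 | 99.32 | 33.75 | 100.00 | 50.46 | 66.66 | 100.00 | 79.99 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 85.79 | 85.79 | 85.79 |
| NTP | 99.50 | 100.00 | 99.75 | 94.11 | 100.00 | 96.96 | 92.85 | 100.00 | 96.29 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 97.25 | 97.25 | 97.25 |
| POP | 100.00 | 100.00 | 100.00 | 66.66 | 100.00 | 79.99 | 66.66 | 100.00 | 79.99 | 100.00 | 100.00 | 100.00 | 99.11 | 100.00 | 99.55 | 100.00 | 100.00 | 100.00 |
| QUIC | 21.50 | 100.00 | 35.39 | 22.22 | 100.00 | 36.36 | 85.00 | 100.00 | 91.89 | 100.00 | 100.00 | 100.00 | 99.54 | 100.00 | 99.76 | 100.00 | 100.00 | 100.00 |
| RPC | 100.00 | 100.00 | 100.00 | 75.00 | 100.00 | 85.71 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| SIP | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 50.00 | 100.00 | 66.66 | 91.03 | 100.00 | 95.30 | 91.03 | 100.00 | 95.30 | 93.75 | 93.75 | 93.75 |
| SMTP | 100.00 | 100.00 | 100.00 | 60.52 | 100.00 | 75.40 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| SSH | 100.00 | 100.00 | 100.00 | 12.34 | 100.00 | 21.96 | 66.66 | 100.00 | 79.99 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| Average | 82.53 | 95.00 | 84.74 | 68.32 | 100.00 | 75.99 | 77.03 | 95.00 | 83.31 | 99.17 | 96.20 | 97.25 | 97.60 | 98.45 | 97.97 | 97.36 | 97.38 | 97.37 |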
Table 14. Performance Comparison on Grand Dataset
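| Protocols | BitMiner | BitFlow | BitPack | BitCoding | BitProb | OptiClass |
| --- | --- | --- | --- | --- | --- | --- |
| BACnet | 1.86 | 1.00 | 1.00 | 1.17 | 0.40 | 0.02 |
| BJNP | 1.84 | 1.00 | 1.00 | 1.95 | 0.38 | 0.07 |
| BitTorrent | 1.79 | 6.14 | 2.35 | 4.34 | 4.16 | 0.24 |
| Bootp | 1.84 | 1.00 | 1.00 | 0.58 | 1.25 | 0.10 |
| CUPS | 1.80 | 1.00 | 1.00 | 1.66 | 0.57 | 0.04 |
| DNS | 1.98 | 4.46 | 4.10 | 43.66 | 24.00 | 4.88 |
| Dropbox | 1.86 | 7.22 | 4.50 | 8.94 | 10.00 | 0.20 |
| GsmIp | 1.77 | 1.00 | 1.00 | 0.32 | 0.41 | 0.02 |
| HTTP | 2.07 | 6.56 | 3.50 | 98.57 | 184.00 | 4.69 |
| Kerberos | 1.84 | 0.03 | 0.02 | 1.01 | 5.93 | 0.18 |
| MWBP | 1.75 | 0.04 | 0.02 | 1.35 | 0.45 | 0.02 |
| NBNS | 1.87 | 0.70 | 0.40 | 1.35 | 2.06 | 0.23 |
| NBSS | 1.78 | 0.04 | 0.03 | 0.67 | 1.57 | 0.16 |
| NTP | 1.70 | 0.01 | 0.01 | 0.58 | 1.31 | 0.14 |
| POP | 1.82 | 1.00 | 1.00 | 0.33 | 0.59 | 0.05 |
| QUIC | 1.76 | 0.01 | 0.01 | 1.77 | 0.64 | 0.05 |
| RPC | 1.74 | 1.00 | 1.00 | 0.26 | 0.45 | 0.01 |
| SIP | 1.84 | 1.00 | 1.00 | 5.04 | 10.20 | 0.18 |
| SMTP | 1.79 | 0.59 | 0.54 | 1.62 | 1.95 | 0.24 |
| SSH | 1.85 | 0.68 | 0.57 | 1.41 | 7.75 | 0.18 |
| Average | 1.82 | 1.71 | 1.20 | 8.82 | 12.90 | 0.58 |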
Table 15. Classification Time Comparison (in seconds) on Grand Dataset
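The speed-up factors quoted in this section can be reproduced from the Average row of Table 15 by dividing each method's average classification time by that of OptiClass. The short Python sketch below does this; the variable and function names are illustrative.

```python
# Average classification times in seconds, taken from the last row of Table 15.
AVERAGE_SECONDS = {
    "BitMiner": 1.82,
    "BitFlow": 1.71,
    "BitPack": 1.20,
    "BitCoding": 8.82,
    "BitProb": 12.90,
}
OPTICLASS_SECONDS = 0.58


def speedups(baseline_seconds, reference_seconds):
    """Ratio of each baseline's time to the reference method's time."""
    return {name: t / reference_seconds for name, t in baseline_seconds.items()}


for name, factor in speedups(AVERAGE_SECONDS, OPTICLASS_SECONDS).items():
    print(f"{name}: {factor:.4f}x")
```

The ratios agree with the quoted 3.13, 2.94, 2.06, 15.20, and 22.24 to within 0.01; the quoted values appear to be truncated to two decimals rather than rounded.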

6 Conclusion and Future Work

In this article, we presented the OptiClass framework, which uses bit-level signatures to classify application layer protocols accurately. OptiClass uses the first “n” bits of data extracted from the payload of bidirectional flows. The flows of each application are used to generate the application signature from invariant bits at fixed positions. Subsequently, the BiTSPLITTER is created, containing the signatures of all the application protocols. The BiTSPLITTER is then used for accurate, fast, and efficient flow classification. With extensive experimentation, we showed that OptiClass is a robust method for network traffic classification that uses only 32-bit application signatures and achieves an average recall, precision, and F1-score of 97.36%, 97.38%, and 97.37%, respectively. Moreover, when compared with the five state-of-the-art methods BitMiner, BitFlow, BitPack, BitCoding, and BitProb, OptiClass performed classification approximately 3.13, 2.94, 2.06, 15.20, and 22.24 times faster, respectively, because of the use of the novel data structure BiTSPLITTER. OptiClass has two limitations. First, it is a supervised method that requires labeled application flows to generate their signatures. Second, it does not address encrypted network traffic. In the future, we aim to develop an unsupervised approach that builds on OptiClass for encrypted network traffic classification. Moreover, a hardware implementation of OptiClass is an interesting future direction, and deploying the system within an actual network will be a focal point.

7 Acknowledgments

We sincerely acknowledge that this work is supported by Science and Engineering Research Board (SERB)

References

[2]
[n.d.]. https://www.emule-project.net. Accessed: 2022-09-20.
[3]
[n.d.]. https://www.bittorrent.com/. Accessed: 2022-09-20.
[4]
[n.d.]. https://www.torproject.org/. Accessed: 2022-09-20.
[7]
[n.d.]. https://digitalcorpora.org/. Accessed: 2022-11-01.
[8]
[n.d.]. https://www.netresec.com/. Accessed: 2022-11-01.
[9]
[n.d.]. Internet Assigned Numbers Authority (IANA). https://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.xhtml. Accessed: 2022-09-19.
[10]
Alfred V. Aho and Margaret J. Corasick. 1975. Efficient string matching: An aid to bibliographic search. Communications of the ACM 18, 6 (1975), 333–340.
[11]
Furat Al-Obaidy, Shadi Momtahen, Md Foysal Hossain, and Farah Mohammadi. 2019. Encrypted traffic classification based ML for identifying different social media applications. In Proceeding of the 32nd IEEE Canadian Conference of Electrical and Computer Engineering. IEEE, 1–5.
[12]
Riyad Alshammari and A Nur Zincir-Heywood. 2011. Can encrypted traffic be identified without port numbers, IP addresses and payload inspection? Computer Networks 55, 6 (2011), 1326–1350.
[13]
Lei Ding, Jun Liu, Tao Qin, and Haifei Li. 2017. Internet traffic classification based on expanding vector of flow. Computer Networks 129, 1 (2017), 178–192.
[14]
Dinil Mon Divakaran, Le Su, Yung Siang Liau, and Vrizlynn LL Thing. 2015. SLIC: Self-learning intelligent classifier for network traffic. Computer Networks 91, 1 (2015), 283–297.
[15]
Luigi Grimaudo, Marco Mellia, and Elena Baralis. 2012. Hierarchical learning for fine grained internet traffic classification. In Proceedings of the 8th International Wireless Communications and Mobile Computing Conference. 463–468.
[16]
Neminath Hubballi and Pratibha Khandait. 2022. KeyClass: Efficient keyword matching for network traffic classification. Computer Communications 185, 1 (2022), 79–91.
[17]
Neminath Hubballi and Mayank Swarnkar. 2017. BitCoding: Protocol type agnostic robust bit level signatures for traffic classification. In Proceedings of the 32nd IEEE Global Communications Conference. 1–6.
[18]
Neminath Hubballi and Mayank Swarnkar. 2018. BitCoding: Network traffic classification through encoded bit level signatures. IEEE/ACM Transactions on Networking 26, 5 (2018), 2334–2346.
[19]
Neminath Hubballi, Mayank Swarnkar, and Mauro Conti. 2020. BitProb: Probabilistic bit signatures for accurate application identification. IEEE Transactions on Network and Service Management 17, 3 (2020), 1730–1741.
[20]
Maya Kapoor, Garrett Fuchs, and Jonathan Quance. 2021. RExACtor: Automatic regular expression signature generation for stateless packet inspection. In Proceedings of the 20th IEEE International Symposium on Network Computing and Applications. 1–9.
[21]
Thomas Karagiannis, Andre Broido, Michalis Faloutsos, and KC Claffy. 2004. Transport layer identification of P2P traffic. In Proceedings of the 4th ACM SIGCOMM Conference on Internet Measurement. 121–134.
[22]
Jan Kohout, Tomáš Komárek, Přemysl Čech, Jan Bodnar, and Jakub Lokoč. 2018. Learning communication patterns for malware discovery in HTTPs data. Expert Systems with Applications 101, 1 (2018), 129–142.
[23]
Ying-Dar Lin, Chun-Nan Lu, Yuan-Cheng Lai, Wei-Hao Peng, and Po-Ching Lin. 2009. Application classification using packet size distribution and port association. Journal of Network and Computer Applications 32, 5 (2009), 1023–1030.
[24]
Andrew W. Moore and Konstantina Papagiannaki. 2005. Toward the accurate identification of network applications. In Proceedings of the 6th International Workshop on Passive and Active Network Measurement. 41–54.
[25]
Vladimir A. Muliukha, Leonid U. Laboshin, Alexey A. Lukashin, and Nikolay V. Nashivochnikov. 2020. Analysis and classification of encrypted network traffic using machine learning. In Proceedings of the 23rd International Conference on Soft Computing and Measurements. 194–197.
[26]
Osborn N. Nyasore, Pavol Zavarsky, Bobby Swar, Raphael Naiyeju, and Shubham Dabra. 2020. Deep packet inspection in industrial automation control system to mitigate attacks exploiting Modbus/TCP vulnerabilities. In Proceedings of the 6th IEEE Intl Conference on Big Data Security on Cloud, IEEE Intl Conference on High Performance and Smart Computing, and IEEE Intl Conference on Intelligent Data and Security. 241–245.
[27]
Byung-Chul Park, Young J. Won, Myung-Sup Kim, and James W. Hong. 2008. Towards automated application signature generation for traffic identification. In Proceedings of the 9th IEEE Network Operations and Management Symposium. 160–167.
[28]
Gonzalo De La Torre Parra, Paul Rad, and Kim-Kwang Raymond Choo. 2019. Implementation of deep packet inspection in smart grids and industrial internet of things: Challenges and opportunities. Journal of Network and Computer Applications 135, 1 (2019), 32–46.
[29]
Buyun Qu, Zhibin Zhang, Xingquan Zhu, and Dan Meng. 2015. An empirical study of morphing on behavior-based network traffic classification. Security and Communication Networks 8, 1 (2015), 68–79.
[30]
Muhammad Shafiq, Xiangzhan Yu, Ali Kashif Bashir, Hassan Nazeer Chaudhry, and Dawei Wang. 2018. A machine learning approach for feature selection traffic classification using security analysis. The Journal of Supercomputing 74, 10 (2018), 4867–4892.
[31]
Muhammad Shafiq, Xiangzhan Yu, Asif Ali Laghari, Lu Yao, Nabin Kumar Karn, and Foudil Abdessamia. 2016. Network traffic classification techniques and comparative analysis using machine learning algorithms. In Proceedings of the 2nd IEEE International Conference on Computer and Communications. 2451–2455.
[32]
Mayank Swarnkar and Neminath Hubballi. 2018. RDClass: On using relative distance of keywords for accurate network traffic classification. IET Networks 7, 4 (2018), 273–279.
[33]
Alok Tongaonkar, Ruben Torres, Marios Iliofotou, Ram Keralapura, and Antonio Nucci. 2015. Towards self adaptive network traffic classification. Computer Communications 56, 1 (2015), 35–46.
[34]
Thijs van Ede, Riccardo Bortolameotti, Andrea Continella, Jingjing Ren, Daniel J. Dubois, Martina Lindorfer, David Choffnes, Maarten van Steen, and Andreas Peter. 2020. FlowPrint: Semi-supervised mobile-app fingerprinting on encrypted network traffic. In Proceedings of the 30th Network and Distributed System Security Symposium. 1–18.
[35]
Xiang Wang, Yang Hong, Harry Chang, KyoungSoo Park, Geoff Langdale, Jiayu Hu, and Heqing Zhu. 2019. Hyperscan: A fast multi-pattern regex matcher for modern CPUs. In Proceedings of the 16th USENIX Symposium on Networked Systems Design and Implementation. 631–648.
[36]
Xiao Wang, Ying Liu, and Wei Su. 2019. Real-time classification method of network traffic based on parallelized CNN. In Proceedings of the 1st IEEE International Conference on Power, Intelligent Computing and Systems. 92–97.
[37]
Yu Wang, Yang Xiang, Wanlei Zhou, and Shunzheng Yu. 2012. Generating regular expression signatures for network traffic classification in trusted network management. Journal of Network and Computer Applications 35, 3 (2012), 992–1000.
[38]
Yipeng Wang, Xiaochun Yun, M Zubair Shafiq, Liyan Wang, Alex X. Liu, Zhibin Zhang, Danfeng Yao, Yongzheng Zhang, and Li Guo. 2012. A semantics aware approach to automated reverse engineering unknown protocols. In Proceedings of the 20th IEEE International Conference on Network Protocols. 1–10.
[39]
Mingjiang Ye, Ke Xu, Jianping Wu, and Hu Po. 2009. Autosig-automatically generating signatures for applications. In Proceedings of the 9th IEEE International Conference on Computer and Information Technology. 104–109.
[40]
Zhenlong Yuan, Jie Xu, Yibo Xue, and Mihaela Van der Schaar. 2016. Bits learning: User-adjustable privacy versus accuracy in internet traffic classification. IEEE Communications Letters 20, 4 (2016), 704–707.
[41]
Zhenlong Yuan, Yibo Xue, and Mihaela van der Schaar. 2015. BitMiner: Bits mining in internet traffic classification. In Proceedings of the 37th ACM Conference on Special Interest Group on Data Communication. 93–94.
[42]
Xiaochun Yun, Yipeng Wang, Yongzheng Zhang, and Yu Zhou. 2015. A semantics-aware approach to the automated network protocol identification. IEEE/ACM Transactions on Networking 24, 1 (2015), 583–595.
[43]
Jun Zhang, Chao Chen, Yang Xiang, Wanlei Zhou, and Yong Xiang. 2013. Internet traffic classification by aggregating correlated naive bayes predictions. IEEE Transactions on Information Forensics and Security 8, 1 (2013), 5–15.
[44]
Jun Zhang, Yang Xiang, Yu Wang, Wanlei Zhou, Yong Xiang, and Yong Guan. 2013. Network traffic classification using correlation information. IEEE Transactions on Parallel and Distributed Systems 24, 1 (2013), 104–117.
[45]
Jun Zhang, Yang Xiang, Wanlei Zhou, and Yu Wang. 2013. Unsupervised traffic classification using flow statistical properties and IP packet payload. Journal of Computer and System Sciences 79, 5 (2013), 573–585.
[46]
Jingjing Zhao, Xuyang Jing, Zheng Yan, and Witold Pedrycz. 2021. Network traffic classification for data fusion: A survey. Information Fusion 72, 1 (2021), 22–47.

Published In

ACM Transactions on Privacy and Security  Volume 27, Issue 1
February 2024
369 pages
EISSN:2471-2574
DOI:10.1145/3613489

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 10 January 2024
Online AM: 22 November 2023
Accepted: 14 November 2023
Revised: 19 September 2023
Received: 05 December 2022
Published in TOPS Volume 27, Issue 1

Author Tags

  1. Application layer protocols
  2. network traces
  3. bit level encoding
  4. crown graph
  5. ladder graph

Qualifiers

  • Research-article

Funding Sources

  • India. SERB
