
Open access

OptiClass: An Optimized Classifier for Application Layer Protocols Using Bit Level Signatures

Authors:

Mayank Swarnkar,

Neha Sharma

ACM Transactions on Privacy and Security, Volume 27, Issue 1

Article No.: 6, Pages 1 - 23

https://doi.org/10.1145/3633777

Published: 10 January 2024


Abstract

Network traffic classification has many applications, such as security monitoring, quality of service, and traffic engineering. For these applications, Deep Packet Inspection (DPI) is a popular traffic classification technique because it scrutinizes the payload and provides comprehensive information for accurate analysis of network traffic. However, DPI-based methods reduce network performance because they are computationally expensive, and they compromise end-user privacy because they analyze the payload. To overcome these challenges, bit-level signatures are widely used to perform network traffic classification. However, most of these methods still leave room for performance improvement because they match an unknown payload against the application signatures one by one. Moreover, these methods slow down as the number of application signatures increases. To fill this gap, we propose OptiClass, an optimized classifier for application protocols using bit-level signatures. OptiClass matches all application signatures against an unknown flow in parallel, which results in faster, more accurate, and more efficient network traffic classification. OptiClass improves on the state-of-the-art methods in two ways. First, OptiClass generates bit-level signatures of just 32 bits for all the applications. This keeps OptiClass fast and privacy-preserving. Second, OptiClass uses a novel data structure called BiTSPLITTER for fast and accurate signature matching. We evaluated the performance of OptiClass on three datasets consisting of twenty application protocols. Experimental results show that OptiClass has an average recall, precision, and F1-score of 97.36%, 97.38%, and 97.37%, respectively, and classifies on average 9.08 times faster than five closely related state-of-the-art methods.

1 Introduction

Network traffic classification is an essential task for Internet Service Providers (ISPs), network administrators, researchers, and security architects to identify the type of applications flowing in the network and to analyze the protocols of different network packets for the efficient management of a network [31]. Network traffic classification serves various network management applications, such as providing quality of service, network security, trend analysis, fault diagnosis, and anomaly detection. Traditionally, the port-based method was used to identify network traffic applications. This method was useful for identifying applications whose port numbers are registered with the Internet Assigned Numbers Authority (IANA) [9]. The port-based method works well with fixed/standard port numbers [24] and can detect application protocols running on standard ports, such as HTTP, FTP, SMTP, DNS, POP, NTP, SSH, and Telnet. However, this technique fails with peer-to-peer (P2P) application protocols and proprietary protocols that use dynamic port numbers, such as Skype [12], eDonkey [1], eMule [2], and BitTorrent [3], which are not assigned, controlled, or registered by IANA. Moreover, P2P-based applications like Tor [4] use tunneling to bypass traffic monitoring, making the port-based method unreliable and obsolete for network traffic classification. To address these issues, statistics-based methods [11, 25, 30, 36, 45], correlation-based methods [13, 14, 34, 43, 44], behavior-based methods [22, 29], and DPI-based methods [16, 17, 18, 41, 42] are used for traffic classification. Statistics-based methods rely on statistical features, including the number of packets and the minimum, maximum, and mean packet size. This type of method protects the user’s privacy but involves many redundant features. 
Correlation-based methods classify network traffic by finding the correlation between network flows using five-tuple information: source IP address, destination IP address, source port number, destination port number, and transport layer protocol. This type of method avoids feature redundancy but still has high computational overhead. Another perspective on traffic classification is offered by behavior-based methods, which classify traffic by looking at the behavior of the end devices, such as the IP address, protocol, and port they use for communication. This type of method has high classification accuracy, but the classification results are not fine-grained [46], and the methods do not scale to many end devices. To overcome these challenges, DPI methods have been proposed that analyze the payload for detailed and accurate information. DPI-based methods are popular in traffic classification because of their high classification accuracy. DPI-based methods identify applications by generating signatures from the payload. A major challenge faced by DPI-based methods is to capture a unique signature for each application with minimal signature length. This is a primary requirement to reduce computational overhead. Generating signatures from byte-level information is traditionally a mostly manual effort that is error-prone. A few works in the literature propose automatic signature generation [20, 27, 33, 37, 39]. However, they focus on extracting byte-level information from the payload content of a packet. Byte-level information seems outdated nowadays because many application protocols use data formats that operate at the bit level [18, 40, 41]. Moreover, the available network traffic classification methods that use bit-level signatures perform linear signature matching, and their performance degrades as the number of application signatures increases. 
To fill this gap, we propose OptiClass, which generates bit-level signatures to identify application protocols with parallel and accurate signature matches. We summarize our contribution in this article as follows:

– We propose OptiClass, an optimized classifier that uses an application-specific DPI-based method for classifying application layer protocols using bit-level signatures. The classifier matches all available application signatures against an unknown application flow simultaneously. This makes OptiClass highly scalable.

– OptiClass uses a novel data structure called BiTSPLITTER for accurate and parallel signature matching. BiTSPLITTER is created by inheriting the properties of the well-known CROWN graph and LADDER graph.

– OptiClass is computationally inexpensive and less intrusive to user privacy, as it generates signatures from only the first ‘n’ bits of bidirectional flows. Our experiments showed that application signatures of only 32 bits are sufficient for accurate classification.

– We performed extensive experiments on three different datasets consisting of text, binary, and proprietary application protocols. OptiClass achieved an average recall, precision, and F1-score of 97.36%, 97.38%, and 97.37%, respectively, and classified on average 9.08 times faster than five closely related state-of-the-art methods.

We organize the remainder of this article as follows. In Section 2, we describe related work. In Section 3, we explain our proposed method OptiClass. We then perform the complexity analysis of each module of OptiClass in Section 4. We show the experiments performed on OptiClass and we give the results in Section 5. We end this article with the conclusion and future direction in Section 6.

2 Related Work

In this section, we discuss state-of-the-art network traffic classification methods. These methods mainly fall into two categories: Shallow Packet Inspection (SPI) based and Deep Packet Inspection (DPI) based methods. These methods are discussed as follows:

2.1 Shallow Packet Inspection-based Methods

SPI-based methods are lightweight inspection methods that screen the header of the network packets for classifying network traffic. Karagiannis et al. [21] developed a method to classify peer-to-peer network traffic using two packet-level features, IP-TCP/IP-UDP pair and IP-Port pair. However, this method is limited to peer-to-peer applications only. Lin et al. [23] utilized the packet size distribution associated with each transport layer connection. Each application has a distinct distribution, which is combined with the port to accelerate the application classification. However, only a limited set of applications was tested in the experiments performed on this method. Hence, the scalability of the method is unclear. Grimaudo et al. [15] proposed a method that uses 40 transport layer features to train a hierarchical classifier for application classification. However, the experimental evaluation only considers applications with TCP flows and no applications with UDP flows. Zhang et al. [43] proposed an unsupervised classifier for traffic classification that utilizes flow-based statistical features, such as the number of packets per flow and the volume of a flow, along with correlation information between flows, such as source-destination IP pairs and port pairs. This method has high time complexity because finding correlated flows takes a long prediction time. Zhang et al. [44] proposed a supervised classification method that uses statistical correlation information between network flows for traffic classification. However, the method takes significant classification time because it combines different variants of Nearest Neighbor algorithms (AVG-NN, MIN-NN, and MVT-NN) for classification. Divakaran et al. [14] implemented semi-supervised classification based on correlation information between the network packets of dynamic network traffic. 
However, significant classification accuracy is attained only when the classifier is trained with a large amount of training data. This may not be possible with dynamic network traffic, which is prevalent nowadays. Wang et al. [36] utilized the distributed Spark platform to achieve parallel optimization of Convolutional Neural Networks for real-time network traffic classification. Experimental results show high classification accuracy on the datasets. However, the datasets are balanced, which does not hold for real-time network traffic.

2.2 Deep Packet Inspection-based Methods

Deep Packet Inspection-based methods examine a packet’s content (payload) for more detailed information to make more accurate network traffic classifications than SPI-based methods. However, the DPI-based methods are slower than the SPI-based methods. Wang et al. [38] proposed ProDecoder for the automatic generation of the protocol message format, without prior knowledge of the protocol specifications, based on n-grams with the same semantics (such as the relationship among multiple common byte sequences) in protocol network traces. However, ProDecoder will not work on binary application protocols, as these protocols operate at the bit level. Yun et al. [42] proposed SECURITAS, which uses the packet’s payload to generate n-grams that are then combined into keywords, based on the latent relationship between n-grams, to identify each application protocol in mixed network traffic. However, this method employs Gibbs sampling to identify keywords from n-grams, which is computationally expensive. Tongaonkar et al. [33] presented SANTaCLASS, which automatically extracts keywords from the packets’ payload based on their occurrence frequency in payloads. These keywords are arranged in order of occurrence to generate the application signatures, which are then used in classification. However, the method is suitable only for text-based application protocols. Swarnkar et al. [32] proposed RDClass, which is based on Relative Distance Constraint Counting Automata (RDCCA). It accepts a set of keywords and their relative distances, extracted from the payload in an encoded format, to identify unknown applications and classify network flows. However, this method also classifies only text-based protocols. Yuan et al. [41] proposed BitMiner, a bit-level classifier that utilizes the correlation between bit values and bit positions in each network flow for the automatic generation of signatures. The experimental evaluation covers only UDP traffic of six protocols. 
Hence, the effectiveness of the method on other types of traffic is unknown. Yuan et al. [40] proposed BitsLearning, which extended the same idea by utilizing machine learning algorithms. This method uses features such as packet sizes (PS) and flow ports along with bit values at their respective positions for traffic classification. However, experiments on TCP-based network traffic were not reported in their article. Hubballi et al. [18] proposed BitCoding, which is a bit-level classification method. The method uses Transition Constraint Counting Automata (TCCA) as an application signature that takes the encoded format of n-bit signatures as input for bit-level matching of signatures with unknown traffic flows. As BitCoding only uses invariant bits as part of a signature, the remaining bits play no role in the classification. Hence, if a signature contains a large portion of variant bits, then ignoring that complete portion causes information loss. Hubballi et al. [19] proposed BitProb, which constructs a space-efficient state transition machine that uses probabilistic bit-level signatures for classifying applications. BitProb uses a threshold value to decide whether a new test flow belongs to a particular application. If the threshold is set too high, the new test flow will not cross it and will remain unclassified, and if the threshold is set too low, the new test flow may be classified into another class. Hence, choosing the threshold is critical. Hubballi et al. [16] proposed KeyClass, which searches for keywords in the payload quickly by skipping certain parts of the payload for the quick identification of applications. The authors designed a new finite state machine based on the Aho-Corasick algorithm [10] to achieve this. KeyClass works with bytes. Hence, it is incompatible with binary application protocols. 
Recent applications, including intelligent grid networks [28], industrial applications [26], and intrusion detection systems [35], indicate the importance of DPI in the future and underline the need for designing robust methods for network traffic classification.

3 Proposed Method

In this section, we describe our proposed method OptiClass, which is an optimized framework for accurately classifying application protocols through bit-level signatures. The proposed method is a lightweight DPI-based method that uses signatures extracted from bit sequences of the payload of network flows for classifying applications. The architecture diagram of OptiClass is shown in Figure 1. We can see from Figure 1 that the OptiClass has two phases: The training and testing phases. These two phases are explained in the following two subsections.

Fig. 1.

3.1 Training Phase

In this phase, OptiClass generates the bit-level signatures of each application protocol from the network traces. After that, all the generated bit signatures are inserted in a novel data structure called BiTSPLITTER, which is then used for efficient, accurate, and parallel signature matching of test flows against the signatures. This phase consists of three modules: Traffic Preprocessing, Bit Signature Generation, and BiTSPLITTER Creation. Each of the modules is explained below.

3.1.1 Traffic Preprocessing.

The input to this module is the network traffic traces of applications, and the output is the bidirectional binary flows reconstructed from the packets of the network traces. A network flow is a series of packets exchanged between two hosts. A flow is identified by a unique five-tuple: source IP address, destination IP address, source port number, destination port number, and transport layer protocol. The two transport layer protocols considered here are TCP and UDP. Therefore, flows are also of two types: TCP flows and UDP flows. A TCP flow is reconstructed from all the packets exchanged between two hosts, say Host A and Host B, as a bidirectional flow sequence. This sequence starts with TCP connection establishment via a three-way handshake and closes with a FIN/ACK exchange. All packets exchanged between the establishment of a TCP connection and its termination become part of that flow. The TCP connection and disconnection sequences between Host A and Host B are as follows:

– TCP Connection:

(1) A → [SYN] → B

(2) A ← [SYN/ACK] ← B

(3) A → [ACK] → B

– TCP Disconnection:

(1) A → [FIN] → B or A ← [FIN] ← B

(2) A ← [ACK] ← B or A → [ACK] → B

On the other hand, UDP is a connectionless protocol, and flows are constructed using timing information. A UDP flow consists of all the packets exchanged between two hosts whose inter-packet time does not exceed a user-defined threshold \(\delta\). Packets belong to the same flow when they share the same tuple (source IP address, destination IP address, source port number, and destination port number) and satisfy this timing constraint.
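The UDP flow reconstruction described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the packet tuple layout and the helper names `flow_key`, `reconstruct_udp_flows`, and `first_n_bits` are assumptions introduced here, and packets are assumed to arrive sorted by timestamp.

```python
def flow_key(src_ip, src_port, dst_ip, dst_port, proto):
    """Direction-independent key so both directions map to one bidirectional flow."""
    a, b = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    return (a, b, proto)


def reconstruct_udp_flows(packets, delta):
    """Group UDP packets into bidirectional flows (hypothetical helper).

    packets: iterable of (timestamp, src_ip, src_port, dst_ip, dst_port, payload),
             assumed sorted by timestamp; payload is a bytes object.
    delta:   maximum inter-packet gap; a larger gap starts a new flow.
    Returns a list of flows, each a list of payloads in arrival order.
    """
    flows = []
    open_flows = {}  # key -> (index into flows, timestamp of last packet)
    for ts, sip, sport, dip, dport, payload in packets:
        key = flow_key(sip, sport, dip, dport, "UDP")
        entry = open_flows.get(key)
        if entry is None or ts - entry[1] > delta:
            flows.append([])          # first packet, or gap exceeded delta
            idx = len(flows) - 1
        else:
            idx = entry[0]
        flows[idx].append(payload)
        open_flows[key] = (idx, ts)
    return flows


def first_n_bits(payloads, n):
    """Concatenate the payloads of a flow and return its first n bits as a string."""
    data = b"".join(payloads)
    bits = "".join(f"{byte:08b}" for byte in data)
    return bits[:n]
```

A TCP variant would instead open a flow on the SYN handshake and close it on the FIN/ACK exchange; it is omitted here to keep the sketch short.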

3.1.2 Bit Signature Generation.

The input to this module is the reconstructed flows of an application protocol, and the output is the generated binary signature for the same application. This makes OptiClass a supervised method that takes network traces of known applications to generate its bit-level signature. To keep OptiClass lightweight and privacy-preserving, the method generates the signature of an application by using only the first “n” bits from the payload of each of the “f” bidirectional binary flows extracted from the network trace of application “A”. Therefore, the signature of the application is also of length “n”. Application signature generation happens under the following conditions:

(1) The \(i^{th}\) bit of the signature has the value “1” only if the bits at the \(i^{th}\) position of every flow are “1”.

(2) The \(i^{th}\) bit of the signature has the value “0” only if the bits at the \(i^{th}\) position of every flow are “0”.

(3) The \(i^{th}\) bit of the signature has the value asterisk “\(*\)” if the bits at the \(i^{th}\) position vary across flows.

An example of an application signature of length “10” generated from three bidirectional binary flows of the same application with each flow of length “10” is shown in Figure 2.

Fig. 2.

We can see from Figure 2 that the bits at positions \(B_2\), \(B_3\), and \(B_6\) in each flow are “0”; therefore, the bit values at the corresponding positions of the signature are also “0”. Similarly, the bits at positions \(B_5\), \(B_7\), \(B_8\), and \(B_9\) in all the application flows are always “1”, and hence the bit values at the corresponding positions of the signature are also “1”. However, the bits at the remaining positions, i.e., \(B_1\), \(B_4\), and \(B_{10}\), vary across flows; therefore, these positions are represented by “\(*\)”. The network trace of an application is likely to contain a small number of broken flows due to packet re-transmission, network congestion, bit errors, and so on, which may result in incorrect signature generation. To overcome this problem, we use a threshold of 0.9. In other words, if the bit at position “i” has the same value in at least 90 out of 100 flows, then the signature takes that bit value at the “\(i^{th}\)” position.
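The three signature rules and the 0.9 threshold can be sketched as below. The function name `generate_signature` and the decision to skip flows shorter than “n” bits are assumptions made for this illustration; the three example flows in the test are hypothetical ones chosen to reproduce the signature pattern described for Figure 2, not the flows drawn in the figure.

```python
def generate_signature(flows, n, threshold=0.9):
    """Build an n-symbol signature over {"0", "1", "*"} from binary flow strings.

    flows:     list of strings of "0"/"1" characters (one per bidirectional flow).
    n:         signature length; only the first n bits of each flow are used.
    threshold: minimum fraction of flows that must agree on a bit value.
    Flows shorter than n bits are ignored (an assumption of this sketch).
    """
    usable = [flow[:n] for flow in flows if len(flow) >= n]
    if not usable:
        raise ValueError("no flow has at least n bits")
    total = len(usable)
    signature = []
    for i in range(n):
        ones = sum(1 for flow in usable if flow[i] == "1")
        zeros = total - ones
        if ones / total >= threshold:
            signature.append("1")      # position is (almost) always 1
        elif zeros / total >= threshold:
            signature.append("0")      # position is (almost) always 0
        else:
            signature.append("*")      # varying bit
    return "".join(signature)
```

With only three flows, a 0.9 threshold requires all three to agree, so the result coincides with the strict rules (1) to (3); with larger traces, up to 10% of disagreeing (for example, broken) flows are tolerated.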

3.1.3 BiTSPLITTER Creation.

The input to this module is the generated signatures of each application, and the output is the generated BiTSPLITTER. Unlike most methods, which perform linear signature matching with a test flow, BiTSPLITTER performs parallel signature matching. BiTSPLITTER is a novel data structure created by inheriting the properties of the CROWN graph [5] and the LADDER graph [6]. BiTSPLITTER is formally defined as a weighted, directed, and binary-valued acyclic graph G(\(V, E, W\)) such that:

– V represents the two sets of vertices {\(x,y\)} such that \(x=\){\(x_{1}, x_{2}, \dots , x_{n}\)} and \(y=\) {\(y_{1}, y_{2}, \dots , y_{n}\)}, where the value at each vertex \(x_i \in x\) is 0 and the value at each vertex \(y_i \in y\) is 1.

– Edge \(e_{ij} \in E\) is a directed edge either from \(x_i \rightarrow x_j\) or \(x_i \rightarrow y_j\) or \(y_i \rightarrow x_j\) or \(y_i \rightarrow y_j\), where \(j=i+1\).

– The root vertex is the start vertex of the BiTSPLITTER such that set x is present to the left and set y is present to the right of the root.

– \(w_{ij} \in W\) is the weight on each directed edge \(e_{ij}\), which is initially an empty set {\(\phi\)}. \(w_{ij}\) is updated on each edge based on the application signatures generated in the previous step. For an application signature \(S_1\), which is a sequence of n symbols from {0,1,*}, the weights on the edges are updated as follows:

– The BiTSPLITTER is traversed from the root, and the weight \(w_{ij}\) on each traversed edge \(e_{ij}\) is updated with \(S_1\) according to the bit value of the signature at the corresponding position, starting with the first position. If the bit value of \(S_1\) at the first position is “1”, then \(w_{root,y_1}=\lbrace S_1\rbrace\). Similarly, if the bit value of \(S_1\) at the first position is “0”, then \(w_{root,x_1}=\lbrace S_1\rbrace\). However, if the bit value of \(S_1\) at the first position is “*”, then \(w_{root,x_1}=\lbrace S_1\rbrace\) as well as \(w_{root,y_1}=\lbrace S_1\rbrace\). In the same way, the weights on the other edges are updated based on the bit values at the respective positions.

– When another application signature \(S_2\) is introduced, the BiTSPLITTER is traversed and the weights are updated in the same way as for \(S_1\). However, if \(S_2\) needs to update an edge \(e_{ij}\) whose weight is already \(w_{ij}=\lbrace S_1\rbrace\) rather than \(w_{ij}=\phi\), then the weight is updated to \(w_{ij}=\lbrace S_1, S_2\rbrace\).

The algorithm to generate the BiTSPLITTER is shown in Algorithm 1. The generation process takes the application signatures {\(S_{1}, S_{2}, \dots , S_{k}\)} as input and outputs a BiTSPLITTER. Initially, an empty BiTSPLITTER is generated with a root, “n” vertices on the left, and “n” vertices on the right, where “n” is the length of the application signatures. All directed edges are connected as per the constraints discussed above, and the weight on each edge is initialized to {\(\phi\)}. Subsequently, each bit of a signature is mapped to vertices of the BiTSPLITTER such that the weight on the corresponding edge is updated with the label of the signature to which the bit belongs. If a bit of a signature is “0”, then the directed edge runs from the root to \(x_{1}\), from \(x_{i}\) to \(x_{i+1}\), or from \(y_{i}\) to \(x_{i+1}\). Similarly, if a bit of a signature is “1”, then the directed edge runs from the root to \(y_{1}\), from \(y_{i}\) to \(y_{i+1}\), or from \(x_{i}\) to \(y_{i+1}\). Finally, if a bit is “*”, then there are directed edges from the root to both \(x_{1}\) and \(y_{1}\), or from the current vertex (\(x_{i}\) or \(y_{i}\)) to both \(x_{i+1}\) and \(y_{i+1}\). For each bit, the weight on the corresponding directed edge is updated with the signature label \(S_{k}\).

Let us take an example to understand the BiTSPLITTER generation from three sample signatures, each of length three: \({S_{1}=101}\), \({S_{2}=*11}\), and \({S_{3}=111}\). This example is shown in Figure 3. Since the signature length is 3, a BiTSPLITTER is initially generated with all weights empty, denoted by {\(\phi\)}. We first insert the signature \(S_1\)=“101” into the BiTSPLITTER, as shown in Figure 3(a). The first bit of \(S_1\) is “1”, which is inserted on the edge between the root and \(y_1\) (the right vertex at level 1 of the BiTSPLITTER), and the weight \(w_{root,y_1}\) is updated to {\(S_1\)}. The second bit of \(S_1\) is “0”, which is inserted on the edge between \(y_1\) (the current vertex) and \(x_2\) (the left vertex at level 2 of the BiTSPLITTER), and the weight \(w_{y_1x_2}\) is updated to {\(S_1\)}. Then, the third bit of \(S_1\), which is “1”, is inserted in the same way. Next, we insert the signature \(S_2\)=“*11”, as shown in Figure 3(b). The first bit of \(S_2\) is “*”, which is a varying bit, and hence both weights \(w_{root,x_1}\) and \(w_{root,y_1}\) are updated with \(S_2\), becoming {\(S_2\)} and {\(S_1\), \(S_2\)}, respectively. Similarly, the weights for the other two bits, “1” and “1”, are updated with \(S_2\) in the BiTSPLITTER. Finally, the third application signature \(S_3\) is inserted into the BiTSPLITTER in the same way as \(S_1\) and \(S_2\). This is shown in Figure 3(c).

Fig. 3.
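The construction can be sketched in a few lines. This is an illustrative reading of the definition above, not the authors' Algorithm 1: the edge weights are stored in a Python dictionary keyed by `(source, destination)` vertex names, and the function name `build_bitsplitter` and the string vertex names (`"root"`, `"x1"`, `"y1"`, ...) are assumptions of this sketch.

```python
def build_bitsplitter(signatures):
    """Return edge weights {(src, dst): set_of_labels} for equal-length signatures.

    signatures: dict mapping a label (e.g. "S1") to a string over {"0", "1", "*"}.
    Vertices are named "root", "x1".."xn" (value 0) and "y1".."yn" (value 1).
    """
    lengths = {len(sig) for sig in signatures.values()}
    if len(lengths) != 1:
        raise ValueError("all signatures must have the same length")
    n = lengths.pop()

    # Empty structure: every allowed edge starts with an empty weight set.
    weights = {("root", "x1"): set(), ("root", "y1"): set()}
    for level in range(1, n):
        for src_side in "xy":
            for dst_side in "xy":
                weights[(f"{src_side}{level}", f"{dst_side}{level + 1}")] = set()

    # A "0" selects the x side, a "1" the y side, and "*" both sides.
    sides = {"0": "x", "1": "y", "*": "xy"}
    for label, sig in signatures.items():
        current = ["root"]
        for level, symbol in enumerate(sig, start=1):
            targets = [f"{side}{level}" for side in sides[symbol]]
            for src in current:
                for dst in targets:
                    weights[(src, dst)].add(label)   # add label to the edge weight
            current = targets
    return weights
```

For signatures of length n, the structure holds 2 + 4(n - 1) edges regardless of how many signatures are inserted; only the label sets on the edges grow.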

3.2 Testing Phase

The testing phase of OptiClass consists of two modules: the Traffic Preprocessing Module and the Flow Classification Module. The traffic preprocessing module of the testing phase is the same as that of the training phase. The flow classification module is explained as follows:

3.2.1 Flow Classification Module.

The input to this module is the BiTSPLITTER obtained from the training phase and test flows received from the flow reconstruction module. A test flow is matched against the BiTSPLITTER to identify the application to which this flow belongs. To perform efficient and parallel matching of one test flow against all signatures, flow classification of new test flows using BiTSPLITTER \(G(V, E, W)\) is explained as follows:

– Each test flow “f” is passed into the BiTSPLITTER at the root “R”. When a bit of the test flow is “0” or “1”, we traverse to the left or right vertex of the next level, respectively.

– Initially, we have a universal set “U” that consists of all signature labels {\(S_{1}, S_{2}, \dots , S_{k}\)}.

– With each traversed bit, we perform an intersection with the weight \(w_{ij}\) associated with the traversed edge \(e_{ij}\).

– For the first bit of a test flow “f”, the first vertex of the BiTSPLITTER is visited, and the result is the intersection of “U” and the weight \(w_{ij}\) of the traversed edge \(e_{ij}\).

– For each subsequent bit of “f”, the result is the intersection of the weight \(w_{ij}\) of the traversed edge \(e_{ij}\) and the result computed in the previous step.

– These steps are repeated until we reach the last vertex of the BiTSPLITTER. We are left with a final set of signature labels at the end vertex.

– If the final set is {\(\phi\)}, then “f” is unclassified. If the final set contains exactly one signature label, then “f” is classified (or misclassified, if that label is not the true application of the flow). If the final set contains more than one signature label, then “f” is undecided.

Let us take an example to understand the flow classification using the BiTSPLITTER created in Figure 3. We take three test flows, “011”, “111”, and “000”, in Figure 4. Each test flow is passed into the BiTSPLITTER and labeled as classified, undecided, misclassified, or unclassified. Initially, we have a universal set “U” that consists of all three signature labels such that U = {\(S_{1}\), \(S_{2}\), \(S_{3}\)}. The first test flow “011” is passed into the BiTSPLITTER from the root, as shown in Figure 4(a). The first bit of this test flow is “0” and \(w_{root,x_1}\) is {\(S_{2}\)} such that U \(\cap\) {\(S_{2}\)} = {\(S_{2}\)}, the second bit is “1” and \(w_{x_1,y_2}\) is {\(S_{2}\)} such that {\(S_{2}\)} \(\cap\) {\(S_{2}\)} = {\(S_{2}\)}, and the third bit is “1” and \(w_{y_2,y_3}\) is {\(S_{2}\), \(S_{3}\)} such that {\(S_{2}\)} \(\cap\) {\(S_{2}\), \(S_{3}\)} = {\(S_{2}\)}. Since the intersection set consists of the single signature label {\(S_{2}\)}, test flow “011” belongs to signature \(S_{2}\), and hence this is a case of classification. Similarly, the second test flow “111” is passed into the BiTSPLITTER, as shown in Figure 4(b). The first bit of this test flow is “1” and \(w_{root,y_1}\) is {\(S_{1}\), \(S_{2}\), \(S_{3}\)} such that U \(\cap\) {\(S_{1}\), \(S_{2}\), \(S_{3}\)} = {\(S_{1}\), \(S_{2}\), \(S_{3}\)}, the second bit is “1” and \(w_{y_1,y_2}\) is {\(S_{2}\), \(S_{3}\)} such that {\(S_{1}\), \(S_{2}\), \(S_{3}\)} \(\cap\) {\(S_{2}\), \(S_{3}\)} = {\(S_{2}\), \(S_{3}\)}, and the third bit is “1” and \(w_{y_2,y_3}\) is {\(S_{2}\), \(S_{3}\)} such that {\(S_{2}\), \(S_{3}\)} \(\cap\) {\(S_{2}\), \(S_{3}\)} = {\(S_{2}\), \(S_{3}\)}. Since the intersection set consists of the two signature labels {\(S_{2}\), \(S_{3}\)}, test flow “111” matches both \(S_{2}\) and \(S_{3}\), and hence this is a case of indecision or ambiguity. If the set instead consisted of a single signature label, such as {\(S_{3}\)}, that is not the true application of the flow, this would be a case of misclassification. 
Finally, the third test flow “000” is passed into the BiTSPLITTER, as shown in Figure 4(c). The first bit of this test flow is “0” and \(w_{root,x_1}\) is {\(S_{2}\)} such that U \(\cap\) {\(S_{2}\)} = {\(S_{2}\)}, the second bit is “0” and \(w_{x_1,x_2}\) is {\(\phi\)} such that {\(S_{2}\)} \(\cap\) {\(\phi\)} = {\(\phi\)}, and the third bit is “0” and \(w_{x_2,x_3}\) is {\(\phi\)} such that {\(\phi\)} \(\cap\) {\(\phi\)} = {\(\phi\)}. Since the intersection set is {\(\phi\)}, test flow “000” does not belong to any signature; hence, this flow remains unclassified.

Fig. 4.

The flow classification process takes as input the BiTSPLITTER, the universal set U of known signature labels (for example, {\(S_{1}\), \(S_{2}\), \(S_{3}\)}), an empty SignatureSet, and the test flow “f” that needs to be classified. If a bit of the test flow is “0” and the current vertex is the root, then the weight \(w_{ij}\) of the directed edge from the current vertex to its left vertex is intersected with U, and U is updated with the intersection result. For the next vertex, the updated U is again intersected with the \(w_{ij}\) of the traversed edge. If a bit of the test flow is “1”, the same process is repeated toward the right of the current vertex. The final result is stored in SignatureSet{}. If this set is {\(\phi\)}, then “f” remains unclassified. Otherwise, “f” is either classified or misclassified. When SignatureSet{} consists of only one label, “f” is classified. However, if SignatureSet{} consists of more than one label, “f” is ambiguous and is treated as misclassified. The step-by-step process is shown in Algorithm 2.
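The traversal can be sketched as a running set intersection. The code below is an illustration rather than the authors' Algorithm 2: the edge weights of the Figure 3(c) example are written out by hand from the values given in the text, the names `WEIGHTS`, `UNIVERSE`, and `classify` are assumptions, and the early exit on an empty set is a small shortcut added here. A final set with several labels is reported as "undecided", following the worked example.

```python
# Edge weights of the example BiTSPLITTER (S1=101, S2=*11, S3=111).
WEIGHTS = {
    ("root", "x1"): {"S2"},
    ("root", "y1"): {"S1", "S2", "S3"},
    ("x1", "x2"): set(),
    ("x1", "y2"): {"S2"},
    ("y1", "x2"): {"S1"},
    ("y1", "y2"): {"S2", "S3"},
    ("x2", "x3"): set(),
    ("x2", "y3"): {"S1"},
    ("y2", "x3"): set(),
    ("y2", "y3"): {"S2", "S3"},
}
UNIVERSE = {"S1", "S2", "S3"}


def classify(flow_bits, weights, universe):
    """Match one test flow against all signatures at once.

    flow_bits: string of "0"/"1" with the same length as the signatures.
    Returns (outcome, labels), where outcome is "classified", "undecided",
    or "unclassified". Whether a single label is actually correct can only
    be judged against ground truth, which this sketch does not have.
    """
    candidates = set(universe)            # start from U
    current = "root"
    for level, bit in enumerate(flow_bits, start=1):
        nxt = ("x" if bit == "0" else "y") + str(level)   # left for 0, right for 1
        candidates &= weights.get((current, nxt), set())  # intersect with w_ij
        if not candidates:
            return "unclassified", set()  # no signature can match any more
        current = nxt
    if len(candidates) == 1:
        return "classified", candidates
    return "undecided", candidates
```

Each test flow costs one edge lookup and one set intersection per bit, independent of the order in which signatures were inserted.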

4 Complexity Analysis

This section discusses the asymptotic complexity of each module of OptiClass, which is summarized in Table 1. The first module is the traffic preprocessing module, which examines each packet header and adds the packet to the corresponding flow to reconstruct the flow. If there are “f” flows and each flow is reconstructed from “p” packets, then this module has \(O(f \times p)\) complexity. The next module is the signature generation module, where the application signature is generated from “f” application flows, each of “n” bits. Since the number of bits “n” is a constant, the complexity of this module is \(O(f)\). The next module is BiTSPLITTER creation, in which the BiTSPLITTER is generated from “m” application signatures, each of length “n”. Since n is a constant, the module’s complexity depends only on m and is \(O(m)\). In the flow classification module, each test flow of “n” bits is matched against the BiTSPLITTER generated from application signatures of length “n”. Because all signatures are compared simultaneously with each of the “n” bits of the “f” test flows, the complexity is \(O(f \times n)\).

Table 1.

Module	Complexity	Explanation
Traffic Preprocessing Module	O(\(f \times p\))	f is the total number of flows and p is the number of packets in a flow.
Signature Generation	O(f)	f is the total number of flows of an application A.
BiTSPLITTER Creation	O(m)	m is the number of application signatures to be inserted.
Flow Classification	O(\(f \times n\))	f is the number of test flows and n is the number of bits in each test flow.

Table 1. Module-Wise Complexity Analysis of OptiClass

5 Experiments and Results

This section describes the extensive experiments performed to assess the efficiency of OptiClass across various network scenarios. We first describe the datasets used in the experiments. Next, we run OptiClass on the datasets to evaluate its performance. We then perform a sensitivity analysis of OptiClass by varying its key parameters to identify how responsive the method is to them. Finally, we compare the performance of OptiClass with five closely related recent state-of-the-art methods.

5.1 Dataset Description

OptiClass experiments were conducted on three datasets. The first dataset was described in previous articles [17, 18]. The second and third datasets come from two publicly available sources: Digital Corpora [7] and the Swedish Defense Research Agency’s FOI Information Warfare Lab [8]. Together, the datasets cover twenty application protocols, including both text and binary protocols. A few of these protocols are proprietary, while the others are open. Table 2 lists the protocols used in the experiments and their types (text/binary, open/proprietary). In the subsequent sections, we refer to the first dataset as “Private” and to the second and third datasets as “Public-1” and “Public-2”, respectively. We split each dataset into two equal sets, a training set and a testing set, so that each set contains 50% of the flows of each application in that dataset. The training sets are used to generate the signature of each application protocol, and the testing sets are used to identify traffic flows. The training and testing divisions of Private, Public-1, and Public-2 are shown in Tables 3, 4, and 5, respectively.
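A per-protocol 50/50 split of this kind can be sketched as follows. The article does not state how flows were assigned to each half, so the random shuffle with a fixed seed is an assumption, and the function name `split_per_protocol` and the flow identifiers are hypothetical.

```python
import random
from typing import Dict, List, Tuple


def split_per_protocol(
    flows_by_protocol: Dict[str, List[str]], seed: int = 0
) -> Tuple[Dict[str, List[str]], Dict[str, List[str]]]:
    """Split the flows of every protocol into a training and a testing half.

    Assumption: flows are shuffled randomly before splitting; the source only
    says that each half holds 50% of the flows of each application.
    """
    rng = random.Random(seed)
    train: Dict[str, List[str]] = {}
    test: Dict[str, List[str]] = {}
    for protocol, flows in flows_by_protocol.items():
        shuffled = list(flows)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        train[protocol] = shuffled[:half]   # used for signature generation
        test[protocol] = shuffled[half:]    # used for flow identification
    return train, test


if __name__ == "__main__":
    toy = {"DNS": [f"dns-{i}" for i in range(6)],
           "SSH": [f"ssh-{i}" for i in range(7)]}
    train, test = split_per_protocol(toy)
    for name in toy:
        print(name, len(train[name]), len(test[name]))
```

Splitting protocol by protocol keeps every application represented in both halves, which matters for protocols with only a handful of flows.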

Table 2.

Abbreviation	Protocol	Type	Proprietary (Owner)
BACnet	Building Automation and Control network	Binary	ASHRAE
BitTorrent	Bit torrent protocol	Text	No
BJNP	Used to communicate with printer	Binary	Canon
Bootp	Bootstrap protocol	Binary	No
CUPS	Common Unix Printing System	Text	Apple Inc.
DNS	Domain Name System	Binary	No
Dropbox	Dropbox LAN Sync protocol	Text	Dropbox
GsmIp	GSM over Internet protocol	Text	No
HTTP	Hyper Text Transfer Protocol	Text	No
Kerberos	Kerberos protocol	Binary	No
MWBP	Microsoft Windows Browsing Protocol	Text	Microsoft
NBNS	NetBIOS Name Service	Binary	No
NBSS	NetBIOS Session Service	Binary	No
NTP	Network Time Protocol	Binary	No
POP	Post Office Protocol	Text	No
QUIC	Quick UDP Internet Connections	Binary	No
RPC	Remote Procedure Call	Binary	No
SIP	Session Initiation Protocol	Text	No
SMTP	Simple Mail Transfer Protocol	Text	No
SSH	Secure Shell	Binary	No

Table 2. Application Protocols used in the Experiments

Table 3.

Protocol	TCP/UDP	Training		Testing
		Flows	Size (MB)	Flows	Size (MB)
BitTorrent	TCP	00789	245.8	00791	150.4
DNS	UDP	32576	005.7	32762	005.7
Dropbox	UDP	01138	098.2	01128	153.4
HTTP	TCP	48834	220.4	48878	328.3
SIP	UDP	00609	194.1	00640	191.4
SMTP	TCP	00597	010.1	00608	022.9
SSH	TCP	01104	006.2	01106	006.2
Total	\(-\)	\(\mathbf {85647}\)	\(\mathbf {780.5}\)	\(\mathbf {85913}\)	\(\mathbf {858.3}\)

Table 3. Private Dataset Statistics

Table 4.

Protocol	TCP/UDP	Training		Testing
		Flows	Size (MB)	Flows	Size (MB)
BACnet	UDP	00009	000.097	00011	000.074
BJNP	UDP	00034	000.026	00038	000.031
Bootp	UDP	00086	004.400	00081	004.500
CUPS	UDP	00047	000.107	00045	000.218
DNS	UDP	25469	012.900	25850	011.100
Dropbox	UDP	00026	000.109	00025	000.319
HTTP	TCP	17964	151.100	17968	133.600
MWBP	UDP	00008	000.565	00007	000.574
NBNS	UDP	00982	007.800	00982	007.500
NTP	UDP	00201	000.652	00201	000.141
QUIC	UDP	00127	000.110	00093	000.115
SMTP	TCP	00520	010.100	00521	009.900
Total	\(-\)	\(\mathbf {45473}\)	\(\mathbf {187.996}\)	\(\mathbf {45822}\)	\(\mathbf {168.072}\)

Table 4. Public-1 Dataset Statistics

Table 5.

Protocol	TCP/UDP	Training		Testing
		Flows	Size (MB)	Flows	Size (MB)
Bootp	UDP	00091	00.080	00091	0.096
DNS	UDP	00963	00.865	00958	1.200
GsmIp	TCP	00009	00.007	00009	0.015
HTTP	TCP	00257	04.800	00253	9.000
Kerberos	UDP	00669	01.600	00672	1.900
NBNS	UDP	00290	00.853	00289	0.680
NBSS	TCP	00377	02.700	00373	3.900
NTP	UDP	00202	00.145	00200	0.648
POP	TCP	00056	00.035	00057	0.036
RPC	TCP	00007	00.020	00007	0.141
Total	\(-\)	\(\mathbf {2921}\)	\(\mathbf {11.105}\)	\(\mathbf {2909}\)	\(\mathbf {17.616}\)

Table 5. Public-2 Dataset Statistics

5.2 Evaluation

OptiClass is built in the Java programming language with the JnetPcap packet parsing library and can generate bidirectional flows from network traces. We conducted three kinds of experiments for an in-depth evaluation of OptiClass: homogeneous, heterogeneous, and grand experiments. In homogeneous experiments, the training and testing parts are taken from the same dataset. These experiments check the efficiency of OptiClass for one site at a time. In heterogeneous experiments, the training part is taken from one dataset and the testing parts are taken from the other two datasets. These experiments evaluate OptiClass for site independence, i.e., whether OptiClass trained on one site can perform on another site. In the grand experiment, all three datasets are combined for training and testing. This experiment checks whether training OptiClass on data from multiple locations makes it more robust and accurate for classification. We report the performance of OptiClass for all three kinds of experiments using Recall, Precision, and F1-Score, and we also show the confusion matrix for the grand experiment.

5.2.1 Homogeneous Experiments.

These experiments are done by selecting training and testing sets from the same dataset. Results of all homogeneous experiments are compiled in Table 6. Experiments on the Private dataset are shown in Table 6 a. This table shows that the recall rate of all application protocols except HTTP and SIP is 100%, which means that all testing flows except some HTTP and SIP flows are correctly matched with their respective signatures. The recall rates of HTTP and SIP are 90.95% and 97.96%, respectively, because some testing flows remained unmatched. The precision is 100% for HTTP, but not for all other protocols. The average F1-score of all protocols is 98.37%. Similarly, experiments on the Public-1 dataset are shown in Table 6 b. This table shows that the recall rate of most application protocols is 100%, the exceptions being BACnet, HTTP, NBNS, and NTP. NTP has a low recall rate because the signatures of NTP and QUIC are highly similar and the bit positions that differentiate them also have “*” values, resulting in misclassification. The average precision of the protocols other than MWBP and NTP is 90.52%. MWBP and NTP have lower precision because flows of other application protocols match their signatures. The average F1-score, excluding NTP, is 97.12%. Finally, experiments on the Public-2 dataset are shown in Table 6 c. This table shows that the recall of all applications except NBSS is more than 98%. The reason is that some testing flows of NBSS matched the signatures of Kerberos and NBNS because of the relatively high number of “*” in those signatures. The precision of all application protocols is 100% except Kerberos, because flows of other protocols match the Kerberos signature. The average F1-score is 98.62%.

Table 6.

5.2.2 Heterogeneous Experiments.

These experiments are done by selecting a training set from one dataset and testing sets from the other two datasets. The first heterogeneous experiment uses training data from the Private dataset and testing data from the Public-1 and Public-2 datasets. The recall for this experiment is shown in Table 7. Table 7 a shows that 100% recall is achieved for all applications except HTTP, some of whose flows are misclassified. Precision is 100% except for DNS, because flows of other protocols match the DNS signature owing to its relatively high number of “*”. The average F1-score is 93.67%. Similar results can be seen in Table 7 b. The second heterogeneous experiment uses training data from the Public-1 dataset and testing data from the Private and Public-2 datasets. The recall for this experiment is shown in Table 8. Table 8 a shows that more than 90% recall is obtained. Precision is 100% for all application protocols. The average F1-score is 99.16%. However, in Table 8 b, NBNS has a comparatively lower recall rate because the remaining flows of NBNS are misclassified as NTP. Precision is 100% except for DNS. The average F1-score is 94.04%. The third heterogeneous experiment uses training data from the Public-2 dataset and testing data from the Private and Public-1 datasets. The recall, precision, and F1-score for this experiment are shown in Table 9. In both Table 9 a and Table 9 b, recall is more than 93% for all protocols. In Table 9 a, precision is 100% for HTTP but not for DNS. The average F1-score is 99.22%. In Table 9 b, precision is 100% except for DNS and NBNS. The average F1-score is 94.04%.

Table 7.

Table 8.

Table 9.

5.2.3 Grand Experiment.

In this experiment, we combined the training data of all three datasets to create a BiTSPLITTER. Then, we tested the efficiency of this BiTSPLITTER on the testing data of the grand dataset, the Private dataset, the Public-1 dataset, and the Public-2 dataset. The recall, precision, and F1-score for each set of testing data are shown in Table 10. We obtained an overall improved recall rate, precision, and F1-Score compared to those in the homogeneous and heterogeneous experiments. On the grand testing data, we obtained an average recall, precision, and F1-score of 97.36%, 97.38%, and 97.37%, respectively. Thus, it can be deduced that the BiTSPLITTER generated from the training data of the grand dataset is more robust than those of the homogeneous and heterogeneous experiments. The flow-wise classification of the grand experiment’s testing data is shown as a confusion matrix in Table 11. The first column of the confusion matrix lists the application signatures, and the first row gives the number of testing flows of each protocol. The matrix shows the classification of each test flow when tested against all the available application signatures. In Table 11, the misclassified flows are shown in red. For example, BACnet has 11 testing flows, all of which match only the signature of BACnet, so there is no misclassification for BACnet test flows. Similarly, no misclassification is obtained for BitTorrent, BJNP, Bootp, CUPS, Dropbox, GsmIpa, Kerberos, MWBP, POP, QUIC, RPC, SMTP, and SSH. However, DNS has 59,577 testing flows, out of which 34, 2, 1, and 38 are misclassified as Kerberos, MWBP, NTP, and QUIC, respectively. Similarly, HTTP has 67,099 testing flows, out of which 101, 26, 4, 4, 1, 1, 8,864, and 67 are misclassified as CUPS, Kerberos, MWBP, NBNS, NBSS, NTP, QUIC, and SIP, respectively. A similar case appears for NBNS, NBSS, NTP, and SIP. Where more than one value in a column appears in red, the testing flows of that application protocol are misclassified with the signatures of more than one other application protocol.

Table 10.

Protocol	Grand Dataset			Private Dataset			Public-1 Dataset			Public-2 Dataset
	Recall	Precision	F1-Score	Recall	Precision	F1-Score	Recall	Precision	F1-Score	Recall	Precision	F1-Score
BACnet	100.00	100.00	100.00	\(-\)	\(-\)	\(-\)	100.00	100.00	100.00	\(-\)	\(-\)	\(-\)
BJNP	100.00	100.00	100.00	\(-\)	\(-\)	\(-\)	100.00	100.00	100.00	\(-\)	\(-\)	\(-\)
BitTorrent	100.00	100.00	100.00	100.00	100.00	100.00	\(-\)	\(-\)	\(-\)	\(-\)	\(-\)	\(-\)
Bootp	100.00	100.00	100.00	\(-\)	\(-\)	\(-\)	100.00	100.00	100.00	100.00	100.00	100.00
CUPS	100.00	100.00	100.00	\(-\)	\(-\)	\(-\)	100.00	095.74	097.82	\(-\)	\(-\)	\(-\)
DNS	099.87	099.93	099.89	099.77	099.96	099.86	099.99	099.53	099.75	099.89	092.37	095.98
Dropbox	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	\(-\)	\(-\)	\(-\)
GsmIp	100.00	100.00	100.00	\(-\)	\(-\)	\(-\)	\(-\)	\(-\)	\(-\)	100.00	100.00	100.00
HTTP	086.48	086.74	086.60	090.95	100.00	095.26	074.30	100.00	085.25	088.53	100.00	093.91
Kerberos	100.00	100.00	100.00	\(-\)	\(-\)	\(-\)	\(-\)	\(-\)	\(-\)	100.00	092.68	096.20
MWBP	100.00	100.00	100.00	\(-\)	\(-\)	\(-\)	100.00	087.50	093.33	\(-\)	\(-\)	\(-\)
NBNS	084.18	084.31	084.24	\(-\)	\(-\)	\(-\)	087.67	100.00	093.42	072.31	100.00	083.93
NBSS	085.79	085.79	085.79	\(-\)	\(-\)	\(-\)	\(-\)	\(-\)	\(-\)	085.79	100.00	092.35
NTP	097.25	097.25	097.25	\(-\)	\(-\)	\(-\)	094.52	100.00	097.18	100.00	100.00	100.00
POP	100.00	100.00	100.00	\(-\)	\(-\)	\(-\)	\(-\)	\(-\)	\(-\)	100.00	100.00	100.00
QUIC	100.00	100.00	100.00	\(-\)	\(-\)	\(-\)	100.00	068.88	081.57	\(-\)	\(-\)	\(-\)
RPC	100.00	100.00	100.00	\(-\)	\(-\)	\(-\)	\(-\)	\(-\)	\(-\)	100.00	100.00	100.00
SIP	093.75	093.75	093.75	084.37	099.44	091.28	\(-\)	\(-\)	\(-\)	\(-\)	\(-\)	\(-\)
SMTP	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	\(-\)	\(-\)	\(-\)
SSH	100.00	100.00	100.00	100.00	100.00	100.00	\(-\)	\(-\)	\(-\)	\(-\)	\(-\)	\(-\)

Table 10. Recall, Precision, and F1-Score for Grand Dataset

Table 11.

Protocols	BACnet	BitTorrent	BJNP	Bootp	CUPS	DNS	Dropbox	GsmIpa	HTTP	Kerberos	MWBP	NBNS	NBSS	NTP	POP	QUIC	RPC	SIP	SMTP	SSH
	(11)	(791)	(38)	(172)	(45)	(59577)	(1153)	(9)	(67099)	(672)	(7)	(1271)	(373)	(401)	(57)	(93)	(7)	(640)	(1129)	(1106)
BACnet	11	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0
BitTorrent	0	791	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0
BJNP	0	0	38	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0
Bootp	0	0	0	172	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0
CUPS	0	0	0	0	45	0	0	0	101	0	0	0	0	0	0	0	0	0	0	0
DNS	0	0	0	0	0	59502	0	0	0	0	0	199	0	0	0	0	0	0	0	0
Dropbox	0	0	0	0	0	0	1153	0	0	0	0	0	0	0	0	0	0	0	0	0
GsmIpa	0	0	0	0	0	0	0	9	0	0	0	0	0	0	0	0	0	0	0	0
HTTP	0	0	0	0	0	0	0	0	58031	0	0	0	0	0	0	0	0	0	0	0
Kerberos	0	0	0	0	0	34	0	0	26	672	0	0	53	0	0	0	0	0	0	0
MWBP	0	0	0	0	0	2	0	0	4	0	7	0	0	0	0	0	0	0	0	0
NBNS	0	0	0	0	0	0	0	0	4	0	0	1070	0	0	0	0	0	0	0	0
NBSS	0	0	0	0	0	0	0	0	1	0	0	0	320	0	0	0	0	0	0	0
NTP	0	0	0	0	0	1	0	0	1	0	0	0	0	390	0	0	0	0	0	0
POP	0	0	0	0	0	0	0	0	0	0	0	0	0	0	57	0	0	0	0	0
QUIC	0	0	0	0	0	38	0	0	8864	0	0	2	0	11	0	93	0	40	0	0
RPC	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	7	0	0	0
SIP	0	0	0	0	0	0	0	0	67	0	0	0	0	0	0	0	0	600	0	0
SMTP	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	1129	0
SSH	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	1106

Table 11. Confusion Matrix for Evaluation of Grand Dataset with 32 bits
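As a cross-check, the grand-dataset recall values in Table 10 can be recomputed from the columns of Table 11. The sketch below copies, for six protocols, the number of testing flows and the number matched to their own signature, and applies the standard definitions recall = correct / total and F1 = 2PR / (P + R). The reported figures appear to be truncated to two decimals rather than rounded, so the helper `truncate2` (a hypothetical name, like the others here) truncates. Precision is not recomputed, because the article's exact precision definition is not visible in this section.

```python
import math
from typing import Dict, Tuple

# (testing flows, flows matched to the protocol's own signature), from Table 11.
TABLE_11_COLUMNS: Dict[str, Tuple[int, int]] = {
    "DNS": (59577, 59502),
    "HTTP": (67099, 58031),
    "NBNS": (1271, 1070),
    "NBSS": (373, 320),
    "NTP": (401, 390),
    "SIP": (640, 600),
}


def truncate2(value: float) -> float:
    """Truncate (not round) to two decimals; an assumption about the tables."""
    return math.floor(value * 100) / 100


def recall_percent(total: int, correct: int) -> float:
    """Recall in percent: share of a protocol's test flows matched correctly."""
    return 100.0 * correct / total


def f1_percent(recall: float, precision: float) -> float:
    """Harmonic mean of recall and precision, both given in percent."""
    return 2 * recall * precision / (recall + precision)


if __name__ == "__main__":
    for name, (total, correct) in TABLE_11_COLUMNS.items():
        print(name, truncate2(recall_percent(total, correct)))
```

For HTTP, for instance, 58,031 of 67,099 flows gives the 86.48% recall listed in Table 10, and combining that with the listed precision of 86.74% gives an F1-score close to the listed 86.60%.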

5.3 Sensitivity Analysis

We performed a sensitivity analysis of OptiClass to identify how the method responds to changes in the parameters used for signature generation and classification. We conducted two experiments: the first varies the signature length, and the second varies the threshold value. Results for both experiments are reported using the Recall, Precision, and F1-Score evaluation metrics.

5.3.1 Signature Length.

Choosing a signature length that yields good classification results with minimum classification time is highly significant. Moreover, a suitably short signature keeps the intrusion on user privacy minimal. Therefore, we conducted experiments on the grand dataset by training and testing with 16, 32, and 48 bits. Table 12 shows the results of these experiments. OptiClass achieves a lower average recall, precision, and F1-score with 16 bits because the number of bits used for classification is fewer than required, which suggests that adding a few more bits to the signature may improve classification. Accordingly, recall improves when testing with 32-bit signatures, and the average recall, precision, and F1-score are good at this length. However, OptiClass with 48-bit signatures did not show any significant recall improvement, because the additional bit positions contain more “*” values than fixed values in the signatures.

Table 12.

Protocols	16 Bits			32 Bits			48 Bits
	Recall	Precision	F1-Score	Recall	Precision	F1-Score	Recall	Precision	F1-Score
BACnet	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00
BJNP	97.36	100.00	98.66	100.00	100.00	100.00	100.00	100.00	100.00
BitTorrent	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00
Bootp	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00
CUPS	88.88	100.00	94.11	100.00	100.00	100.00	100.00	49.45	66.17
DNS	100.00	100.00	100.00	99.87	99.93	99.89	99.87	99.70	99.78
Dropbox	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00
GsmIp	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00
HTTP	86.48	100.00	92.74	86.48	86.74	86.60	86.48	100.00	92.74
Kerberos	93.15	100.00	96.45	100.00	100.00	100.00	90.62	82.85	86.56
MWBP	85.71	100.00	92.30	100.00	100.00	100.00	100.00	100.00	100.00
NBNS	100.00	12.12	21.61	84.18	84.31	84.24	84.18	100.00	91.41
NBSS	99.73	100.00	99.86	85.79	85.79	85.79	78.55	100.00	87.98
NTP	91.27	100.00	95.43	97.25	97.25	97.25	97.25	100.00	98.60
POP	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00
QUIC	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00
RPC	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00
SIP	91.25	100.00	95.42	93.75	93.75	93.75	91.25	91.67	91.45
SMTP	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00
SSH	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00
Average	96.69	95.60	94.33	97.36	97.38	97.37	96.41	96.18	95.73

Table 12. Sensitivity Analysis by Varying Signature Length

5.3.2 Threshold Value.

In OptiClass, a few broken or incomplete flows may influence signature generation because their bits mismatch at some positions. If the threshold is kept at “1,” a large number of bit positions will take a “*” value due to these broken flows. This results in a weak application signature and reduces the classification accuracy of OptiClass. A threshold value below 1 is therefore required to address this problem, and we use a threshold of 0.9 for OptiClass. To choose this value, we conducted an experimental analysis on the grand dataset with threshold values of 0.7, 0.75, 0.8, 0.85, 0.9, and 0.95 and observed the recall, precision, and F1-score for each application protocol. The experiments show the highest average precision and F1-score at a threshold of 0.9, as shown in Table 13.
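The effect of the threshold can be illustrated with a small sketch. The signature generation procedure itself is defined in an earlier section, so the rule used here is an assumed reading of it: a bit position keeps a fixed value when at least a `threshold` fraction of the training flows agree on that value, and becomes “*” otherwise. The function name `generate_signature` and the toy 4-bit flows are hypothetical.

```python
from typing import List


def generate_signature(flows: List[str], threshold: float = 0.9) -> str:
    """Build a bit-level signature from equal-length bit strings.

    Assumption: a position is fixed to '0' or '1' when at least `threshold`
    of the flows share that bit, and is a wildcard '*' otherwise. The
    threshold should exceed 0.5 so that at most one value can qualify.
    """
    if not flows:
        raise ValueError("at least one training flow is required")
    total = len(flows)
    signature = []
    for i in range(len(flows[0])):
        ones = sum(1 for flow in flows if flow[i] == "1")
        zeros = total - ones
        if ones / total >= threshold:
            signature.append("1")
        elif zeros / total >= threshold:
            signature.append("0")
        else:
            signature.append("*")
    return "".join(signature)


if __name__ == "__main__":
    # Nine well-formed flows and one broken flow that differs in two bits.
    flows = ["1010"] * 9 + ["0110"]
    print(generate_signature(flows, threshold=1.0))
    print(generate_signature(flows, threshold=0.9))
```

With a threshold of 1, the single broken flow turns the first two positions into wildcards, whereas a threshold of 0.9 tolerates it and keeps all four bits fixed.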

Table 13.

Protocol	0.7 Threshold			0.75 Threshold			0.8 Threshold			0.85 Threshold			0.9 Threshold			0.95 Threshold
	Recall	Precision	F1-Score	Recall	Precision	F1-Score	Recall	Precision	F1-Score	Recall	Precision	F1-Score	Recall	Precision	F1-Score	Recall	Precision	F1-Score
BACnet	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00
BJNP	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00
BitTorrent	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00
Bootp	79.06	100.00	88.30	79.06	100.00	88.30	79.06	100.00	88.30	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00
CUPS	66.66	88.23	75.94	77.77	76.08	76.91	88.88	54.05	67.22	100.00	33.58	50.27	100.00	100.00	100.00	100.00	32.37	48.90
DNS	99.94	99.65	99.79	99.94	99.66	99.79	99.93	99.66	99.79	99.93	99.66	99.79	99.87	99.93	99.89	97.98	100.00	98.97
Dropbox	100.00	100.00	100.00	100.00	100.00	100.00	99.22	100.00	99.60	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00
GsmIp	100.00	100.00	100.00	100.00	100.00	100.00	100.00	50.00	66.66	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00
HTTP	86.48	100.00	92.74	86.48	100.00	92.74	86.48	100.00	92.74	86.48	100.00	92.74	86.48	86.74	86.60	95.32	100.00	97.60
Kerberos	23.36	65.96	34.50	75.29	84.61	79.67	82.14	85.18	83.63	82.14	85.58	83.82	100.00	100.00	100.00	100.00	95.86	97.88
MWBP	85.71	100.00	92.30	71.42	100.00	83.32	100.00	77.77	87.49	100.00	77.77	87.49	100.00	100.00	100.00	100.00	70.00	82.35
NBNS	84.18	97.71	90.44	84.18	98.61	90.82	84.18	99.07	91.02	84.18	99.44	91.17	84.18	84.31	84.24	99.92	51.25	67.75
NBSS	77.74	98.97	87.07	77.74	99.31	87.21	77.74	99.31	87.21	78.28	99.32	87.55	85.79	85.79	85.79	98.92	99.73	99.32
NTP	56.60	100.00	72.28	71.57	100.00	83.42	87.92	99.70	93.44	95.51	99.74	97.57	97.25	97.25	97.25	99.25	12.72	22.54
POP	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00
QUIC	100.00	93.00	96.37	100.00	97.00	98.47	100.00	98.00	98.98	100.00	99.00	99.49	100.00	100.00	100.00	100.00	100.00	100.00
RPC	71.42	8.77	15.62	85.71	100.00	92.30	85.71	100.00	92.30	85.71	100.00	92.30	100.00	100.00	100.00	100.00	100.00	100.00
SIP	80.00	100.00	88.88	80.00	90.78	85.04	80.00	90.78	85.04	80.00	90.78	85.04	93.75	93.75	93.75	93.75	89.82	91.74
SMTP	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00
SSH	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00
Average	85.55	92.61	86.71	89.45	97.30	92.90	92.56	92.67	91.67	94.61	94.24	93.36	97.36	97.38	97.37	99.25	87.58	90.35

Table 13. Sensitivity Analysis by Varying Threshold Value

5.4 Performance Comparison

We compared the performance of OptiClass with five closely related state-of-the-art methods: BitMiner [41], BitFlow [40], BitPack [40], BitCoding [18], and BitProb [19]. All five are supervised bit-level network traffic classification methods. We tested all methods on the grand dataset, using recall, precision, and F1-score as evaluation parameters. This comparison is shown in Table 14. We can see that OptiClass performed equally or better for 15 out of 20 application protocols. We then compared the classification time of OptiClass with that of the five other methods to identify the fastest among them. Classification time is measured in the same experiment used for the performance comparison. Table 15 shows the classification time comparison. We deduced from Table 15 that OptiClass is, on average, 3.13, 2.94, 2.06, 15.20, and 22.24 times faster than BitMiner, BitFlow, BitPack, BitCoding, and BitProb, respectively. The reason for the faster classification of OptiClass is BiTSPLITTER, which performs simultaneous signature matching.

Table 14.

Protocols	BitMiner			BitFlow			BitPack			BitCoding			BitProb			OptiClass
	Recall	Precision	F1-Score	Recall	Precision	F1-Score	Recall	Precision	F1-Score	Recall	Precision	F1-Score	Recall	Precision	F1-Score	Recall	Precision	F1-Score
BACnet	100.00	100.00	100.00	50.00	100.00	66.66	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00
BJNP	100.00	100.00	100.00	100.00	100.00	100.00	0.00	0.00	0.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00
BitTorrent	100.00	100.00	100.00	88.77	100.00	94.05	75.00	100.00	85.71	100.00	100.00	100.00	99.36	100.00	99.67	100.00	100.00	100.00
Bootp	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	97.13	100.00	98.54	100.00	100.00	100.00
CUPS	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	93.61	73.07	82.07	82.60	73.07	77.54	100.00	100.00	100.00
DNS	43.90	100.00	61.01	99.68	100.00	99.83	100.00	100.00	100.00	99.75	99.99	99.86	97.93	98.06	97.99	99.87	99.93	99.89
Dropbox	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	99.87	99.87	99.87	100.00	100.00	100.00
GsmIp	100.00	100.00	100.00	16.66	100.00	28.56	66.66	100.00	79.99	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00
HTTP	99.48	100.00	99.74	36.22	100.00	53.17	53.33	100.00	69.56	100.00	100.00	100.00	86.47	99.88	92.69	86.48	86.74	86.60
Kerberos	83.03	100.00	90.73	22.86	100.00	37.21	25.00	100.00	40.00	100.00	98.60	99.29	100.00	98.60	99.29	100.00	100.00	100.00
MWBP	0.00	0.00	0.00	93.16	100.00	96.45	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00
NBNS	4.72	100.00	9.01	94.47	100.00	97.15	66.66	100.00	79.99	99.05	52.42	68.55	99.05	99.52	99.28	84.18	84.31	84.24
NBSS	98.65	100.00	99.32	33.75	100.00	50.46	66.66	100.00	79.99	100.00	100.00	100.00	100.00	100.00	100.00	85.79	85.79	85.79
NTP	99.50	100.00	99.75	94.11	100.00	96.96	92.85	100.00	96.29	100.00	100.00	100.00	100.00	100.00	100.00	97.25	97.25	97.25
POP	100.00	100.00	100.00	66.66	100.00	79.99	66.66	100.00	79.99	100.00	100.00	100.00	99.11	100.00	99.55	100.00	100.00	100.00
QUIC	21.50	100.00	35.39	22.22	100.00	36.36	85.00	100.00	91.89	100.00	100.00	100.00	99.54	100.00	99.76	100.00	100.00	100.00
RPC	100.00	100.00	100.00	75.00	100.00	85.71	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00
SIP	100.00	100.00	100.00	100.00	100.00	100.00	50.00	100.00	66.66	91.03	100.00	95.30	91.03	100.00	95.30	93.75	93.75	93.75
SMTP	100.00	100.00	100.00	60.52	100.00	75.40	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00
SSH	100.00	100.00	100.00	12.34	100.00	21.96	66.66	100.00	79.99	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00
Average	82.53	95.00	84.74	68.32	100.00	75.99	77.03	95.00	83.31	99.17	96.20	97.25	97.60	98.45	97.97	97.36	97.38	97.37

Table 14. Performance Comparison on Grand Dataset

Table 15.

Protocols	BitMiner	BitFlow	BitPack	BitCoding	BitProb	OptiClass
BACnet	1.86	1.00	1.00	1.17	0.40	0.02
BJNP	1.84	1.00	1.00	1.95	0.38	0.07
BitTorrent	1.79	6.14	2.35	4.34	4.16	0.24
Bootp	1.84	1.00	1.00	0.58	1.25	0.10
CUPS	1.80	1.00	1.00	1.66	0.57	0.04
DNS	1.98	4.46	4.10	43.66	24.00	4.88
Dropbox	1.86	7.22	4.50	8.94	10.00	0.20
GsmIp	1.77	1.00	1.00	0.32	0.41	0.02
HTTP	2.07	6.56	3.50	98.57	184.00	4.69
Kerberos	1.84	0.03	0.02	1.01	5.93	0.18
MWBP	1.75	0.04	0.02	1.35	0.45	0.02
NBNS	1.87	0.70	0.40	1.35	2.06	0.23
NBSS	1.78	0.04	0.03	0.67	1.57	0.16
NTP	1.70	0.01	0.01	0.58	1.31	0.14
POP	1.82	1.00	1.00	0.33	0.59	0.05
QUIC	1.76	0.01	0.01	1.77	0.64	0.05
RPC	1.74	1.00	1.00	0.26	0.45	0.01
SIP	1.84	1.00	1.00	5.04	10.20	0.18
SMTP	1.79	0.59	0.54	1.62	1.95	0.24
SSH	1.85	0.68	0.57	1.41	7.75	0.18
Average	1.82	1.71	1.20	8.82	12.90	0.58

Table 15. Classification Time Comparison (in seconds) on Grand Dataset
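The speed-up factors quoted in the text follow from the “Average” row of Table 15 by dividing each method's average classification time by that of OptiClass. The short sketch below reproduces this arithmetic; the dictionary and function names are hypothetical, and the comparison with the quoted factors assumes they were truncated to two decimals.

```python
from typing import Dict

# Average classification time in seconds, copied from the last row of Table 15.
AVERAGE_SECONDS: Dict[str, float] = {
    "BitMiner": 1.82,
    "BitFlow": 1.71,
    "BitPack": 1.20,
    "BitCoding": 8.82,
    "BitProb": 12.90,
    "OptiClass": 0.58,
}


def speedups(times: Dict[str, float], baseline: str = "OptiClass") -> Dict[str, float]:
    """Return how many times slower each method is than the baseline."""
    base = times[baseline]
    return {name: t / base for name, t in times.items() if name != baseline}


if __name__ == "__main__":
    for name, factor in speedups(AVERAGE_SECONDS).items():
        print(f"{name}: {factor:.4f}x")
```

For example, 1.82 s / 0.58 s is roughly 3.138, which corresponds to the factor of 3.13 quoted for BitMiner.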

6 Conclusion and Future Work

In this article, we presented OptiClass, a framework based on bit-level signatures that classifies application layer protocols accurately. OptiClass uses the first “n” bits of data extracted from the payload of bidirectional flows. The flows of each application are used to generate its application signature from the invariant bits at fixed positions. Subsequently, the BiTSPLITTER is created, containing the signatures of all the application protocols, and is used for accurate, fast, and efficient flow classification. With extensive experimentation, we showed that OptiClass is a robust method for network traffic classification that uses only 32-bit application signatures and achieves an average recall, precision, and F1-score of 97.36%, 97.38%, and 97.37%, respectively. Moreover, OptiClass performed classification approximately 3.13, 2.94, 2.06, 15.20, and 22.24 times faster than the five state-of-the-art methods BitMiner, BitFlow, BitPack, BitCoding, and BitProb, respectively, because of its novel data structure, BiTSPLITTER. OptiClass has two limitations. First, it is a supervised method that requires labeled application flows to generate their signatures. Second, it does not address encrypted network traffic. In the future, we aim to develop an unsupervised approach that builds on OptiClass for encrypted network traffic classification. A hardware implementation of OptiClass is another interesting future direction, and deploying the system within an actual network will be a focal point.

7 Acknowledgments

We sincerely acknowledge that this work is supported by Science and Engineering Research Board (SERB)

References

[1] [n.d.]. https://en.wikipedia.org/wiki/EDonkey2000. Accessed: 2022-09-20.
