Reconfigurable regular expression matching architecture for real-time pattern update and payload inspection

https://doi.org/10.1016/j.jnca.2022.103507

Abstract

Regular expression (regex) matching is an integral part of deep packet inspection (DPI), but its low performance limits its practicality. To accelerate regex matching (REM), FPGA-based solutions have emerged that maximize parallelism by processing multiple regex patterns concurrently. However, although they significantly improve performance, they share a critical problem: they do not support dynamic regex pattern updates at run time, a key capability given that signatures change frequently to cover newly identified vulnerabilities. Hence, we present Reinhardt, a new reconfigurable hardware architecture for REM. Reinhardt introduces new FPGA blocks, called reconfigurable cells, that form regex patterns in hardware, enabling regex patterns to be updated and matched at run time while providing high performance. Using a prototype of Reinhardt on NetFPGA-SUME, our evaluation shows that Reinhardt updates hundreds of regex patterns within a second and performs REM at up to 10 Gbps throughput (the maximum hardware bandwidth) with constant latency. Our case studies also show that Reinhardt can operate in multiple modes (e.g., as a standalone NIDS/NIPS or as an REM accelerator for them).

Introduction

As network traffic has grown more sophisticated over time, payload analysis has become an essential operation in network protection. Deep packet inspection (DPI), which analyzes packet payloads, therefore plays a central role in network intrusion detection and prevention systems (NIDS/IPS) (Snort, 2021, Suricata, 2021, Zeek (Bro), 2021). However, while DPI in modern networks must offer both high performance and dynamic updatability to cope with large traffic volumes and rapidly changing networks (Xu et al., 2016, Van Lunteren, 2006, Atasu et al., 2013, AbuHmed et al., 2008, Tupakula et al., 2011, Fernandes et al., 2019), its key functionality, regular expression matching (REM), is considered the major bottleneck. Prior researchers have thus attempted to accelerate REM in hardware, mainly on Field-Programmable Gate Arrays (FPGAs), by matching multiple regex patterns in parallel (Sidhu and Prasanna, 2001, Hieu et al., 2013, Hutchings et al., 2002, Sourdis et al., 2008, Lin et al., 2007, Yang et al., 2008, Mitra et al., 2007).

However, although FPGA-based REM significantly accelerates overall performance, its lack of dynamic updatability raises three critical problems. First, updating regex patterns in an FPGA takes a significant amount of time (e.g., at least a few hours) because the patterns must be compiled into hardware circuits (synthesis, mapping, placement, and routing). Second, applying newly compiled patterns requires reinitializing the FPGA, which interrupts service and leaves the network unprotected for a while. Third, updates must be performed in an all-or-nothing fashion, so even a tiny pattern change requires the entire compilation process and a service interruption. For these reasons, hardware-based REM solutions have seen limited adoption in NIDS/IPS so far. Moreover, although these problems in FPGA-based REM have been pointed out for years, they remain significant and unsolved limitations (Kumar, 2007, Vasiliadis et al., 2008, Chen et al., 2010, Xu et al., 2016).

To achieve dynamic updatability in FPGA-based REM, we propose Reinhardt, a new reconfigurable hardware architecture that updates and matches regex patterns at run time. The Reinhardt design sits at the opposite end of the spectrum from prior FPGA-based REM (moving from the circuit level to the logic level) to better support dynamic updatability. To this end, Reinhardt introduces new FPGA blocks, called reconfigurable cells, that can change their connections to neighboring cells at run time. In this work, reconfigurable cells represent regex patterns in hardware: our conversion algorithm maps given regex patterns to combinations of cells that implement Finite-State Machines (FSMs), and these combinations are then deployed into hardware. As a result, Reinhardt applies dynamic pattern updates instantly and without service interruption while exploiting the natural parallelism of hardware. Moreover, Reinhardt keeps all cell connection information in FPGA memory; thus, it can process more regex patterns than fit in its physical space by fetching them dynamically (swapping pattern sets during processing), which allows a packet to be inspected against multiple pattern sets in succession (called resubmitting). Lastly, Reinhardt provides application programming interfaces (APIs) for better applicability, allowing existing NIDS/IPS solutions to adopt Reinhardt for REM acceleration as well.
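The resubmitting idea described above can be illustrated in software. The following is a minimal sketch rather than Reinhardt's implementation: `CORE_CAPACITY`, `batches`, and `match_with_resubmit` are hypothetical names, and Python's `re` module stands in for the hardware cells. It splits a pattern set that is larger than the core into batches and passes the same payload through each batch in turn.

```python
import re
from typing import Iterable, List

# Hypothetical capacity: how many patterns fit in the core at once.
CORE_CAPACITY = 4


def batches(patterns: List[str], size: int) -> Iterable[List[str]]:
    """Yield consecutive slices of at most `size` patterns."""
    for start in range(0, len(patterns), size):
        yield patterns[start:start + size]


def match_with_resubmit(payload: bytes, patterns: List[str],
                        capacity: int = CORE_CAPACITY) -> List[str]:
    """Return every pattern that matches `payload`, one batch at a time."""
    matched: List[str] = []
    for batch in batches(patterns, capacity):
        # "Load" the batch. In hardware this step would correspond to
        # fetching stored cell connections from memory (an assumption here).
        loaded = [(p, re.compile(p.encode())) for p in batch]
        # Resubmit the same payload against the newly loaded batch.
        for source, compiled in loaded:
            if compiled.search(payload):
                matched.append(source)
    return matched
```

With six patterns and a capacity of four, the payload is inspected twice: once against the first four patterns and once against the remaining two.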

We implement a Reinhardt prototype using NetFPGA-SUME (NetFPGA, 2014, Zilberman et al., 2014). Our evaluation shows that Reinhardt can update 1300 patterns in 0.96 s with zero downtime, orders of magnitude faster than prior solutions (1–5 h). Reinhardt sustains 1.4 Gbps throughput with 800 regex patterns and 10 Gbps with 160 regex patterns. It also operates with constant latency (2 μs) no matter how many patterns are installed, which means that Reinhardt enables deterministic processing. Compared with DPDK-Hyperscan (Intel, 2021, Wang et al., 2019), Reinhardt is competitive in delivering stable performance. Our case studies demonstrate the unique strengths of Reinhardt in REM. In particular, Reinhardt NIDS/IPS covers 87% of the signatures in the Snort 2.9.7 default rules (6411 signatures), and hardware acceleration using Reinhardt improves overall throughput by up to 65 times over the vanilla Snort IDS.
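The update figures quoted above can be put in perspective with a little arithmetic. The sketch below uses only the numbers stated in the text (1300 patterns, 0.96 s, and 1–5 h for prior solutions); the function names are hypothetical, and the result is a rough ratio, not a measured benchmark.

```python
PATTERNS_UPDATED = 1300        # patterns updated in one run (from the text)
UPDATE_SECONDS = 0.96          # time for that update (from the text)
PRIOR_UPDATE_HOURS = (1, 5)    # range reported for prior solutions


def per_pattern_ms(patterns: int, seconds: float) -> float:
    """Average update time per pattern, in milliseconds."""
    return seconds / patterns * 1000


def speedup(prior_hours: float, seconds: float) -> float:
    """Ratio between a prior update time (in hours) and the new one."""
    return prior_hours * 3600 / seconds


if __name__ == "__main__":
    print(f"{per_pattern_ms(PATTERNS_UPDATED, UPDATE_SECONDS):.2f} ms per pattern")
    for hours in PRIOR_UPDATE_HOURS:
        print(f"{hours} h -> about {speedup(hours, UPDATE_SECONDS):.0f}x faster")
```

Under these figures, the average update cost is roughly 0.74 ms per pattern, and the gap to a 1–5 h recompilation is about 3750 to 18750 times.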

In sum, this paper makes the following contributions.

  • Reconfigurable REM architecture: We introduce Reinhardt, a novel reconfigurable REM architecture that directly implements given regex patterns as state machine logic by combining reconfigurable cells, without service interruption.

  • Prototype and evaluation: We implement a prototype of Reinhardt on FPGA hardware, and our evaluation shows that Reinhardt can update regex patterns within a second and reach 10 Gbps throughput with negligible latency overhead.

  • Practical deployment: We present three practical deployments of Reinhardt (as an NIDS/NIPS, as REM acceleration in Snort IDS, and integrated with SDN), showing that it can replace today’s NIDS/NIPS and accelerate Snort IDS by up to 65 times.

The rest of the paper is organized as follows. Section 2 provides the background and motivation for FPGA-based REM. Sections 3 and 4 describe the design and implementation of the Reinhardt architecture. Sections 5 and 6 summarize the results of our performance evaluation and practical use cases. Section 7 relates Reinhardt to prior work, and Section 8 concludes.

Background and motivation

In this section, we provide the background of regular expression matching (REM) and explain the performance degradation issue in DPI due to REM. Then, we discuss the challenges of previous efforts for accelerating the REM performance by using FPGA.

Design

In this section, we present Reinhardt, a reconfigurable FPGA architecture for REM, and explain how Reinhardt dynamically updates given regex patterns and matches them with the incoming traffic on the FPGA architecture.
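To make the idea of cells that together form a state machine more concrete, here is a minimal software sketch. It is an illustration under stated assumptions, not the Reinhardt conversion algorithm: the `Cell` class and the `configure` and `search` functions are hypothetical names, and the supported syntax is limited to literal characters, `.`, and postfix `+`. Each cell compares one input symbol and, when it fires, arms the cells it is linked to, so changing a pattern only rewrites cell contents and links.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Cell:
    """One hypothetical cell: a symbol to compare and links to next cells."""
    symbol: Optional[str] = None              # None acts as a wildcard ('.')
    next_cells: Set[int] = field(default_factory=set)
    accepting: bool = False


def configure(pattern: str) -> List[Cell]:
    """Convert a tiny regex subset (literals, '.', postfix '+') into cells."""
    cells: List[Cell] = []
    for ch in pattern:
        if ch == "+":
            if not cells:
                raise ValueError("'+' needs a preceding symbol")
            cells[-1].next_cells.add(len(cells) - 1)   # self-loop for repetition
            continue
        cells.append(Cell(symbol=None if ch == "." else ch))
    if not cells:
        raise ValueError("empty pattern")
    for i in range(len(cells) - 1):
        cells[i].next_cells.add(i + 1)                 # link to the neighbor cell
    cells[-1].accepting = True
    return cells


def search(cells: List[Cell], payload: str) -> bool:
    """Return True if the configured pattern occurs anywhere in `payload`."""
    armed: Set[int] = set()
    for ch in payload:
        candidates = armed | {0}      # cell 0 is always armed (unanchored search)
        fired = {i for i in candidates if cells[i].symbol in (None, ch)}
        if any(cells[i].accepting for i in fired):
            return True
        armed = set().union(*(cells[i].next_cells for i in fired))
    return False
```

For example, `search(configure("ab+c"), "xxabbbc")` evaluates to `True`, and replacing the pattern amounts to calling `configure` again with a new string, with no separate compilation step. All candidate cells are checked against the same input symbol in each step, which is the part a hardware design can evaluate in parallel.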

Implementation

To validate the efficiency and feasibility of the Reinhardt design, we implemented a prototype using NetFPGA-SUME with a Xilinx Virtex-7 XC7V690T and four SFP+ 10 Gbps interfaces (NetFPGA, 2014, Zilberman et al., 2014), which processes packets in 256-bit chunks at 160 MHz. We also implemented a device driver based on the NetFPGA-SUME reference driver (NetFPGA-SUME, 2020) to create communication channels between the Reinhardt framework and the Reinhardt datapath. Reinhardt APIs internally
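The datapath figures quoted above (256-bit chunks at 160 MHz, four 10 Gbps interfaces) imply a raw internal bandwidth that can be checked with simple arithmetic. The sketch below is a back-of-the-envelope calculation with hypothetical function names; it ignores per-packet overheads and says nothing about the throughput Reinhardt actually achieves.

```python
# Figures taken from the text above.
BUS_WIDTH_BITS = 256            # bits processed per clock cycle
CLOCK_HZ = 160_000_000          # 160 MHz
PORT_COUNT = 4                  # SFP+ interfaces
PORT_RATE_BPS = 10 * 10**9      # 10 Gbps per interface


def raw_datapath_bps(bus_width_bits: int, clock_hz: int) -> int:
    """Bits per second the datapath can move if every cycle carries a full chunk."""
    return bus_width_bits * clock_hz


def aggregate_port_bps(port_count: int, port_rate_bps: int) -> int:
    """Combined line rate of all interfaces."""
    return port_count * port_rate_bps


if __name__ == "__main__":
    datapath = raw_datapath_bps(BUS_WIDTH_BITS, CLOCK_HZ)
    ports = aggregate_port_bps(PORT_COUNT, PORT_RATE_BPS)
    print(f"raw datapath:    {datapath / 1e9:.2f} Gbps")   # 40.96 Gbps
    print(f"aggregate ports: {ports / 1e9:.2f} Gbps")      # 40.00 Gbps
```

The raw figure of 40.96 Gbps is slightly above the 40 Gbps aggregate line rate of the four interfaces.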

Reinhardt core constraints

To maximize the performance of Reinhardt, we need to configure its constraints (the width and height of the core, the length of input cells, and the number of input queues). Here, we demonstrate how to determine these constraints.

For this, we collect 2735 regex patterns from Snort 2.9.7 default (648), Snort 2.9 (645) and 3.0 (524) community, and Suricata 4.1.2 default (918) rulesets, and then find the constraints that can express 90% of regex forms and accommodate as many patterns as possible
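One simple way to pick a size limit that covers a target share of patterns is to sort the observed sizes and read off the value at the required rank. The sketch below assumes this approach for illustration only; the source does not state how the constraints were derived. The function name and the sample sizes are hypothetical, while the ruleset counts are the ones listed above.

```python
from typing import Dict, Sequence

# Pattern counts per ruleset, as listed in the text.
RULESET_COUNTS: Dict[str, int] = {
    "Snort 2.9.7 default": 648,
    "Snort 2.9 community": 645,
    "Snort 3.0 community": 524,
    "Suricata 4.1.2 default": 918,
}


def smallest_limit_covering(sizes: Sequence[int], percent: int) -> int:
    """Return the smallest limit L such that at least `percent`% of sizes are <= L."""
    if not sizes:
        raise ValueError("sizes must not be empty")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    ordered = sorted(sizes)
    needed = (percent * len(ordered) + 99) // 100   # ceiling in integer arithmetic
    return ordered[needed - 1]


if __name__ == "__main__":
    print(sum(RULESET_COUNTS.values()))             # 2735 patterns in total
    # Hypothetical per-pattern sizes (e.g., number of cells each pattern needs).
    sample_sizes = [3, 5, 8, 8, 10, 12, 15, 20, 40, 120]
    print(smallest_limit_covering(sample_sizes, 90))  # 40
```

In the hypothetical sample, a limit of 40 covers nine of the ten patterns, so the single very long pattern does not dictate the size of the core.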

Use cases

In this section, we discuss how Reinhardt can be employed in real-world applications. To this end, we introduce three use cases: (1) Reinhardt as an NIDS/NIPS, (2) replacement of the PCRE engine in Snort IDS, and (3) Reinhardt integration with software-defined networking (SDN). Through these use cases, we provide intuition on how to leverage the unique strengths of Reinhardt.

Related work

FPGA-based REM: Sidhu and Prasanna (2001) proposed a one-hot encoding scheme to express NFAs with circuit blocks, and this work and its subsequent studies (Hutchings et al., 2002, Lin et al., 2007) inspired Reinhardt. Also, Sert and Bazlamacci (2021) proposed an NFA-based REM design that uses 2-stride input transitions for high performance and memory efficiency. Some studies (Hieu et al., 2013, Wang et al., 2013, Nakahara et al., 2012) suggested resource-efficient regex circuits. Other studies (Mitra et al., 2007

Conclusion

FPGA-based REM delivers high performance, but its lack of flexibility is a critical limitation because updating regex patterns in the hardware involves a time-consuming process. To address this, we have presented Reinhardt, an improved hardware architecture that implements regex patterns with reconfigurable cells to support dynamic pattern updates. Our evaluation and case studies demonstrate that Reinhardt updates regex patterns immediately without service interruption and serves as a

CRediT authorship contribution statement

Jaehyun Nam: Conceptualization, Methodology, Validation, Software, Writing – original draft, Writing – review & editing. Seung Ho Na: Conceptualization, Methodology, Software, Writing – original draft, Writing – review & editing. Seungwon Shin: Project administration, Supervision, Funding acquisition, Writing – original draft, Writing – review & editing. Taejune Park: Conceptualization, Formal analysis, Project administration, Supervision, Funding acquisition, Writing – original draft, Writing

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported in part by the National Research Foundation of Korea (No. 2022R1C1C1006967).

Jaehyun Nam is an assistant professor at the Department of Computer Engineering, Dankook University, South Korea. He received his Ph.D. and M.S. degrees from the School of Computing (Information Security) at KAIST and his B.S. degree in Computer Science and Engineering from Sogang University in Korea. His research interests focus on networked systems and security. He is especially interested in security issues in cloud and edge computing systems (including SDN, NFV, IoT, and containers).

References (94)

  • Bispo, J. et al. Regular expression matching for reconfigurable packet inspection
  • Bispo, J. et al. Synthesis of regular expressions targeting FPGAs: Current status and open issues
  • Bremler-Barr, A. et al. Deep packet inspection as a service
  • Brodie, B.C. et al. A scalable architecture for high-throughput regular-expression pattern matching
  • Chamola, V. et al. FPGA for 5G: Re-configurable hardware for next generation communication. IEEE Wirel. Commun. (2020)
  • Che, S. et al. Accelerating compute-intensive applications with GPUs and FPGAs
  • Check Point Software Technologies Ltd, S. How quick are turn-around times for IPS signature updates addressing newly found vulnerabilities (2019)
  • Chen, H. et al. A survey on the application of FPGAs for network infrastructure security. IEEE Commun. Surv. Tutor. (2010)
  • Choi, B. et al. DFC: Accelerating string pattern matching for network applications
  • Cope, B. et al. Performance comparison of graphics processors to reconfigurable logic: A case study. IEEE Trans. Comput. (2010)
  • CORSA, B. Is your network security keeping up? (2019)
  • Detail, C. The ultimate security vulnerability datasource (2021)
  • Emerging Threats, C. Emerging threats rulesets (2021)
  • Fahmy, S.A. et al. Virtualized FPGA accelerators for efficient cloud computing
  • Fernandes, G. et al. A comprehensive survey on network anomaly detection. Telecommun. Syst. (2019)
  • FireEye, G. FireEye dynamic threat intelligence cloud (2021)
  • Firestone, D., Putnam, A., Mundkur, S., Chiou, D., Dabagh, A., Andrewartha, M., Angepat, H., Bhanu, V., Caulfield, A.,...
  • Fowers, J. et al. A performance and energy comparison of FPGAs, GPUs, and multicores for sliding-window applications
  • Ganegedara, T. et al. Automation framework for large-scale regular expression matching on FPGA
  • Gupta, P. Accelerating datacenter workloads
  • Hamming, R.W. Error detecting and error correcting codes. Bell Syst. Tech. J. (1950)
  • Hazel, P. PCRE: Perl compatible regular expressions (2005)
  • HPC, I., 0000. Inside HPC Special Report: Are FPGAs the Answer to the “Compute Gap”?,...
  • Hutchings, B.L. et al. Assisting network intrusion detection with reconfigurable hardware
  • Hypolite, J., Sonchack, J., Hershkop, S., Dautenhahn, N., DeHon, A., Smith, J.M., 2020. DeepMatch: practical deep...
  • InfoSecurity, M. Advanced malware detection - signatures vs. Behavior analysis (2019)
  • Intel, ., 0000. Cloud computing,...
  • Intel, ., 0000. 5G Wireless,...
  • Intel, M. Hyperscan (2019)
  • Intel, M. DPDK: Data plane development kit (2021)
  • Intel 01.org, M. Hyperscan sample data (2019)
  • Jamshed, M.A. et al. Kargus: a highly-scalable software-based intrusion detection system
  • Jepsen, T., Alvarez, D., Foster, N., Kim, C., Lee, J., Moshref, M., Soulé, R., 2019. Fast string searching on PISA. In:...
  • Johnson, A. et al. Pattern matching in reconfigurable logic for packet classification
  • Kreibich, C., Handley, M., Paxson, V., 2001. Network intrusion detection: Evasion, traffic normalization, and end-to-end...
  • Kumar, S. Survey of Current Network Intrusion Detection Techniques (2007)
  • Lavin, C. et al. Impact of hard macro size on FPGA clock rate and place/route time

Seung Ho Na is a Ph.D. candidate in the Network and System Security Lab, advised by Seungwon Shin, in the School of Electrical Engineering at KAIST. He received his M.S. and B.S. degrees in Electrical Engineering from KAIST. His research interests span the areas of data-driven security, artificial intelligence (AI) security, and cyber threat intelligence (CTI). Currently, he focuses on preserving and improving the privacy of machine learning systems.

Seungwon Shin is an associate professor in the School of Electrical Engineering at KAIST. He received his Ph.D. degree in Computer Engineering from the Electrical and Computer Engineering Department, Texas A&M University, and his M.S. and B.S. degrees from KAIST, both in Electrical and Computer Engineering. His research interests span the areas of SDN security, IoT security, botnet analysis/detection, DarkWeb analysis, and cyber threat intelligence (CTI).

Taejune Park is an assistant professor at the Department of Artificial Intelligence Convergence, Chonnam National University, South Korea. He received his B.S. in Computer Engineering from Korea Maritime and Ocean University, South Korea, and his M.S. and Ph.D. in information security from KAIST, South Korea. His research interests focus on network and IoT security and reliable low-latency communications.
