FPGA based string matching for network processing applications
Introduction
Many network processing applications rely on frequent string matching to detect the presence of a keyword. Several software algorithms for string matching exist, but they are too slow to keep pace with gigabit network links and therefore become a bottleneck. A hardware implementation can potentially speed up the signature matching process considerably. In this paper, we introduce variations on an array-based architecture for string matching, with applications as a string lookup cache and in network intrusion detection.
The architecture is flexible enough to handle variable-sized keys and also provides a mechanism for mapping in addition to string matching. It is loosely based on temporally cascaded content addressable memories (CAMs) and extends Motomura’s cellular automata structure. As in Motomura’s design, processing element cells external to the CAM array process character match signals from the CAM and output keyword match signals. The architecture is also compatible with further performance optimizations such as processing characters in parallel, prefix sharing, and pattern partitioning. The compactness of each cellular automaton element leads to a design that is highly efficient in both area and speed.
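The division of labor between the CAM and the external processing elements can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware design described in the paper: the names `cam_match_lines`, `KeywordChain`, and `scan` are hypothetical, one character is processed per clock step, and each keyword gets its own chain of processing elements with one state bit per character.

```python
from typing import Dict, List, Tuple


def cam_match_lines(stored_chars: List[str], ch: str) -> Dict[str, bool]:
    """Model the CAM: compare the input character with every stored character."""
    return {c: (c == ch) for c in stored_chars}


class KeywordChain:
    """Hypothetical chain of processing elements (PEs), one PE per keyword character."""

    def __init__(self, keyword: str) -> None:
        if not keyword:
            raise ValueError("keyword must be non-empty")
        self.keyword = keyword
        self.state = [False] * len(keyword)  # one state bit per PE

    def clock(self, match: Dict[str, bool]) -> bool:
        """Consume one character's match lines; return True if the keyword just matched."""
        prev = self.state
        nxt = [False] * len(prev)
        for i, c in enumerate(self.keyword):
            # The first PE is always enabled; later PEs need the previous PE
            # to have been active on the previous character.
            enabled = True if i == 0 else prev[i - 1]
            nxt[i] = enabled and match[c]
        self.state = nxt
        return nxt[-1]


def scan(text: str, keywords: List[str]) -> List[Tuple[int, str]]:
    """Return (end_position, keyword) for every keyword occurrence in text."""
    stored = sorted(set("".join(keywords)))  # distinct characters held in the CAM
    chains = [KeywordChain(k) for k in keywords]
    hits: List[Tuple[int, str]] = []
    for pos, ch in enumerate(text):
        match = cam_match_lines(stored, ch)  # one CAM lookup per input character
        for chain in chains:
            if chain.clock(match):
                hits.append((pos, chain.keyword))
    return hits
```

In this model, `scan("GET /etc/passwd HTTP", ["passwd", "etc"])` returns `[(7, 'etc'), (14, 'passwd')]`, where each position is the index of the last character of the match. Keywords of different lengths simply use chains of different lengths, which mirrors the variable-sized key support mentioned above.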
The rest of the paper is organized as follows. The next section presents background on string matching in software and hardware. The sections after that explain the architecture and operation of a generic lookup cache and of a signature match processor for network intrusion detection.
Background and related work
String matching is the process of determining whether a given string or pattern is present in a body of data. It is a very common problem in computing: we often search a computer for a specific file or a text document for a specific keyword. There are different types of string matching. Exact string matching, as the name suggests, looks for the presence of the exact pattern in the search area. String matching with errors or approximate
String lookup cache
Many networking applications need to cache frequently used values. This is particularly true for operations that use tables mapping one known value to another. Examples include IP to Ethernet address mappings and routing table lookups. It is common to use an address resolution protocol (ARP) cache that serves as a translation table to map a layer 3 IP address to the corresponding layer 2 hardware address. If an entry is found in the ARP cache, the network processor
Network intrusion detection
With the rapid increase in malicious attacks on corporate and government networks, network administrators have become increasingly aware of the need to deploy tools that protect their networks from external attacks. Network intrusion detection systems (NIDS) are one of the primary tools available to help create a secure network infrastructure. Network intrusion detection is the process of identifying and analyzing packets that may signify an impending threat to an organization’s network. NIDS can be
Conclusions
In this paper, we have introduced an array-based architecture for string matching and demonstrated its use in two applications: network intrusion detection and a string lookup cache. The principle behind the architecture is that processing elements external to the CAM array process character matches from the CAM to yield keyword matches. The architecture is scalable, and optimizations such as processing characters in parallel improve the throughput considerably. Aligning the keyword on a
Future work
We are investigating the use of signature match processors (SMPs) in these applications and developing extensions of the SMPs to support wildcards and approximate word matching. Other research directions include improving the power characteristics of the SMPs and analyzing the operating system requirements for supporting these FPGA implementations of network processing functions efficiently and seamlessly.
Acknowledgements
This work was supported in part by a grant from the University of Connecticut Research Foundation. Special thanks to Long Bu for his contributions to the initial architecture design.