DOI: 10.1145/1882486.1882496
research-article

LaFA: lookahead finite automata for scalable regular expression detection

Published: 19 October 2009

Abstract

Although Regular Expressions (RegExes) are widely used in network security applications, their inherent complexity often limits the total number of RegExes that can be detected on a single chip at a reasonable throughput. This limit impairs the scalability of today's RegEx detection systems. The scalability of existing schemes is generally limited by the traditional paradigm of per-character state processing and state transition detection. Existing schemes focus mainly on optimizing the number of states and the required transitions, not on the suboptimal character-based detection method itself. Furthermore, the potential benefits of out-of-sequence detection methods, namely a reduced number of operations and states, have not been explored. In this paper, we propose Lookahead Finite Automata (LaFA) to perform scalable RegEx detection using a very small amount of memory. LaFA's memory requirement is very small because of three areas of effort described in this paper: (1) Different parts of a RegEx, namely RegEx components, are detected using different detectors, each of which is specialized and optimized for the detection of a certain RegEx component. (2) We systematically reorder the RegEx component detection sequence, which provides new possibilities for memory optimization. (3) Many redundant states in classical finite automata are identified and eliminated in LaFA. Our simulations show that LaFA requires an order of magnitude less memory than today's state-of-the-art RegEx detection systems. A single commodity Field Programmable Gate Array (FPGA) chip can accommodate up to twenty-five thousand (25k) RegExes. Based on the throughput of our LaFA prototype on FPGA, we estimated that a 34-Gbps throughput can be achieved.
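
The first two ideas in the abstract, component-specific detectors and a reordered detection sequence, can be illustrated with a small software sketch. The code below is not the LaFA design itself, which the abstract does not specify. It assumes a hypothetical RegEx shape `prefix[class]{n}suffix`, and the names `Pattern`, `find_all`, and `detect` are invented for illustration. The two fixed strings are located with a plain substring search, and the character-class component between them is verified only after both strings have been seen at compatible offsets.

```python
import re
from dataclasses import dataclass


@dataclass
class Pattern:
    """Hypothetical split of a RegEx of the form prefix[class]{n}suffix."""
    prefix: bytes      # simple-string component
    char_class: str    # character-class component, e.g. "a-z"
    repeat: int        # fixed repetition count n
    suffix: bytes      # simple-string component


def find_all(data: bytes, needle: bytes) -> list[int]:
    """Simple-string detector: every start offset of needle in data."""
    hits = []
    i = data.find(needle)
    while i != -1:
        hits.append(i)
        i = data.find(needle, i + 1)
    return hits


def detect(data: bytes, p: Pattern) -> list[int]:
    """Return start offsets where prefix, class run, and suffix line up."""
    # Detector for the class component, e.g. b"[a-z]{3}" (assumed shape).
    class_run = re.compile(f"[{p.char_class}]{{{p.repeat}}}".encode())
    suffix_hits = set(find_all(data, p.suffix))
    matches = []
    for start in find_all(data, p.prefix):
        gap_start = start + len(p.prefix)
        gap_end = gap_start + p.repeat
        # Out-of-sequence step: check the later simple string first.
        if gap_end not in suffix_hits:
            continue
        # Only then verify the class component that sits in between.
        if class_run.fullmatch(data, gap_start, gap_end):
            matches.append(start)
    return matches


if __name__ == "__main__":
    rule = Pattern(prefix=b"abc", char_class="a-z", repeat=3, suffix=b"xyz")
    print(detect(b"..abcdefxyz..abc123xyz", rule))
```

In this sketch no per-character automaton state is kept for the class component: it is checked only for the candidates that survive the two string lookups. The memory and throughput figures reported in the abstract refer to the authors' FPGA prototype, not to this illustration.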



    Published In

    ANCS '09: Proceedings of the 5th ACM/IEEE Symposium on Architectures for Networking and Communications Systems
    October 2009
    227 pages
    ISBN:9781605586304
    DOI:10.1145/1882486
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. FPGA
    2. LaFA
    3. deep packet inspection
    4. finite automation
    5. network intrusion detection system
    6. regular expressions

    Qualifiers

    • Research-article

    Conference

    ANCS '09
    Acceptance Rates

    Overall Acceptance Rate 88 of 314 submissions, 28%

    Cited By

    • (2019) An improved method in deep packet inspection based on regular expression. The Journal of Supercomputing 75:6 (3317-3333). DOI: 10.1007/s11227-018-2517-0. Online publication date: 1-Jun-2019
    • (2014) Towards Fast and Optimal Grouping of Regular Expressions via DFA Size Estimation. IEEE Journal on Selected Areas in Communications 32:10 (1797-1809). DOI: 10.1109/JSAC.2014.2358839. Online publication date: Oct-2014
    • (2013) Hardware-Accelerated Regular Expression Matching with Overlap Handling on IBM PowerEN™ Processor. Proceedings of the 2013 IEEE 27th International Symposium on Parallel and Distributed Processing (1254-1265). DOI: 10.1109/IPDPS.2013.54. Online publication date: 20-May-2013
    • (2012) Efficient regular expression pattern matching for network intrusion detection systems using modified word-based automata. Proceedings of the Fifth International Conference on Security of Information and Networks (103-110). DOI: 10.1145/2388576.2388590. Online publication date: 25-Oct-2012
    • (2012) Scalable lookahead regular expression detection system for deep packet inspection. IEEE/ACM Transactions on Networking 20:3 (699-714). DOI: 10.1109/TNET.2011.2181411. Online publication date: 1-Jun-2012
    • (2012) Managing DFA History with Queue for Deflation DFA. Journal of Network and Systems Management 20:2 (155-180). DOI: 10.1007/s10922-010-9179-4. Online publication date: 1-Jun-2012
    • (2010) Range hash for regular expression pre-filtering. Proceedings of the 6th ACM/IEEE Symposium on Architectures for Networking and Communications Systems (1-12). DOI: 10.1145/1872007.1872032. Online publication date: 25-Oct-2010
    • (2010) Hardware implementation for scalable lookahead Regular Expression detection. 2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW) (1-8). DOI: 10.1109/IPDPSW.2010.5470750. Online publication date: Apr-2010
