DOI: 10.1145/2396556.2396594
research-article

Fast submatch extraction using OBDDs

Published: 29 October 2012

Abstract

Network-based intrusion detection systems (NIDS) commonly use pattern languages to identify packets of interest. Similarly, security information and event management (SIEM) systems rely on pattern languages for real-time analysis of security alerts and event logs. Both NIDS and SIEM systems use pattern languages that extend regular expressions. One such extension, the submatch construct, extracts substrings from a string that matches a pattern. Existing solutions for submatch extraction are based on non-deterministic finite automata (NFAs) or recursive backtracking. NFA-based algorithms are slow, and recursive backtracking algorithms perform poorly on pathological inputs generated by algorithmic complexity attacks. We propose a new approach to submatch extraction that uses ordered binary decision diagrams (OBDDs) to represent and perform pattern matching. Our evaluation, using patterns from the Snort HTTP rule set and a commercial SIEM system, shows that our approach performs best when patterns are combined. In the best case, it is one to two orders of magnitude faster than RE2 and PCRE.
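
To make the submatch construct concrete, the following minimal sketch extracts named substrings from a matching string. It uses Python's standard `re` module, which is a backtracking engine, so it illustrates only what submatch extraction returns and not the OBDD-based approach proposed here. The request line, the pattern, and the function name `extract_submatches` are hypothetical and chosen purely for illustration.

```python
import re

# Hypothetical input line, not taken from the Snort or SIEM pattern sets.
LOG_LINE = "GET /index.html?user=alice HTTP/1.1"

# Each parenthesised group marks one submatch to extract.
REQUEST = re.compile(
    r"(?P<method>[A-Z]+) (?P<path>[^ ?]+)\?user=(?P<user>\w+) HTTP/(?P<version>\d\.\d)"
)


def extract_submatches(line):
    """Return a dict of named submatches, or None if the line does not match."""
    match = REQUEST.fullmatch(line)
    return match.groupdict() if match else None


print(extract_submatches(LOG_LINE))
# {'method': 'GET', 'path': '/index.html', 'user': 'alice', 'version': '1.1'}
```

A plain matcher would only report that the line matches; submatch extraction additionally reports which substring each group consumed.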




    Published In

    ANCS '12: Proceedings of the eighth ACM/IEEE symposium on Architectures for networking and communications systems
    October 2012
    270 pages
    ISBN:9781450316859
    DOI:10.1145/2396556
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. ordered binary decision diagram (OBDD)
    2. pattern matching
    3. regular expression
    4. submatch
    5. tagged-nfa

    Qualifiers

    • Research-article

    Conference

    ANCS '12

    Acceptance Rates

    Overall Acceptance Rate 88 of 314 submissions, 28%

    Cited By

    • (2016) Kleenex: compiling nondeterministic transducers to deterministic streaming transducers. ACM SIGPLAN Notices 51:1 (284-297). DOI: 10.1145/2914770.2837647. Online publication date: 11-Jan-2016
    • (2016) Kleenex: compiling nondeterministic transducers to deterministic streaming transducers. Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (284-297). DOI: 10.1145/2837614.2837647. Online publication date: 11-Jan-2016
    • (2015) A novel algorithm for pattern matching with back references. Proceedings of the 2015 IEEE 34th International Performance Computing and Communications Conference (IPCCC) (1-8). DOI: 10.1109/PCCC.2015.7410264. Online publication date: 14-Dec-2015
    • (2013) Hierarchical object log format for normalisation of security events. 2013 9th International Conference on Information Assurance and Security (IAS) (25-30). DOI: 10.1109/ISIAS.2013.6947748. Online publication date: Dec-2013
