research-article

Scalable FSM parallelization via path fusion and higher-order speculation

Authors: Amir Hossein Nodehi Sabet, Zhijia Zhao

ASPLOS '21: Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems

Pages 887 - 901

https://doi.org/10.1145/3445814.3446705

Published: 17 April 2021

Abstract

The finite-state machine (FSM) is a fundamental computation model used by many applications. However, FSM execution is known to be “embarrassingly sequential” because each transition depends on the state produced by the previous one. Existing solutions break these dependences with enumerative or speculative parallelization, but the efficiency of both schemes depends heavily on the properties of the FSM and its inputs. For FSMs and inputs with unfavorable properties, the former suffers from the overhead of maintaining multiple execution paths, while the latter is bottlenecked by serial reprocessing of the misspeculated cases. Either way, the scalability of FSM parallelization is seriously compromised.
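The two existing schemes named above can be illustrated with a minimal Python sketch. It is not taken from the paper: the three-state transition table, the fixed `guess` used as the speculated start state, and all function names are assumptions made for illustration. The thread pool only shows the structure of the computation; it is not meant to demonstrate real speedup.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical 3-state DFA over {"a", "b"}: state 2 is reached once "aa" is seen.
TABLE = {
    0: {"a": 1, "b": 0},
    1: {"a": 2, "b": 0},
    2: {"a": 2, "b": 2},
}

def run_sequential(text, start=0):
    """Reference execution: every transition depends on the previous state."""
    state = start
    for symbol in text:
        state = TABLE[state][symbol]
    return state

def split(text, num_chunks):
    size = max(1, -(-len(text) // num_chunks))  # ceiling division
    return [text[i:i + size] for i in range(0, len(text), size)]

def enumerate_chunk(chunk):
    """Enumerative scheme: keep one execution path per possible start state."""
    paths = list(TABLE)  # paths[s] is the current state of the path started in s
    for symbol in chunk:
        paths = [TABLE[s][symbol] for s in paths]
    return paths

def run_enumerative(text, num_chunks=4, start=0):
    chunks = split(text, num_chunks)
    with ThreadPoolExecutor() as pool:
        maps = list(pool.map(enumerate_chunk, chunks))
    state = start
    for mapping in maps:  # cheap serial pass that stitches the chunks together
        state = mapping[state]
    return state

def run_speculative(text, num_chunks=4, start=0, guess=0):
    """Speculative scheme: guess each chunk's start state, then validate serially."""
    chunks = split(text, num_chunks)
    starts = [start] + [guess] * (len(chunks) - 1)
    with ThreadPoolExecutor() as pool:
        ends = list(pool.map(run_sequential, chunks, starts))
    state = start
    for chunk, spec_start, spec_end in zip(chunks, starts, ends):
        if spec_start == state:
            state = spec_end  # speculation was correct
        else:
            state = run_sequential(chunk, state)  # misspeculation: serial reprocessing
    return state
```

In this sketch the enumerative cost grows with the number of paths kept per chunk, while the speculative cost grows with the number of chunks that must be re-run in the serial validation loop; these are the two overheads the abstract refers to.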

This work addresses the above scalability challenges with two novel techniques. First, for enumerative parallelization, it proposes path fusion. Inspired by the classic NFA-to-DFA conversion, path fusion maps a vector of states in the original FSM to a new (fused) state. In this way, it reduces multiple FSM execution paths to a single path, minimizing the overhead of path maintenance. Second, for speculative parallelization, this work introduces higher-order speculation to avoid serial reprocessing during validation. This is a generalized speculation model that allows speculated states to be validated speculatively. Finally, this work integrates the different FSM parallelization schemes into a framework, BoostFSM, which automatically selects the best scheme based on the relevant properties of the FSM. Evaluation using real-world FSMs with diverse characteristics shows that BoostFSM raises the average speedup on a 64-core machine to 25.8×, compared with 3.1× for the existing speculative scheme and 15.4× for the existing enumerative scheme.
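The idea of mapping a vector of states to one fused state can be sketched as follows. This is a minimal illustration under stated assumptions, not the construction used in the paper: the abstract does not say whether fused states are built ahead of time or on demand, so the lazy, memoized construction below, the `FusedFSM` class, and the example transition table are all hypothetical.

```python
# Hypothetical 3-state DFA over {"a", "b"}: state 2 is reached once "aa" is seen.
TABLE = {
    0: {"a": 1, "b": 0},
    1: {"a": 2, "b": 0},
    2: {"a": 2, "b": 2},
}

class FusedFSM:
    """Treats a whole vector of original states as one fused state (assumed design)."""

    def __init__(self, table):
        self.table = table
        self.ids = {}      # state vector -> fused state id
        self.vectors = []  # fused state id -> state vector
        self.trans = {}    # (fused state id, symbol) -> fused state id
        # The initial fused state holds one path per original state.
        self.start = self._intern(tuple(sorted(table)))

    def _intern(self, vector):
        if vector not in self.ids:
            self.ids[vector] = len(self.vectors)
            self.vectors.append(vector)
        return self.ids[vector]

    def step(self, fused_id, symbol):
        key = (fused_id, symbol)
        nxt = self.trans.get(key)
        if nxt is None:
            # First visit: advance every path once and remember the result.
            vector = self.vectors[fused_id]
            nxt = self._intern(tuple(self.table[s][symbol] for s in vector))
            self.trans[key] = nxt
        return nxt  # later visits cost a single lookup for all paths

    def run(self, chunk):
        fused_id = self.start
        for symbol in chunk:
            fused_id = self.step(fused_id, symbol)
        return self.vectors[fused_id]  # entry s is the end state for start state s
```

For example, `FusedFSM(TABLE).run("baa")` follows the fused states `(0, 1, 2)`, `(0, 0, 2)`, `(1, 1, 2)`, `(2, 2, 2)`, so a chunk that would otherwise need three separate paths is processed as a single path. The number of distinct vectors can grow large for some FSMs, which this sketch makes no attempt to bound.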

Index Terms

Scalable FSM parallelization via path fusion and higher-order speculation
1. Computing methodologies
  1. Parallel computing methodologies
    1. Parallel algorithms
2. Theory of computation
  1. Design and analysis of algorithms
    1. Parallel algorithms
  2. Formal languages and automata theory
    1. Regular languages

Published In

ASPLOS '21: Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems

April 2021

1090 pages

ISBN:9781450383172

DOI:10.1145/3445814

General Chair: Tim Sherwood (University of California at Santa Barbara, USA)

Program Chairs: Emery Berger (University of Massachusetts at Amherst, USA), Christos Kozyrakis (Stanford University, USA)

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGPLAN: ACM Special Interest Group on Programming Languages

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

National Science Foundation

Conference

ASPLOS '21

Sponsor:

SIGPLAN

ASPLOS '21: 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems

April 19 - 23, 2021

Virtual, USA

Acceptance Rates

Overall Acceptance Rate 535 of 2,713 submissions, 20%
