Succinct Determinisation of Counting Automata via Sphere Construction

  • Conference paper
Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 11893)

Abstract

We propose an efficient algorithm for determinising counting automata (CAs), i.e., finite automata extended with bounded counters. Unlike the naïve approach, the algorithm avoids unfolding counters into control states and thus produces much smaller deterministic automata. We also develop a simplified and faster version of the general algorithm for the sub-class of so-called monadic CAs (MCAs), i.e., CAs with counting loops on character classes, which are common in practice. Besides applications in verification and decision procedures of logics, our main motivation is the use of deterministic (M)CAs in pattern matching of regular expressions with counting, which are very common in, e.g., network traffic processing and log analysis. We have evaluated our algorithm against practical benchmarks from these application domains and concluded that, compared to the naïve approach, our algorithm is much less prone to state explosion, produces automata that can be several orders of magnitude smaller, and is faster overall.

This work has been supported by the Czech Science Foundation (project No. 19-24397S), the IT4Innovations Excellence in Science (project No. LQ1602), and the FIT BUT internal project FIT-S-17-4014.

Notes

  1. To handle large or infinite sets of symbols symbolically, the predicates \(\texttt{l} = a\) may be generalised to predicates from an arbitrary effective Boolean algebra, as in [6].

  2. A Boolean combination of atomic guards and updates can be factorised through (1) a transformation to DNF, yielding a set of clauses X; (2) writing each clause \(\varphi \in X\) as a conjunction of a guard formula \(g_\varphi\) and an assignment formula \(f_\varphi\); (3) computing minterms of the set \(\{g_\varphi \mid \varphi \in X\}\); (4) creating one factor \((g) \wedge (f)\) from every minterm g where f is the disjunction of all the assignment formulae \(f_\varphi\) with \(\varphi \in X\) compatible with g (i.e., such that \(g \wedge f_\varphi\) is satisfiable).

  3. We note that we only need to use a specialised, simple, and cheap quantifier elimination. In particular, we only need to eliminate counter variables c from formulae such that, in clauses of their DNF, c always appears together with a predicate \(c = p\) where p is a parameter. Eliminating c from such a DNF clause is then done by simply substituting occurrences of c by p. We do not need complex algorithms such as the general quantifier elimination for Presburger arithmetic.

  4. The choice of the parameters in the image of \(\theta_{\mathit{at}} : \mathit{at}(u_i) \rightarrow \mathcal{P}'\) on line 9 is arbitrary, although, in practice, it would be sensible to define some systematic parameter naming policy and reuse existing parameters whenever possible.

  5. For this step to preserve the language of the automaton, we need to assume that the input CA does not assign nondeterministic values to live counters. We are referring to the standard notion: a counter is live at a state if the value it holds at that state may influence satisfaction of some guard in the future. Any CA can be transformed into this form, and CAs we compile from regular expressions satisfy this condition by construction.

  6. We note that we restrict ourselves to range sub-expressions of the form \(\sigma\{n,n\}\) or \(\sigma\{0,n\}\) only. This is without loss of generality since a general range expression \(\sigma\{m,n\}\) can be rewritten as \(\sigma\{m,m\}.\sigma\{0,n-m\}\).

  7. Notice that the guards \(c_q < \mathbf{max}_{q}\) on the incrementing self-loops of exact counting states could be removed without affecting the language: once \(c_q\) exceeds \(\mathbf{max}_{q}\), the run can never leave q and thus has no chance of accepting. We include these guards only to conform to the condition on boundedness of counter values in the definition of CAs.

  8. Notice that maintaining a fixed association of a parameter to a counter is a difference from Algorithms 1 and 2, where one parameter may represent different counters.

  9. That this relation is indeed a simulation follows from the fact that both the higher and the lower value of \(c_q\) can use any exit transition of q at any moment regardless of the value of \(c_q\), but the lower value of \(c_q\) can stay in the counting loop longer.
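The rewriting in note 6 can be checked empirically on small cases. The following is a minimal sketch, not part of the paper: it uses Python's standard `re` module as a stand-in for the regular expressions with counting discussed here, and the helper names `general_range`, `split_range`, and `agree_up_to` are hypothetical. It compares \(\sigma\{m,n\}\) with \(\sigma\{m,m\}.\sigma\{0,n-m\}\) on all words up to a given length, which is evidence of equivalence rather than a proof.

```python
import re
from itertools import product


def general_range(sigma: str, m: int, n: int) -> str:
    """Build the pattern sigma{m,n} (hypothetical helper, not from the paper)."""
    return f"(?:{sigma}){{{m},{n}}}"


def split_range(sigma: str, m: int, n: int) -> str:
    """Rewrite sigma{m,n} as sigma{m,m} followed by sigma{0,n-m}."""
    if not 0 <= m <= n:
        raise ValueError("expected 0 <= m <= n")
    return f"(?:{sigma}){{{m},{m}}}(?:{sigma}){{0,{n - m}}}"


def agree_up_to(sigma: str, m: int, n: int,
                alphabet: str = "ab", max_len: int = 8) -> bool:
    """Check that both patterns accept the same words of length <= max_len."""
    original = re.compile(general_range(sigma, m, n))
    rewritten = re.compile(split_range(sigma, m, n))
    for length in range(max_len + 1):
        for letters in product(alphabet, repeat=length):
            word = "".join(letters)
            in_original = original.fullmatch(word) is not None
            in_rewritten = rewritten.fullmatch(word) is not None
            if in_original != in_rewritten:
                return False
    return True


if __name__ == "__main__":
    print(split_range("ab", 1, 3))
    print(agree_up_to("ab", 1, 3))
    print(agree_up_to("a|ab", 2, 4))
```

The bounded check is exhaustive only over the chosen alphabet and length bound; both are assumptions of this sketch and can be widened at the cost of running time.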
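The substitution-based quantifier elimination described in note 3 can be illustrated on a toy representation. This is a minimal sketch under strong assumptions that are not taken from the paper: a DNF clause is modelled as a list of atoms `(left, relation, right)` over plain variable names, the function name `eliminate_counter` is hypothetical, and the sketch does not verify that the term equated with the counter is really a parameter.

```python
from typing import List, Tuple

# An atom is (left term, relation symbol, right term); terms are plain names.
Atom = Tuple[str, str, str]


def eliminate_counter(clause: List[Atom], counter: str) -> List[Atom]:
    """Eliminate `counter` from a conjunction of atoms by substitution.

    Assumes the clause contains an equality `counter = p` (in either
    orientation); every occurrence of `counter` is then replaced by `p`.
    """
    param = None
    for left, rel, right in clause:
        if rel == "=" and left == counter and right != counter:
            param = right
            break
        if rel == "=" and right == counter and left != counter:
            param = left
            break
    if param is None:
        raise ValueError(f"no defining equality for {counter!r} in clause")

    result: List[Atom] = []
    for left, rel, right in clause:
        new_atom = (
            param if left == counter else left,
            rel,
            param if right == counter else right,
        )
        # The defining equality becomes the trivial atom p = p; drop it.
        if new_atom == (param, "=", param):
            continue
        result.append(new_atom)
    return result


if __name__ == "__main__":
    clause = [("c", "=", "p"), ("c", "<", "max"), ("d", "=", "c")]
    print(eliminate_counter(clause, "c"))
```

No arithmetic reasoning is involved: the step is a single pass over the clause, which is why the general quantifier elimination for Presburger arithmetic is not needed in this setting.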

References

  1. Abdulla, P.A., Krcal, P., Yi, W.: R-automata. In: van Breugel, F., Chechik, M. (eds.) CONCUR 2008. LNCS, vol. 5201, pp. 67–81. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-85361-9_9

  2. Bardin, S., Finkel, A., Leroux, J., Petrucci, L.: FAST: acceleration from theory to practice. STTT 10(5) (2008)

  3. Björklund, H., Martens, W., Timm, T.: Efficient incremental evaluation of succinct regular expressions. In: Proceedings of CIKM 2015, ACM (2015)

  4. Chen, H., Lu, P.: Checking determinism of regular expressions with counting. Inf. Comput. 241, 302–320 (2015)

  5. Cheng, K., Krishnakumar, A.S.: Automatic functional test generation using the extended finite state machine model. In: Proceedings of DAC 1993, ACM Press (1993)

  6. D’Antoni, L., Veanes, M.: Minimization of symbolic automata. In: Proceedings of POPL 2014, ACM (2014)

  7. Dill, D.L., Hu, A.J., Wong-Toi, H.: Checking for language inclusion using simulation preorders. In: Larsen, K.G., Skou, A. (eds.) CAV 1991. LNCS, vol. 575, pp. 255–265. Springer, Heidelberg (1992). https://doi.org/10.1007/3-540-55179-4_25

  8. Gelade, W., Martens, W., Neven, F.: Optimizing schema languages for XML: numerical constraints and interleaving. In: Schwentick, T., Suciu, D. (eds.) ICDT 2007. LNCS, vol. 4353, pp. 269–283. Springer, Heidelberg (2006). https://doi.org/10.1007/11965893_19

  9. Gelade, W., Gyssens, M., Martens, W.: Regular expressions with counting: weak versus strong determinism. In: Královič, R., Niwiński, D. (eds.) MFCS 2009. LNCS, vol. 5734, pp. 369–381. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-03816-7_32

  10. van Glabbeek, R., Ploeger, B.: Five determinisation algorithms. In: Ibarra, O.H., Ravikumar, B. (eds.) CIAA 2008. LNCS, vol. 5148, pp. 161–170. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-70844-5_17

  11. Heizmann, M., Hoenicke, J., Podelski, A.: Software model checking for people who love automata. In: Sharygina, N., Veith, H. (eds.) CAV 2013. LNCS, vol. 8044, pp. 36–52. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-39799-8_2

  12. Henriksen, J.G., et al.: Mona: monadic second-order logic in practice. In: Brinksma, E., Cleaveland, W.R., Larsen, K.G., Margaria, T., Steffen, B. (eds.) TACAS 1995. LNCS, vol. 1019, pp. 89–110. Springer, Heidelberg (1995). https://doi.org/10.1007/3-540-60630-0_5

  13. Hovland, D.: Regular expressions with numerical constraints and automata with counters. In: Leucker, M., Morgan, C. (eds.) ICTAC 2009. LNCS, vol. 5684, pp. 231–245. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-03466-4_15

  14. Hovland, D.: The membership problem for regular expressions with unordered concatenation and numerical constraints. In: Dediu, A.-H., Martín-Vide, C. (eds.) LATA 2012. LNCS, vol. 7183, pp. 313–324. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-28332-1_27

  15. Kilpeläinen, P., Tuhkanen, R.: One-unambiguity of regular expressions with numeric occurrence indicators. Inf. Comput. 205(6), 890–916 (2007)

  16. Lengál, O., Šimáček, J., Vojnar, T.: VATA: a library for efficient manipulation of non-deterministic tree automata. In: Flanagan, C., König, B. (eds.) TACAS 2012. LNCS, vol. 7214, pp. 79–94. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-28756-5_7

  17. Roesch, M., et al.: Snort: A Network Intrusion Detection and Prevention System. http://www.snort.org

  18. Microsoft Automata Library: Automata and Transducer Library for .NET. https://github.com/AutomataDotNet/Automata

  19. OWASP Foundation and Checkmarx: Regular Expression Denial of Service: ReDoS (2017)

  20. RegExLib.com: The Internet’s First Regular Expression Library. http://regexlib.com/

  21. Sommer, R., et al.: The Bro Network Security Monitor. http://www.bro.org

  22. Shiple, T.R., Kukula, J.H., Ranjan, R.K.: A comparison of Presburger engines for EFSM reachability. In: Hu, A.J., Vardi, M.Y. (eds.) CAV 1998. LNCS, vol. 1427, pp. 280–292. Springer, Heidelberg (1998). https://doi.org/10.1007/BFb0028752

  23. Smith, R., Estan, C., Jha, S.: XFA: faster signature matching with extended automata. In: Proceedings of SSP 2008, IEEE (2008)

  24. Smith, R., Estan, C., Jha, S., Siahaan, I.: Fast signature matching using extended finite automaton (XFA). In: Sekar, R., Pujari, A.K. (eds.) ICISS 2008. LNCS, vol. 5352, pp. 158–172. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-89862-7_15

  25. Sperberg-McQueen, M.: Notes on Finite State Automata with Counters. https://www.w3.org/XML/2004/05/msm-cfa.html. Accessed 08 Aug 2018

  26. The Sagan Team: The Sagan Log Analysis Engine. https://quadrantsec.com/sagan_log_analysis_engine/

  27. Thompson, K.: Programming techniques: regular expression search algorithm. Commun. ACM 11(6), 419–422 (1968)

  28. Češka, M., Havlena, V., Holík, L., Lengál, O., Vojnar, T.: Approximate reduction of finite automata for high-speed network intrusion detection. In: Beyer, D., Huisman, M. (eds.) TACAS 2018. LNCS, vol. 10806, pp. 155–175. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-89963-3_9

  29. Yang, L., Karim, R., Ganapathy, V., Smith, R.: Improving NFA-based signature matching using ordered binary decision diagrams. In: Jha, S., Sommer, R., Kreibich, C. (eds.) RAID 2010. LNCS, vol. 6307, pp. 58–78. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-15512-3_4

Corresponding author

Correspondence to Margus Veanes.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Holík, L., Lengál, O., Saarikivi, O., Turoňová, L., Veanes, M., Vojnar, T. (2019). Succinct Determinisation of Counting Automata via Sphere Construction. In: Lin, A. (ed.) Programming Languages and Systems. APLAS 2019. Lecture Notes in Computer Science, vol 11893. Springer, Cham. https://doi.org/10.1007/978-3-030-34175-6_24

  • DOI: https://doi.org/10.1007/978-3-030-34175-6_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-34174-9

  • Online ISBN: 978-3-030-34175-6

  • eBook Packages: Computer Science, Computer Science (R0)
