Querying streams using regular expressions: some semantics, decidability, and efficiency issues

  • Regular Paper
The VLDB Journal

Abstract

This paper analyzes the decidability and complexity problems that arise when matching regular expressions on infinite streams of sets of symbols. We show that in important application domains, several apparently obvious semantics lead to detecting spurious events (events that are mere artifacts of the semantics) or to missing events of potential interest. We single out a class of semantics, of interest in many applications, which we dub use-and-throw: In a use-and-throw semantics, an elementary event can participate in the creation of at most one detected complex event. Many areas of research have identified this as a desirable requirement (we give the examples of databases and video surveillance), but hitherto there has been no systematic study of the characteristics of these semantics, in particular their decidability and algorithmic complexity. This paper is meant to provide at least some initial answers on this subject. We analyze several semantics, provide polynomial algorithms for them, and prove their correctness and their properties.
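The use-and-throw requirement can be illustrated with a small sketch. It is not one of the paper's algorithms: it assumes a fixed sequence of symbols in place of a full regular expression, treats each stream position as one elementary event, and resolves conflicts with a greedy, earliest-first policy. The names `matches_at`, `all_matches`, and `use_and_throw_matches` are hypothetical and chosen only for this illustration.

```python
from typing import List, Set, Tuple

Stream = List[Set[str]]  # one set of symbols per time instant


def matches_at(stream: Stream, pattern: List[str], start: int) -> bool:
    """True if pattern[i] is among the symbols at position start + i, for every i."""
    if start + len(pattern) > len(stream):
        return False
    return all(sym in stream[start + i] for i, sym in enumerate(pattern))


def all_matches(stream: Stream, pattern: List[str]) -> List[Tuple[int, int]]:
    """Report every occurrence as (first, last) positions; occurrences may overlap."""
    if not pattern:
        raise ValueError("pattern must be non-empty")
    return [(s, s + len(pattern) - 1)
            for s in range(len(stream))
            if matches_at(stream, pattern, s)]


def use_and_throw_matches(stream: Stream, pattern: List[str]) -> List[Tuple[int, int]]:
    """Report occurrences so that each position contributes to at most one of them.

    Assumption: greedy, earliest-first selection. Other selection policies
    would also satisfy the use-and-throw constraint.
    """
    if not pattern:
        raise ValueError("pattern must be non-empty")
    found = []
    s = 0
    while s < len(stream):
        if matches_at(stream, pattern, s):
            found.append((s, s + len(pattern) - 1))
            s += len(pattern)  # the positions just used are discarded
        else:
            s += 1
    return found


stream = [{"a"}, {"b", "a"}, {"b"}, {"a"}, {"b"}]
pattern = ["a", "b"]
print(all_matches(stream, pattern))
print(use_and_throw_matches(stream, pattern))
```

On this stream, the unrestricted detector reports the occurrences at positions (0, 1), (1, 2), and (3, 4), so position 1 takes part in two detected events. The use-and-throw detector reports only (0, 1) and (3, 4).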

Notes

  1. A definition of the signature of an event will be given shortly; for the sake of this example, the intuitive notion of signature as the marker of an event “type” will suffice.

  2. The intersection is defined stream- and expression-wise in the obvious way.

  3. In this section, “semantics” will always signify “use-and-throw semantics”.

  4. We shall use syntactic sugar as necessary. So, we define \(a\ge {b}\) as \((a>b\vee {a}=b)\), \(\forall x.\psi \) as \(\lnot \exists {x}.\lnot \psi \), etc.

  5. We shall see in the next section examples in which the predicate P contains a term of the form \(\pi \diamond \pi '\ne \emptyset \). In this case, we are not interested in checking paths such that \({\mathtt{str}}({\pi '})>{\mathtt{end}}({\pi })\).

  6. The expression has been slightly modified with respect to the original in order to simplify the presentation without cluttering it with superfluous details, cf. [53].

References

  1. Agrawal, J., Diao, Y., Gyllstrom, D., Immerman, N.: Efficient pattern matching over event streams. In: ACM SIGMOD International Conference on Management of Data, Vancouver, Canada, pp. 147–159 (2008)

  2. Aho, A.V.: Algorithms for finding patterns in strings. In: van Leeuwen, J. (ed.) Handbook of Theoretical Computer Science, Vol. A: Algorithms and Complexity. Elsevier and MIT Press, Amsterdam (1990)

  3. Anicic, D., Fodor, P., Rudolph, S., Stojanovic, N.: EP-SPARQL: a unified language for event processing and stream reasoning. In: Proceedings of the International World Wide Web Conference (2011)

  4. Arasu, A., Babu, S., Widom, J.: CQL: a language for continuous queries over streams and relations. In: Proceedings of DBPL, pp. 1–19 (2004)

  5. Babu, S., Widom, J.: Continuous queries over data streams. ACM SIGMOD Record 30(3), 109–120 (2001)

  6. Bancilhon, F., Ramakrishnan, R.: An amateur’s introduction to recursive query processing strategies. In: ACM SIGMOD International Conference on Management of Data, pp. 16–52 (1986)

  7. Barbieri, D.F., Braga, D., Ceri, S., Della Valle, E., Grossniklaus, M.: C-SPARQL: a continuous query language for RDF data streams. Int. J. Semant. Comput. 4(1), 3–25 (2010)

  8. Berry, G., Sethi, R.: From regular expressions to deterministic automata. Theoret. Comput. Sci. 48(1), 117–126 (1986)

  9. Bhonsle, S., Gupta, A., Santini, S., Worring, M., Jain, R.: Complex visual activity recognition using a temporally ordered database. In: Huijsmans, D.P., Smeulders, A.W. (eds.) Proceedings of the 3rd International Conference on Visual Computing, VISUAL ’99, Lecture Notes in Computer Science, vol. 1614, pp. 719–726. Springer, Berlin (1999)

  10. Brookshear, J.G.: Theory of Computation: Formal Languages, Automata, and Complexity. Addison Wesley, Reading (1989)

  11. Büchi, J.: On a decision method in restricted second order arithmetic. Zeitschrift Mathematik und Logik Grundlag 6, 66–92 (1960)

  12. Büchi, J.R., Landweber, L.: Solving sequential conditions by finite-state strategies. Trans. Am. Math. Soc. 138, 367–378 (1963)

  13. Câmpeanu, C., Salomaa, K., Yu, S.: A formal study of practical regular expressions. Int. J. Found. Comput. Sci. 14(6), 1007–1018 (2003)

  14. Cate, B.T.: The expressivity of Xpath with transitive closure. In: Proceedings of the SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, pp. 328–337. ACM Press (2006)

  15. Chakravarthy, S., Krishnaprasad, V., Anwar, E., Kim, S.K.: Composite events for active databases: semantics, contexts and detection. In: Proceedings of the 20th International Conference on Very Large Data Bases, pp. 606–617 (1994)

  16. Chakravarthy, S., Mishra, D.: Snoop: an expressive event specification language for active databases. Tech. Rep. UF-CIS-TR-93-007, University of Florida at Gainesville (1993)

  17. Chandrasekaran, S., Cooper, O., Deshpande, A., Franklin, M.J., Hellerstein, J.M., Hong, W., Krishnamurthy, S., Madden, S., Raman, V., Reiss, F., Shah, M.: TelegraphCQ: continuous dataflow processing for an uncertain world. In: Proceedings of the Conference on Innovative Data Systems Research (2003)

  18. Church, A.: Applications of recursive arithmetic to the problem of circuit synthesis. In: Summaries of the summer institute of symbolic logic, pp. 3–50 (1957)

  19. Del Bimbo, A., Vicario, E., Zingoni, D.: Symbolic description and visual querying of image sequences using spatio-temporal logic. IEEE Trans. Knowl. Data Eng. 7(4), 609–622 (1994)

  20. Denecker, M., Missiaen, L., Bruynooghe, M.: Temporal reasoning with abductive event calculus. In: Proceedings of the European Conference on Artificial Intelligence, pp. 384–388. Wiley, New York (1992)

  21. Emerson, E.A., Jutla, C.S.: Tree automata, mu-calculus, and indeterminacy. In: 32nd Annual Symposium of Foundations of Computer Science (1991)

  22. Gatziu, S., Dittrich, K.: Detecting composite events in active databases using Petri nets. In: Proceedings of the 4th International Workshop on research Issues in data Engineering: active Database Systems, pp. 2–9 (1994)

  23. Gehani, N., Jagadish, H., Shmueli, O.: Compose: a system for composite event specification and detection. In: Adam, N., Bhargava, B. (eds.) Lecture Notes in Computer Science, vol. 759. Springer, Berlin (1994)

  24. Golab, L., Özsu, T.: Issues in data stream management. SIGMOD Record 32(2), 5–14 (2003)

  25. Green, T.J., Miklau, G., Onizuka, M., Suciu, D.: Processing XML streams with deterministic automata. In: Proceedings of ICDT, pp. 173–189 (2003)

  26. Gyllstrom, D., Agrawal, J., Diao, Y., Immerman, N.: On supporting Kleene closure over event streams. In: Proceedings of ICDE, the International Conference on Data Engineering (2008)

  27. Hakeem, A., Shah, M.: Learning, detection and representation of multi-agent events in videos. Artif. Intell. 171, 586–605 (2007)

  28. Hellerstein, J.: Optimization techniques for queries with expensive methods. ACM Trans. Database Syst. 23(2), 113–157 (1998)

  29. Hjelsvold, R., Midtstraum, R.: Modelling and querying video data. In: Proceedings of the 20th VLDB Conference, pp. 686–694 (1994)

  30. Hu, Y., Cao, L., Lv, F., Yan, S., Gong, Y., Huang, T.S.: Action detection in complex scenes with spatial and temporal ambiguities. In: Proceedings of the International Conference on Computer Vision (ICCV) (2009)

  31. Kamimura, T., Slutzki, G.: Parallel and two-way automata on directed ordered acyclic graphs. Inf. Control 49(1), 10–51 (1981)

  32. Kim, M., Viswanathan, M., Ben-Abdallah, H., Kannan, S., Lee, I., Sokolsky, O.: Formally specified monitoring of temporal properties. In: Proceedings of the European Conference on Real-Time Systems–ECRTS, pp. 114–121 (1999)

  33. Koudas, N., Srivastava, D.: Data stream query processing. In: Proceedings of the 21st International Conference on Data Engineering (2005)

  34. Koymans, R.: Specifying real-time properties with metric temporal logic. Real Time Syst. 2(4), 255–299 (1990)

  35. Larsen, K.S.: Regular expressions with nested levels of back referencing form a hierarchy. Inf. Process. Lett. 65, 169–172 (1998)

  36. Le Phuoc, D., Dao-Tran, M., Parreira, J.X., Hauswirth, M.: A native and adaptive approach for unified processing of linked streams and linked data. In: Proceedings of the International Semantic Web Conference, pp. 370–388 (2011)

  37. Levesque, H., Pirri, F., Reiter, R.: Foundations for the situation calculus. Linköping Electron. Artic. Comput. Inf. Sci. 3(18), 41 (2010) (technical report)

  38. Libkin, L., Wong, L.: Unary quantifiers, transitive closure, and relations of large degree. In: Proceedings of the 15th Annual Symposium on Theoretical Aspects of Computer Science (STACS), pp. 183–193 (1998)

  39. Lieuwen, D.F., Gehani, N., Arlein, R.: The ODE active database: trigger semantics and implementation. In: Proceedings of ICDE, the International Conference on Data Engineering, pp. 412–420 (1996)

  40. Mendelzon, A.O., Wood, P.T.: Finding regular simple paths in graph databases. SIAM J. Comput. 24(6), 1235–1258 (1995)

  41. Merler, M., Huang, B., Xie, L., Hua, G., Natsev, A.: Semantic model vectors for complex video event recognition. IEEE Trans. Multimed. 14(1), 101–188 (2011)

  42. Mishra, D.: Snoop: an event specification language for active databases. Master’s thesis, Database system R&D Center, CIS Department, University of Florida at Gainesville (1991)

  43. Motakis, I., Zaniolo, C.: Temporal aggregation in active database rules. In: ACM SIGMOD International Conference on Management of Data, Tucson, AZ, USA 13–15 May, pp. 440–451 (1997)

  44. Mukund, M.: Finite-state automata on infinite inputs. Internal Report TCS-96-2, SPIC Mathematical Institute (1996)

  45. Muller, D.E.: Infinite sequences on finite machines. In: Proceedings of the 4th IEEE Symposium on Switching Circuit Theory and Logical Design, pp. 3–16 (1963)

  46. Neven, F.: Automata, logic, and XML. In: Computer Science Logic: 16th International Workshop, CSL 2002, 11th Annual Conference of the EACSL, Lecture notes in computer science, Vol. 2471. Springer, Heidelberg (2002)

  47. Neven, F., Schwentick, T.: On the power of tree-walking automata. Inf. Comput. 183(1), 547–560 (2003)

  48. Poppe, R.: A survey on vision-based human action recognition. Image Vis. Comput. 28(6), 976–990 (2010)

  49. Prabhakar, K., Oh, S., Wang, P., Abowd, G.D., Rehg, J.M.: Temporal causality for the analysis of visual events. In: Proceedings of the International Conference on Computer Vision and Pattern Recognition (2010)

  50. Rabin, M.: Decidability of second order theories and automata on infinite trees. Trans. Am. Math. Soc. 141, 1–37 (1969)

  51. Rabin, M.: Automata on Infinite Objects and Church’s Problem. American Mathematical Society, Providence (1972)

  52. Roşu, G., Viswanathan, M.: Testing extended regular language membership incrementally by rewriting. In: van Oostrom, V. (ed.) Rewriting Techniques and Applications. Springer, Berlin (2003)

  53. Sammapun, U., Easwaran, A., Lee, I., Sokolsky, O.: Simulation of simultaneous events in regular expressions for run-time verification. Electron. Notes Theoret. Comput. Sci. 113, 123–143 (2005)

  54. Santini, S.: On the semantics of complex events in video. In: Proceedings of ACM Multimedia, Events in Multimedia Workshop (2010)

  55. Santini, S.: Regular languages with variables on graphs. Inf. Comput. 211, 1–28 (2012)

  56. Santini, S.: Regular queries in event systems with bounded uncertainty. In: Proceedings of the XII Spanish Conference on Programming and Computer Languages (2012)

  57. Sen, K., Roşu, G.: Generating optimal monitors for extended regular expressions. Electron. Notes Theoret. Comput. Sci. 89(2), 226–245 (2003)

  58. Sheng, L., Özsoyoǧlu, Z.M., Özsoyoǧlu, G.: A graph query language and its query processing. In: Proceedings of the 15th International Conference on Data Engineering (1999)

  59. Srivastava, U., Munagala, K., Widom, J.: Operator placement for in-network stream query processing. In: Proceedings of the twenty-fourth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (2005)

  60. Streett, R.: Propositional dynamic logic of looping and converse is elementarily decidable. Inf. Control 48, 261–283 (1988)

  61. Thomas, W.: Automata on infinite objects. In: van Leeuwen, J. (ed.) Handbook of Theoretical Computer Science, pp. 135–186. Elsevier, Amsterdam (1990)

  62. Thompson, K.: Regular expressions search algorithm. Commun. ACM 11(6), 419–422 (1968)

  63. Tran, D., Yuan, J.: Optimal spatio-temporal path discovery for video event detection. In: Proceedings of the International Conference on Computer Vision and Pattern Recognition (CVPR) (2011)

  64. Tremblay, J.P., Manohar, R.: Discrete Mathematical Structures with Applications to Computer Science. McGraw-Hill, New York (1975)

  65. Zimmer, D., Unland, R.: On the semantics of complex events in active database management. In: Proceedings of ICDE, the International Conference on Data Engineering, pp. 392–399 (1999)

Author information

Corresponding author

Correspondence to Simone Santini.

Additional information

This work was supported in part by the Ministerio de Educación y Ciencia under the Grant No. TIN2013-47090-C3-2, VoxPopuli, Efficient reputation analysis, propagation and recommendation in social network environments.

About this article

Cite this article

Santini, S. Querying streams using regular expressions: some semantics, decidability, and efficiency issues. The VLDB Journal 24, 801–821 (2015). https://doi.org/10.1007/s00778-015-0402-5

  • DOI: https://doi.org/10.1007/s00778-015-0402-5
