Smart scheme: an efficient query execution scheme for event-driven stream processing

  • Regular Paper
Knowledge and Information Systems

Abstract

With the growth of stream data, the demand for stream processing has become diverse and complicated. To meet this demand, several stream processing engines (SPEs) have been developed, which execute continuous queries (CQs) over continuous data streams. Event-driven stream processing, one of the important requirements, continuously receives incoming stream data but generates query results only when specified events occur. In the basic query execution scheme, query operators keep processing input stream tuples even when no event is raised, although they generate no query result. This increases system load and wastes system resources. To address this problem, we propose a smart event-driven stream processing scheme, which uses smart windows to buffer stream tuples while no event is raised. When an event is raised, the buffered tuples are flushed and processed by the downstream operators. If buffered tuples expire, as determined by the window size, before an event occurs, they are deleted directly from the smart window. Since CQs, once registered, are executed for several weeks, months or even years, SPEs usually execute several CQs in parallel and merge their query plans whenever possible to save processing cost. Because of the smart window, existing multi-query optimization techniques do not work for smart event-driven stream processing. Hence, this work also proposes a multi-query optimization technique for the smart scheme to cover cases where multiple continuous queries are registered. Extensive experiments on real and synthetic data streams show the effectiveness of the proposed smart scheme and its multi-query optimization.
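The buffering behaviour described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name SmartWindow, the methods insert and on_event, the time-based window of integer timestamps, and the choice to empty the buffer after a flush are all assumptions made for illustration.

```python
from collections import deque
from typing import Any, Callable, Deque, List, Tuple

Tuple_ = Tuple[int, Any]  # (timestamp, payload); a hypothetical tuple layout


class SmartWindow:
    """Hypothetical smart window: buffers tuples while no event is raised."""

    def __init__(self, size: int, downstream: Callable[[List[Tuple_]], None]) -> None:
        self.size = size              # window size in time units (assumed time-based)
        self.downstream = downstream  # stand-in for the downstream query operators
        self.buffer: Deque[Tuple_] = deque()

    def _expire(self, now: int) -> None:
        # Tuples older than the window are deleted directly, without processing.
        while self.buffer and self.buffer[0][0] <= now - self.size:
            self.buffer.popleft()

    def insert(self, timestamp: int, value: Any) -> None:
        # No event: only buffer the tuple, do not invoke downstream operators.
        self._expire(timestamp)
        self.buffer.append((timestamp, value))

    def on_event(self, now: int) -> None:
        # Event raised: flush the still-valid tuples to the downstream operators.
        self._expire(now)
        if self.buffer:
            self.downstream(list(self.buffer))
            self.buffer.clear()  # assumption: flushed tuples leave the buffer


processed: List[List[Tuple_]] = []
window = SmartWindow(size=3, downstream=processed.append)
for t in range(1, 7):
    window.insert(t, f"tuple-{t}")   # tuples 1..3 expire before any event occurs
window.on_event(now=6)               # only tuples 4..6 reach the downstream stage
```

In this sketch the downstream operators run once, on three tuples, instead of six times on every arrival, which is the saving the abstract attributes to the smart scheme.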

Notes

  1. https://zephoria.com/—accessed 01/21/2017.

  2. http://www.internetlivestats.com/—accessed 01/21/2017.

  3. http://www.internetlivestats.com/—accessed 01/21/2017.

  4. For simplicity, we assume that a new tuple arrives at every time instant t.

  5. Tsukuba mobility data stream is provided by Tsukuba city, National Institute for Land and Infrastructure Management and University of Tsukuba.

Acknowledgements

This research was partly supported by the program “Research and Development on Real World Big Data Integration and Analysis” of the Ministry of Education, Culture, Sports, Science and Technology (MEXT) and RIKEN, Japan.

Author information

Corresponding author

Correspondence to Salman Ahmed Shaikh.

About this article

Cite this article

Shaikh, S.A., Watanabe, Y., Wang, Y. et al. Smart scheme: an efficient query execution scheme for event-driven stream processing. Knowl Inf Syst 58, 341–370 (2019). https://doi.org/10.1007/s10115-018-1195-9

  • DOI: https://doi.org/10.1007/s10115-018-1195-9
