A physical operator algebra for prioritized elements in data streams

  • Regular Paper
Computer Science - Research and Development

Abstract

Data stream management systems are a natural choice for efficiently processing continuous queries over high-volume data streams, e.g., to monitor sensor data or transaction streams. An immediate reaction to detected critical or security-relevant situations is essential for secure and economic operation, as in our scenario of monitoring decentralized energy systems, which realize geographically distributed energy generation processes. Without further provisions, existing processing approaches may delay critical or security-relevant messages in high-load situations, e.g., those caused by bursts.

One way to allow adequate processing in such situations is to prioritize queries that handle critical situations. Unfortunately, problems cannot always be identified by a query alone. Sometimes certain data values (e.g., out-of-range readings) or error messages indicate situations that call for faster processing of all queries that process these data. Traditional approaches to continuous query execution assume a stream order, typically based on timestamps, and process elements in this order. In this article we consider the prioritization of such elements and propose an out-of-order execution in the data stream.

We provide a comprehensive and formally founded approach for prioritizing data stream elements. Prioritized elements benefit twice from our approach. On the one hand, they are able to “overtake” lower-prioritized elements, e.g., in queues. On the other hand, prioritized results can be produced earlier in stateful operators than would be possible in other approaches. Still, the semantics of the queries remain unchanged. We implemented our approach and show with measurements that a very low latency for prioritized elements can be achieved, even under high load. As a result, all queries that process prioritized elements can benefit from our approach.
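The "overtaking" idea for queues can be illustrated with a minimal Python sketch. It is not the operator algebra from the article: the class name `PriorityBuffer`, the integer `priority` field, and the sample elements are assumptions made purely for illustration. The sketch shows an inter-operator buffer in which a higher-priority element leaves before lower-priority ones, while elements of equal priority keep their timestamp order.

```python
import heapq
import itertools
from dataclasses import dataclass, field
from typing import Any


@dataclass(order=True)
class _Entry:
    # Only sort_key takes part in comparisons; the payload is ignored.
    sort_key: tuple
    element: Any = field(compare=False)


class PriorityBuffer:
    """Hypothetical inter-operator buffer (illustrative, not from the article)."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker for equal keys

    def push(self, element, timestamp, priority=0):
        # Higher priority leaves first; within one priority level the
        # timestamp order (and then arrival order) is preserved.
        key = (-priority, timestamp, next(self._arrival))
        heapq.heappush(self._heap, _Entry(key, element))

    def pop(self):
        return heapq.heappop(self._heap).element

    def __len__(self):
        return len(self._heap)


def drain(buffer):
    """Remove and return all buffered elements in processing order."""
    out = []
    while len(buffer):
        out.append(buffer.pop())
    return out


buf = PriorityBuffer()
buf.push("reading t=1", timestamp=1)
buf.push("reading t=2", timestamp=2)
buf.push("ALARM t=3", timestamp=3, priority=1)  # assumed critical element
buf.push("reading t=4", timestamp=4)

order = drain(buf)
print(order)
# ['ALARM t=3', 'reading t=1', 'reading t=2', 'reading t=4']
```

In this sketch the alarm element overtakes the two earlier readings that are still waiting in the buffer. How stateful operators can emit prioritized results early without changing query semantics is the subject of the article itself and is not covered here.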

Author information

Corresponding author

Correspondence to Daniela Nicklas.

About this article

Cite this article

Jacobi, J., Bolles, A., Grawunder, M. et al. A physical operator algebra for prioritized elements in data streams. Comput Sci Res Dev 25, 235–246 (2010). https://doi.org/10.1007/s00450-009-0102-8

  • DOI: https://doi.org/10.1007/s00450-009-0102-8
