Abstract
One of the most important uses of aggregate queries over data streams is sampling. Typically, aggregation is performed over sliding windows, and queries return new results whenever the window contents change, a concept referred to as a continuous query. Existing data models and query languages for streams cannot express many practical user-defined sampling schemes over streams. To this end, we propose a new data stream model, referred to as the sequence model, and a query language for specifying aggregate queries over data streams. We show that the sequence model can readily express a superset of the aggregate queries expressible in the previously proposed time-based data stream model, thus providing declarative and formal semantics for understanding and reasoning about continuous aggregate queries. Defined on top of the sequence model, our query language supports existing sliding window operators and a novel frequency operator. The frequency operator makes it possible to express useful sampling queries, such as queries with user-defined group-based sampling and nested aggregation over either the input stream or the result stream. Such capabilities are beyond those of previously proposed query languages over streams. Finally, we conduct a preliminary experimental study that shows our language is effective and efficient in practice.
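To make the idea of a sliding-window aggregate with a result frequency concrete, the following is a minimal Python sketch. It is not the paper's query language or its definition of the frequency operator, which the abstract does not spell out. It assumes a count-based window over a sequence of numeric tuples and reads "frequency" as "emit one result every k arrivals". The names windowed_aggregate, window_size, frequency, and mean are hypothetical and chosen for illustration only.

```python
from collections import deque
from typing import Callable, Iterable, Iterator, Sequence


def windowed_aggregate(
    stream: Iterable[float],
    window_size: int,
    frequency: int,
    aggregate: Callable[[Sequence[float]], float],
) -> Iterator[float]:
    """Yield aggregate(window) after every `frequency`-th arrival.

    Assumption: this is one hypothetical reading of a frequency-style
    operator, not the definition used in the paper.
    """
    if window_size <= 0 or frequency <= 0:
        raise ValueError("window_size and frequency must be positive")
    # A bounded deque evicts the oldest element once the window is full.
    window = deque(maxlen=window_size)
    for position, value in enumerate(stream, start=1):
        window.append(value)
        # Emit a result only at positions that are multiples of `frequency`.
        if position % frequency == 0:
            yield aggregate(list(window))


def mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


readings = [10.0, 12.0, 11.0, 15.0, 14.0, 13.0]

# Classic continuous query: a new result for every arriving tuple.
every_tuple = list(windowed_aggregate(readings, 3, 1, mean))

# Sampled query: the same window, but a result only every third tuple.
every_third = list(windowed_aggregate(readings, 3, 3, mean))

# Nested aggregation over the result stream: max of the sampled means.
nested = list(windowed_aggregate(every_third, 2, 2, max))
```

With frequency set to 1 the sketch behaves like an ordinary continuous sliding-window query, while larger values sample the result stream. Feeding one query's output into another, as in the last line, is one simple way to picture nested aggregation over the result stream.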
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ma, L., Viglas, S.D., Li, M., Li, Q. (2005). Stream Operators for Querying Data Streams. In: Fan, W., Wu, Z., Yang, J. (eds) Advances in Web-Age Information Management. WAIM 2005. Lecture Notes in Computer Science, vol 3739. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11563952_36
DOI: https://doi.org/10.1007/11563952_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29227-2
Online ISBN: 978-3-540-32087-6
eBook Packages: Computer Science, Computer Science (R0)