Incremental Aggregation on Multiple Continuous Queries

Jin, Chun; Carbonell, Jaime

doi:10.1007/11875604_20

Chun Jin & Jaime Carbonell

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4203)

Abstract

Continuously monitoring large-scale aggregates over data streams is important for many stream processing applications, such as collaborative intelligence analysis, and presents new challenges to data management systems. The first challenge is to efficiently generate updated aggregate values and deliver the new results to users after new tuples arrive. We implemented an incremental aggregation mechanism that does so for arbitrary algebraic aggregate functions, including user-defined ones, by keeping up-to-date finite data summaries. The second challenge is to construct shared query evaluation plans that support large-scale queries effectively. Since multiple-query optimization is NP-complete and queries generally arrive asynchronously, we apply an incremental sharing approach to obtain shared plans that perform reasonably well. The system is built as part of ARGUS, a stream processing system that runs atop a DBMS. The evaluation study shows that our approaches are effective and efficient on typical collaborative intelligence analysis data and queries.
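
To make the idea of incremental aggregation over finite data summaries concrete, the following is a minimal sketch and not the ARGUS implementation. It assumes AVG as the algebraic aggregate, whose result can be derived from a fixed-size summary (a running count and a running sum), so each new tuple is folded in without rescanning earlier tuples. The class names `AvgSummary` and `IncrementalAvg` and the group key `"bank_a"` are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class AvgSummary:
    """Fixed-size summary for AVG: a running count and a running sum."""
    count: int = 0
    total: float = 0.0

    def update(self, value: float) -> None:
        # Fold one new tuple into the summary in constant time.
        self.count += 1
        self.total += value

    def result(self) -> float:
        # Derive the aggregate value from the summary alone.
        return self.total / self.count


class IncrementalAvg:
    """Maintains AVG(value) per group key as tuples arrive (hypothetical helper)."""

    def __init__(self) -> None:
        # One summary per group; created on first use.
        self._groups = defaultdict(AvgSummary)

    def insert(self, key: str, value: float) -> float:
        # Update only the affected group's summary and return its new average.
        summary = self._groups[key]
        summary.update(value)
        return summary.result()


agg = IncrementalAvg()
agg.insert("bank_a", 100.0)
agg.insert("bank_a", 300.0)
latest = agg.insert("bank_a", 200.0)
print(latest)
```

Under the same assumption, a user-defined algebraic aggregate would supply its own summary type with `update` and `result` methods; holistic aggregates such as exact medians do not fit this pattern because they have no fixed-size summary.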

Author information

Authors and Affiliations

Language Technologies Institute, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, 15213, USA
Chun Jin & Jaime Carbonell

Editor information

Editors and Affiliations

Dipartimento di Informatica, Università degli Studi di Bari
Floriana Esposito
Department of Computer Science, University of North Carolina, Charlotte, NC 28223, USA
Zbigniew W. Raś
Dipartimento di Informatica, Università degli Studi di Bari, via Orabona, 4, 70126, Bari, Italy
Donato Malerba
Dipartimento di Informatica, Università di Bari, Via E. Orabona, 4, 70125, Bari, Italia
Giovanni Semeraro

Cite this paper

Jin, C., Carbonell, J. (2006). Incremental Aggregation on Multiple Continuous Queries. In: Esposito, F., Raś, Z.W., Malerba, D., Semeraro, G. (eds) Foundations of Intelligent Systems. ISMIS 2006. Lecture Notes in Computer Science, vol 4203. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11875604_20

DOI: https://doi.org/10.1007/11875604_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45764-0
Online ISBN: 978-3-540-45766-4
eBook Packages: Computer Science, Computer Science (R0)