Abstract
Data streams are all around us and can be harnessed for tremendous business and personal advantage. For an enterprise-level stream processing system such as CHAOS [1] (Continuous, Heterogeneous Analytic Over Streams), handling complex query plans under resource constraints is challenging. While several scheduling strategies exist for stream processing, efficient scheduling of complex directed acyclic graph (DAG) query plans is still largely unsolved. In this paper, we propose a novel execution scheme for scheduling complex DAG query plans with meta-data enriched stream tuples. Our solution, called Virtual Pipelined Chain (or VPipe Chain for short), effectively extends the “Chain” pipelining scheduling approach to complex DAG query plans.
References
Gupta, C., Wang, S., Ari, I., Hao, M., Dayal, U., Mehta, A., Marwah, M., Sharma, R.: Chaos: A data stream analysis architecture for enterprise applications. In: CEC ’09 (2009) (to appear)
Motwani, R., Widom, J., Arasu, A., Babcock, B., Babu, S., Datar, M., Manku, G., Olston, C., Rosenstein, J., Varma, R.: Query processing, resource management, and approximation in a data stream management system. In: Proceedings of the First Biennial Conference on Innovative Data Systems Research (CIDR 2003), pp. 245–256 (2003)
Chandrasekaran, S., Cooper, O., Deshpande, A., Franklin, M., Hellerstein, J., Hong, W., Krishnamurthy, S., Madden, S., Raman, V., Reiss, F., Shah, M.: TelegraphCQ: Continuous Dataflow Processing for an Uncertain World. In: CIDR, pp. 269–280 (2003)
Rundensteiner, E.A., Ding, L., Sutherland, T., Zhu, Y., Pielech, B., Mehta, N.: CAPE: Continuous Query Engine with Heterogeneous-Grained Adaptivity. In: VLDB Demo, pp. 1353–1356 (2004)
Abadi, D., Carney, D., Cetintemel, U., Cherniack, M., Convey, C., Lee, S., Stonebraker, M., Tatbul, N., Zdonik, S.: Aurora: A New Model and Architecture for Data Stream Management. VLDB Journal, 120–139 (2003)
Hammad, M.A., Mokbel, M.F., Ali, M.H., Aref, W.G., et al.: Nile: A Query Processing Engine for Data Streams. In: ICDE, p. 851 (2004)
Han, J., Chen, Y., Dong, G., Pei, J., Wah, B.W., Wang, J., Cai, Y.D.: Stream cube: An architecture for multi-dimensional analysis of data streams. Distrib. Parallel Databases 18(2), 173–197 (2005)
Yin, X., Pedersen, T.B.: What can hierarchies do for data streams? In: Bussler, C.J., Castellanos, M., Dayal, U., Navathe, S. (eds.) BIRTE 2006. LNCS, vol. 4365, pp. 4–19. Springer, Heidelberg (2007)
Lo, E., Kao, B., Ho, W.S., Lee, S.D., Chui, C.K., Cheung, D.W.: OLAP on sequence data. In: SIGMOD, pp. 649–660 (2008)
Gedik, B., Andrade, H., Wu, K.L., Yu, P.S., Doo, M.: SPADE: the System S declarative stream processing engine. In: SIGMOD Conference, pp. 1123–1134 (2008)
Urhan, T., Franklin, M.J.: Dynamic pipeline scheduling for improving interactive query performance. In: VLDB, pp. 501–510 (September 2001)
Carney, D., Çetintemel, U., Rasin, A., Zdonik, S.B., Cherniack, M., Stonebraker, M.: Operator scheduling in a data stream manager. In: VLDB, pp. 838–849 (2003)
Babcock, B., Babu, S., Motwani, R., Datar, M.: Chain: operator scheduling for memory minimization in data stream systems. In: ACM SIGMOD, pp. 253–264 (2003)
Pielech, T.S.B., Rundensteiner, E.A.: An adaptive multi-objective scheduling selection framework for continuous query processing. In: IDEAS, pp. 445–454 (July 2005)
Sharaf, M.A., Chrysanthis, P.K., Labrinidis, A., Pruhs, K.: Efficient scheduling of heterogeneous continuous queries. In: VLDB, pp. 511–522 (2006)
Bai, Y., Zaniolo, C.: Minimizing latency and memory in DSMS: a unified approach to quasi-optimal scheduling. In: SSPS, pp. 58–67 (2008)
Jiang, Q., Chakravarthy, S.: Scheduling strategies for processing continuous queries over streams. In: BNCOD, pp. 16–30 (2004)
Golab, L., Özsu, M.T.: Issues in data stream management. SIGMOD Rec. 32(2), 5–14 (2003)
Babcock, B., Babu, S., Datar, M., Motwani, R., Thomas, D.: Operator scheduling in data stream systems. VLDB J. 13(4), 333–353 (2004)
Babu, S., Munagala, K., Widom, J., Motwani, R.: Adaptive caching for continuous queries. In: ICDE, pp. 118–129 (2005)
Babu, S., Motwani, R., Munagala, K., Nishizawa, I., Widom, J.: Adaptive ordering of pipelined stream filters. In: SIGMOD, pp. 407–418 (2004)
Little, J.D.C.: A Proof for the Queuing Formula: L = λW. Operations Research 9, 383–387 (1961)
Wolff, R.W.: Poisson arrivals see time averages. Operations Research 30(2), 223–231 (1982)
ExtendSim: ExtendSim Website, http://www.extendsim.com
Johnson, T., Muthukrishnan, S., Shkapenyuk, V., Spatscheck, O.: A heartbeat mechanism and its application in Gigascope. In: VLDB, pp. 1079–1088 (2005)
Bai, Y., Thakkar, H., Wang, H., Zaniolo, C.: Optimizing timestamp management in data stream management systems. In: ICDE, pp. 1334–1338 (2007)
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, S., Gupta, C., Mehta, A. (2010). VPipe: Virtual Pipelining for Scheduling of DAG Stream Query Plans. In: Castellanos, M., Dayal, U., Miller, R.J. (eds) Enabling Real-Time Business Intelligence. BIRTE 2009. Lecture Notes in Business Information Processing, vol 41. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14559-9_3
DOI: https://doi.org/10.1007/978-3-642-14559-9_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14558-2
Online ISBN: 978-3-642-14559-9
eBook Packages: Computer Science, Computer Science (R0)