Abstract
As more data are collected and analyzed in real time, stream processing is attracting greater attention. In traditional network, economic, and financial analysis, as well as in the processing of sensor data in the Internet of Things, efficient and timely methods of handling continuously generated data are required. Data are being produced at higher frequencies, and the volume of data to be processed within a given period of time is increasing rapidly. This is especially true for continuous window aggregation, which is computationally intensive. An increase in the number of query windows also creates scalability problems for aggregate queries. Traditional query handling algorithms can perform many repeated operations. In this study, to enable window-based shared processing of continuous queries, a window reuse algorithm for a multi-query environment based on pace and results is proposed: the multiple continuous query algorithm (MCQA). The aggregation is simplified by gradually shrinking the set of multiple values so that the operation is reduced at each step, eventually achieving result sharing. The algorithm is implemented with the Storm stream processing framework. Experiments show that MCQA is more efficient and effectively reduces memory usage.
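The idea of sharing work across overlapping query windows can be illustrated with a small sketch. This is not the MCQA itself, whose details are not given in the abstract; it is a generic pane-based partial aggregation, assuming count-based sliding windows, a SUM aggregate, and hypothetical function names (`shared_window_sums`, `naive_window_sums`). Each pane is aggregated once, and every query reuses those partial results instead of rescanning the raw tuples.

```python
from functools import reduce
from math import gcd


def shared_window_sums(stream, queries):
    """Sliding-window sums for several queries using shared panes.

    stream  : list of numbers, one per time unit (assumption).
    queries : dict mapping a query name to (range, slide) in time units.
    Returns : dict mapping each name to a list of (window_end, sum).
    """
    # Pane length is the GCD of every range and slide, so each window
    # is an exact union of whole panes.
    pane = reduce(gcd, [v for rs in queries.values() for v in rs])
    n_panes = len(stream) // pane
    # One partial aggregate per pane, computed once and shared.
    pane_sums = [sum(stream[i * pane:(i + 1) * pane]) for i in range(n_panes)]

    results = {}
    for name, (rng, slide) in queries.items():
        k = rng // pane        # panes per window
        step = slide // pane   # panes per slide
        results[name] = [
            (end * pane, sum(pane_sums[end - k:end]))
            for end in range(k, n_panes + 1, step)
        ]
    return results


def naive_window_sums(stream, queries):
    """Baseline: every window of every query rescans the raw tuples."""
    return {
        name: [
            (end, sum(stream[end - rng:end]))
            for end in range(rng, len(stream) + 1, slide)
        ]
        for name, (rng, slide) in queries.items()
    }


if __name__ == "__main__":
    data = list(range(1, 13))
    qs = {"q1": (4, 2), "q2": (6, 2)}
    print(shared_window_sums(data, qs) == naive_window_sums(data, qs))
```

With the two example queries, the pane length is 2, so each window sum touches two or three pane totals rather than four or six raw tuples. The saving grows with the number of queries that share the same panes.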
Acknowledgements
This work is supported by Young Doctoral Science and Technology Talents of Xinjiang (2017Q085), Natural Science Foundation of Xinjiang (2017D01B09), Scientific Research Program of the Higher Education Institution of Xinjiang (XJEDU2016I049) and Doctoral Initiation Fund of Xinjiang Institute of Engineering (2017XGY022110).
Cite this article
Liu, W., Zhang, T. & Liu, J. Window-based multiple continuous query algorithm for data streams. J Supercomput 75, 5782–5807 (2019). https://doi.org/10.1007/s11227-019-02856-z