Abstract
In recent years, the growth of big data, the spread of IoT technology, and the continual evolution of hardware have further increased the demand for data stream processing. Meanwhile, in the field of database systems, a new demand is emerging for HTAP (hybrid transactional and analytical processing), which integrates the functions of on-line transaction processing (OLTP) and on-line analytical processing (OLAP). Against this background, our group has started a new project, in cooperation with other research groups in Japan, to develop data stream processing technologies for the HTAP environment. Our main focus is to develop new data stream processing methodologies, such as fault tolerance, that work in cooperation with the OLTP engine. In this paper, we describe the background, objectives, and issues of this research.
Acknowledgments
This paper is based on results obtained from a project commissioned by the New Energy and Industrial Technology Development Organization (NEDO) and a project supported by JSPS KAKENHI Grant Number 16H01722.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Ishikawa, Y., Sugiura, K., Takao, D. (2018). Fault Tolerant Data Stream Processing in Cooperation with OLTP Engine. In: Mondal, A., Gupta, H., Srivastava, J., Reddy, P., Somayajulu, D. (eds) Big Data Analytics. BDA 2018. Lecture Notes in Computer Science, vol 11297. Springer, Cham. https://doi.org/10.1007/978-3-030-04780-1_1
DOI: https://doi.org/10.1007/978-3-030-04780-1_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-04779-5
Online ISBN: 978-3-030-04780-1
eBook Packages: Computer Science, Computer Science (R0)