Fault Tolerant Data Stream Processing in Cooperation with OLTP Engine

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11297)

Abstract

In recent years, the growth of big data, the spread of IoT technology, and the continual evolution of hardware have further increased the demand for data stream processing. Meanwhile, in the field of database systems, a new demand is emerging for HTAP (hybrid transactional and analytical processing), which integrates the functions of on-line transaction processing (OLTP) and on-line analytical processing (OLAP). Against this background, our group has started a new project, in cooperation with other research groups in Japan, to develop data stream processing technologies for the HTAP environment. Our main focus is to develop new data stream processing methodologies, such as fault tolerance, that work in cooperation with the OLTP engine. In this paper, we describe the background, objectives, and issues of this research.

Acknowledgments

This paper is based on results obtained from a project commissioned by the New Energy and Industrial Technology Development Organization (NEDO) and a project supported by JSPS KAKENHI Grant Number 16H01722.

Author information

Corresponding author

Correspondence to Yoshiharu Ishikawa.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Ishikawa, Y., Sugiura, K., Takao, D. (2018). Fault Tolerant Data Stream Processing in Cooperation with OLTP Engine. In: Mondal, A., Gupta, H., Srivastava, J., Reddy, P., Somayajulu, D. (eds) Big Data Analytics. BDA 2018. Lecture Notes in Computer Science, vol 11297. Springer, Cham. https://doi.org/10.1007/978-3-030-04780-1_1

  • DOI: https://doi.org/10.1007/978-3-030-04780-1_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-04779-5

  • Online ISBN: 978-3-030-04780-1

  • eBook Packages: Computer Science, Computer Science (R0)
