Abstract
We propose a system that combines stream processing engines with big data storage to analyze large volumes of streaming data. The system supports online analysis of data as it arrives and stores the data for later offline analysis. The design emphasizes making data analysis algorithms simple to implement.
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Steinmaurer, T., Traxler, P., Zwick, M., Stumptner, R., Lettner, C. (2014). Combining Stream Processing Engines and Big Data Storages for Data Analysis. In: Andreasen, T., Christiansen, H., Cubero, J.-C., Raś, Z.W. (eds) Foundations of Intelligent Systems. ISMIS 2014. Lecture Notes in Computer Science, vol 8502. Springer, Cham. https://doi.org/10.1007/978-3-319-08326-1_48
DOI: https://doi.org/10.1007/978-3-319-08326-1_48
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08325-4
Online ISBN: 978-3-319-08326-1
eBook Packages: Computer Science, Computer Science (R0)