Abstract
A potential problem with persisting large volumes of streaming logs in conventional relational databases is that data logs produced at high rates cannot be loaded fast enough, owing to the strong consistency model and the high cost of indexing. A possible alternative is to use state-of-the-art NoSQL data stores, which sacrifice transactional consistency to achieve higher performance and scalability. In this paper, we describe the challenges of persisting and analyzing numerical streaming logs at large scale. We propose to develop a benchmark that compares relational databases with state-of-the-art NoSQL data stores for persisting and analyzing numerical logs. The benchmark will investigate to what degree a state-of-the-art NoSQL data store can achieve high-performance persistence and large-scale analysis of data logs. It will also serve as a basis for investigating query processing and indexing of large-scale numerical logs.
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Mahmood, K., Truong, T., Risch, T. (2015). NoSQL Approach to Large Scale Analysis of Persisted Streams. In: Maneth, S. (ed.) Data Science. BICOD 2015. Lecture Notes in Computer Science, vol 9147. Springer, Cham. https://doi.org/10.1007/978-3-319-20424-6_15
DOI: https://doi.org/10.1007/978-3-319-20424-6_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-20423-9
Online ISBN: 978-3-319-20424-6
eBook Packages: Computer Science, Computer Science (R0)