Abstract
Benchmarking data warehouses is a means of evaluating system performance and the impact of different technical choices. Existing benchmarks, such as the Star Schema Benchmark (SSB), were developed on relational models, which have for years been the most widely used to support classical data warehousing applications. SSB is designed to measure the performance of database products when executing star schema queries. As the volume of data keeps growing, the types of data generated by applications become richer than before. As a result, traditional relational databases are challenged to manage big data. Many IT companies attempt to address big data challenges using a NoSQL (Not only SQL) database, possibly together with a distributed computing system. NoSQL databases are known to be non-relational, horizontally scalable, and distributed. In this paper, we present a new benchmark for columnar NoSQL data warehouses, namely CNSSB (Columnar NoSQL Star Schema Benchmark). CNSSB is derived from SSB and generates synthetic data and a query set for evaluating column-oriented NoSQL data warehouses. We implemented CNSSB on the HBase column-oriented database management system (DBMS) and applied its query workload to compare the performance of two SQL skins, Phoenix and HQL (Hive Query Language). This allowed us to observe better performance for Phoenix than for HQL.
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Dehdouh, K., Boussaid, O., Bentayeb, F. (2014). Columnar NoSQL Star Schema Benchmark. In: Ait Ameur, Y., Bellatreche, L., Papadopoulos, G.A. (eds) Model and Data Engineering. MEDI 2014. Lecture Notes in Computer Science, vol 8748. Springer, Cham. https://doi.org/10.1007/978-3-319-11587-0_26
DOI: https://doi.org/10.1007/978-3-319-11587-0_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11586-3
Online ISBN: 978-3-319-11587-0
eBook Packages: Computer Science (R0)