Abstract
Existing analytical query benchmarks, such as TPC-H, typically assess database system performance on on-premises hardware installations. Benchmarks for cloud-based analytics, by contrast, account for flexible infrastructure but often focus on simpler queries and semi-structured data. With our benchmark draft, we attempt to bridge this gap by challenging analytical platforms to answer complex queries on structured business data while leveraging the elastic infrastructure of the cloud to satisfy performance requirements.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Vorona, D., Funke, F., Kemper, A., Neumann, T. (2015). Benchmarking Elastic Query Processing on Big Data. In: Rabl, T., Sachs, K., Poess, M., Baru, C., Jacobsen, H.-A. (eds) Big Data Benchmarking. WBDB 2014. Lecture Notes in Computer Science, vol 8991. Springer, Cham. https://doi.org/10.1007/978-3-319-20233-4_5
DOI: https://doi.org/10.1007/978-3-319-20233-4_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-20232-7
Online ISBN: 978-3-319-20233-4
eBook Packages: Computer Science (R0)