Abstract
We introduce an extension for TPC benchmarks that addresses the requirements of big data processing in cloud environments. We call it the Elasticity Test and evaluate it under TPCx-BB (BigBench). First, the Elasticity Test incorporates an approach to generating real-world query submission patterns with distinct data scale factors, based on major industrial cluster logs. Second, it introduces a new metric based on Service Level Agreements (SLAs) that takes the quality-of-service requirements of each query into consideration.
Experiments with Apache Hive and Spark on the cloud platforms of three major vendors validate our approach by comparing it to the current TPCx-BB metric. Results show that systems that fail to meet SLAs under concurrency, whether because of queuing or degraded performance, are penalized in the new metric. Elastic systems, on the other hand, meet a higher percentage of SLAs and are thus rewarded in the new metric. Such systems can scale compute workers up and down according to the demands of a varying workload and can thus reduce dollar costs.
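To make the idea of an SLA-based score concrete, the following is a minimal sketch and not the metric defined in the paper. It assumes each query has a single latency SLA covering queue time plus execution time, and it scores a run as the fraction of queries that met that SLA. The `QueryRun` record, the `sla_attainment` function, and all numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class QueryRun:
    """One query execution in a concurrent workload (hypothetical record layout)."""
    name: str
    queue_seconds: float  # time spent waiting before execution started
    exec_seconds: float   # time spent executing
    sla_seconds: float    # assumed per-query SLA on end-to-end latency


def sla_met(run: QueryRun) -> bool:
    """A query meets its SLA if queueing plus execution fits within the limit."""
    return run.queue_seconds + run.exec_seconds <= run.sla_seconds


def sla_attainment(runs: list[QueryRun]) -> float:
    """Fraction of queries that met their SLA (0.0 for an empty workload)."""
    if not runs:
        return 0.0
    return sum(sla_met(r) for r in runs) / len(runs)


# Illustrative numbers only, not measurements from the paper.
# A fixed-size cluster queues queries under concurrency.
fixed_cluster = [
    QueryRun("q1", queue_seconds=0.0, exec_seconds=40.0, sla_seconds=60.0),
    QueryRun("q2", queue_seconds=45.0, exec_seconds=30.0, sla_seconds=60.0),
    QueryRun("q3", queue_seconds=90.0, exec_seconds=20.0, sla_seconds=60.0),
]
# An elastic cluster adds workers, so queue times stay short.
elastic_cluster = [
    QueryRun("q1", queue_seconds=0.0, exec_seconds=40.0, sla_seconds=60.0),
    QueryRun("q2", queue_seconds=5.0, exec_seconds=30.0, sla_seconds=60.0),
    QueryRun("q3", queue_seconds=10.0, exec_seconds=20.0, sla_seconds=60.0),
]

fixed_score = sla_attainment(fixed_cluster)
elastic_score = sla_attainment(elastic_cluster)
print(f"fixed: {fixed_score:.2f}, elastic: {elastic_score:.2f}")
```

In this toy setup the execution times are identical for both clusters, so the difference in score comes entirely from queuing, which is the effect the abstract attributes to non-elastic systems under concurrency.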
N. Poggi—Contribution while at the BSC-MSR Centre, currently at Databricks Inc.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Poggi, N. et al. (2020). Benchmarking Elastic Cloud Big Data Services Under SLA Constraints. In: Nambiar, R., Poess, M. (eds) Performance Evaluation and Benchmarking for the Era of Cloud(s). TPCTC 2019. Lecture Notes in Computer Science, vol 12257. Springer, Cham. https://doi.org/10.1007/978-3-030-55024-0_1
DOI: https://doi.org/10.1007/978-3-030-55024-0_1
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-55023-3
Online ISBN: 978-3-030-55024-0
eBook Packages: Computer Science, Computer Science (R0)