Benchmarking the Availability and Fault Tolerance of Cassandra

Rosselli, Marten; Niemann, Raik; Ivanov, Todor; Tolle, Karsten; Zicari, Roberto V.

doi:10.1007/978-3-319-49748-8_5

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10044)

Abstract

To handle big data workloads, modern NoSQL database management systems like Cassandra are designed to scale well over multiple machines. However, with each additional machine in a cluster, the likelihood of a hardware failure increases. To still achieve high availability and fault tolerance, the data needs to be replicated within the cluster. Many applications require predictable and stable response times even in the case of a node failure. While Cassandra guarantees high availability, the influence of a node failure on system performance is still unclear.

In this paper, we therefore focus on the availability and fault tolerance of Cassandra. We analyze the impact of a node outage within a Cassandra cluster on throughput and latency for different workloads. Our results show that Cassandra is well suited to achieving high availability while preserving stable response times in the case of a node failure. Cassandra is a good choice especially for read-intensive applications that require high availability.

Notes

1. See http://cassandra.apache.org.
2. YCSB Cassandra binding based on CQL: https://github.com/jbellis/YCSB.

Author information

Authors and Affiliations

Frankfurt Big Data Lab, Goethe University Frankfurt am Main, Frankfurt, Germany
Marten Rosselli, Raik Niemann, Todor Ivanov, Karsten Tolle & Roberto V. Zicari
Accenture Germany, Frankfurt, Germany
Marten Rosselli
Institute of Information Systems, University of Applied Science Hof, Hof, Germany
Raik Niemann

Corresponding author

Correspondence to Marten Rosselli.

Editor information

Editors and Affiliations

Technical University of Berlin, Berlin, Germany
Tilmann Rabl
Cisco Systems, Inc., San Jose, California, USA
Raghunath Nambiar
University of California at San Diego, La Jolla, California, USA
Chaitanya Baru
Ampool, Inc., Santa Clara, California, USA
Milind Bhandarkar
Oracle Corporation, Redwood Shores, California, USA
Meikel Poess
Indian Institute of Public Health, Hyderabad, India
Saumyadipta Pyne

About this paper

Cite this paper

Rosselli, M., Niemann, R., Ivanov, T., Tolle, K., Zicari, R.V. (2016). Benchmarking the Availability and Fault Tolerance of Cassandra. In: Rabl, T., Nambiar, R., Baru, C., Bhandarkar, M., Poess, M., Pyne, S. (eds) Big Data Benchmarking. WBDB 2015. Lecture Notes in Computer Science, vol 10044. Springer, Cham. https://doi.org/10.1007/978-3-319-49748-8_5

DOI: https://doi.org/10.1007/978-3-319-49748-8_5
Published: 01 December 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-49747-1
Online ISBN: 978-3-319-49748-8
eBook Packages: Computer Science, Computer Science (R0)
