Benchmarking the Availability and Fault Tolerance of Cassandra

  • Conference paper
Big Data Benchmarking (WBDB 2015, WBDB 2015)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10044)

Abstract

To handle big data workloads, modern NoSQL database management systems like Cassandra are designed to scale well over multiple machines. However, with each additional machine in a cluster, the likelihood of a hardware failure increases. To still achieve high availability and fault tolerance, the data needs to be replicated within the cluster. Many applications require predictable and stable response times even in the case of a node failure. While Cassandra guarantees high availability, the influence of a node failure on system performance is still unclear.

In this paper, we therefore focus on the availability and fault tolerance of Cassandra. We analyze the impact of a node outage within a Cassandra cluster on throughput and latency for different workloads. Our results show that Cassandra is well suited to achieve high availability while preserving stable response times in case of a node failure. Cassandra is a particularly good choice for read-intensive applications that require high availability.
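The trade-off described in the abstract (more machines mean a higher chance that some node is down, which replication then has to absorb) can be illustrated with a small back-of-the-envelope sketch. It is not taken from the paper: it assumes that nodes fail independently with the same probability `p`, and it uses a majority quorum of `floor(rf / 2) + 1` replicas, which corresponds to Cassandra's QUORUM consistency level. The function names and the example values are hypothetical, and the abstract does not state which consistency level the authors used.

```python
from math import comb


def p_any_node_down(n: int, p: float) -> float:
    """Probability that at least one of n nodes is down.

    Assumption (not from the paper): nodes fail independently,
    each with probability p.
    """
    return 1.0 - (1.0 - p) ** n


def p_quorum_available(rf: int, p: float) -> float:
    """Probability that a majority of the rf replicas of a row is up.

    Uses the same independence assumption as above. The majority size
    is floor(rf / 2) + 1.
    """
    quorum = rf // 2 + 1
    # Sum the binomial probabilities of exactly k replicas being up,
    # for every k that still forms a quorum.
    return sum(
        comb(rf, k) * (1.0 - p) ** k * p ** (rf - k)
        for k in range(quorum, rf + 1)
    )


if __name__ == "__main__":
    p = 0.01  # hypothetical per-node unavailability
    for n in (1, 4, 8, 16):
        print(f"nodes={n:2d}  P(some node down)={p_any_node_down(n, p):.4f}")
    for rf in (1, 3, 5):
        print(f"rf={rf}  P(quorum reachable)={p_quorum_available(rf, p):.6f}")
```

Under these assumptions the first loop shows the failure likelihood growing with cluster size, while the second shows how a higher replication factor keeps a quorum reachable despite individual node outages. The sketch says nothing about throughput or latency during an outage, which is what the paper measures.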

Notes

  1. See http://cassandra.apache.org.

  2. YCSB Cassandra binding based on CQL: https://github.com/jbellis/YCSB.

Author information

Corresponding author

Correspondence to Marten Rosselli.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Rosselli, M., Niemann, R., Ivanov, T., Tolle, K., Zicari, R.V. (2016). Benchmarking the Availability and Fault Tolerance of Cassandra. In: Rabl, T., Nambiar, R., Baru, C., Bhandarkar, M., Poess, M., Pyne, S. (eds) Big Data Benchmarking. WBDB 2015, WBDB 2015. Lecture Notes in Computer Science, vol 10044. Springer, Cham. https://doi.org/10.1007/978-3-319-49748-8_5

  • DOI: https://doi.org/10.1007/978-3-319-49748-8_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-49747-1

  • Online ISBN: 978-3-319-49748-8

  • eBook Packages: Computer Science (R0)
