A dynamic data dissemination mechanism for Cassandra NoSQL data store

The Journal of Supercomputing

Abstract

Cassandra, a NoSQL database, was introduced to overcome the limitations of traditional relational databases in big data and real-time applications, whose defining traits are high-speed data production (volume) and diverse data formats (variety). The dynamic nature of distributed data, distributed systems, and the applications built on them leads to skewed data access patterns, which cause data imbalance across nodes and, in turn, subsequent performance degradation. In this study, we propose a dynamic data dissemination (D3) strategy that adapts to the dynamic behavior of the distributed environment, including the diverse and time-varying popularity of data requests and heterogeneous node capacity. The evaluation results show a performance improvement.
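The abstract does not give the D3 algorithm itself, so the following is only an illustrative sketch of the general idea it names: placing data according to request popularity and heterogeneous node capacity. It is a simple greedy heuristic, not the authors' method. The function name `place_by_popularity`, the popularity scores, and the capacity weights are all hypothetical.

```python
from typing import Dict


def place_by_popularity(popularity: Dict[str, float],
                        capacity: Dict[str, float]) -> Dict[str, str]:
    """Greedy, capacity-aware placement (illustrative only, not D3).

    popularity: hypothetical request rate per data key.
    capacity:   hypothetical relative capacity per node (larger = stronger).
    Returns a mapping from each key to the node chosen for it.
    """
    load = {node: 0.0 for node in capacity}
    placement: Dict[str, str] = {}

    # Handle the hottest keys first so they are spread before the cold ones.
    # Ties are broken by key name to keep the result deterministic.
    for key in sorted(popularity, key=lambda k: (-popularity[k], k)):
        # Pick the node whose load-to-capacity ratio would be lowest
        # after taking this key; ties are broken by node name.
        target = min(
            capacity,
            key=lambda n: ((load[n] + popularity[key]) / capacity[n], n),
        )
        placement[key] = target
        load[target] += popularity[key]

    return placement


if __name__ == "__main__":
    demo = place_by_popularity(
        {"a": 6.0, "b": 3.0, "c": 3.0},   # assumed request rates
        {"n1": 2.0, "n2": 1.0},           # assumed node capacities
    )
    print(demo)
```

A dynamic scheme would rerun such a placement as popularity changes over time; this sketch covers only a single static pass.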

Author information

Corresponding author

Correspondence to Seyedeh Leili Mirtaheri.

About this article

Cite this article

Khatibi, E., Mirtaheri, S.L. A dynamic data dissemination mechanism for Cassandra NoSQL data store. J Supercomput 75, 7479–7496 (2019). https://doi.org/10.1007/s11227-019-02959-7

  • DOI: https://doi.org/10.1007/s11227-019-02959-7
