Abstract
Cassandra, a NoSQL database, was introduced to overcome the limitations of traditional relational databases in big data and real-time applications, whose defining traits are high-speed data production (volume) and diverse data formats (variety). The dynamic nature of distributed data, distributed systems, and the applications built on them leads to skewed data access patterns, which cause data imbalance and, in turn, subsequent performance deterioration. In this study, we propose a dynamic data dissemination (D3) strategy that conforms to the dynamic behavior of a distributed environment, including the diverse and time-varying popularity of data requests and heterogeneous node capacity. The evaluation results indicate a performance improvement.
Cite this article
Khatibi, E., Mirtaheri, S.L. A dynamic data dissemination mechanism for Cassandra NoSQL data store. J Supercomput 75, 7479–7496 (2019). https://doi.org/10.1007/s11227-019-02959-7
DOI: https://doi.org/10.1007/s11227-019-02959-7