Abstract
Data replication is widely used in cloud storage and data grids to improve parallel service efficiency and system performance: keeping multiple copies of a file raises file availability, balances system load, and reduces response time. However, the high volume of big data poses a new and rigorous challenge to data storage and business access in cloud storage, and especially to the quality of cloud services. In this paper, a novel dynamic predicted replication strategy (DPRS) is proposed that combines the access frequency of files with a prediction method to predict the future access of each file, and that periodically calculates the optimal number of replicas based on the real access and the predicted future access. The experimental results show that DPRS can effectively decrease the response time of a file request while also reducing the additional cost of the cloud storage system.
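The periodic step described above (forecast each file's next-period access, then size its replica set from the real and forecast access) can be illustrated with a minimal sketch. The abstract does not name the prediction method or the sizing formula, so everything below is an assumption made for illustration: exponential smoothing stands in for the predictor, the larger of the observed and forecast access stands in for the combined demand, and `per_replica_capacity`, `min_replicas`, `max_replicas`, and `alpha` are hypothetical parameters rather than values from the paper.

```python
import math


def predict_next_access(history, alpha=0.5):
    """Forecast the next period's access count by exponential smoothing.

    Assumption: the paper's actual prediction method is not given in the
    abstract; exponential smoothing is used here only as a stand-in.
    """
    if not history:
        return 0.0
    forecast = float(history[0])
    for observed in history[1:]:
        forecast = alpha * observed + (1 - alpha) * forecast
    return forecast


def replica_count(history, per_replica_capacity=100,
                  min_replicas=1, max_replicas=10, alpha=0.5):
    """Choose a replica count from the real and the predicted access.

    Assumption: demand is the larger of the latest observed count and the
    forecast, and each replica serves `per_replica_capacity` requests per
    period. Both rules are hypothetical, not taken from the paper.
    """
    current = history[-1] if history else 0
    predicted = predict_next_access(history, alpha)
    demand = max(current, predicted)
    needed = math.ceil(demand / per_replica_capacity)
    # Clamp to the allowed range so a file always keeps at least one copy.
    return max(min_replicas, min(max_replicas, needed))


if __name__ == "__main__":
    rising = [100, 200, 400]   # access counts per period, oldest first
    falling = [400, 200, 100]
    print(replica_count(rising), replica_count(falling))
```

In this sketch a file whose access is falling keeps more replicas than its latest count alone would suggest, because the smoothed forecast decays more slowly than the raw series. Whether the published strategy behaves this way depends on its actual predictor.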
Acknowledgements
The authors would like to thank the Chongqing Basic and Frontier Research Project for its support under Grant No. cstc2017jcyjA0818. The work is partly funded by the National Natural Science Foundation of China (Nos. 61602073, 61672004).
Cite this article
He, L., Qian, Z. & Shang, F. A novel predicted replication strategy in cloud storage. J Supercomput 76, 4838–4856 (2020). https://doi.org/10.1007/s11227-018-2647-4
DOI: https://doi.org/10.1007/s11227-018-2647-4