Dynamic decision-making strategy of replica number based on data hot

The Journal of Supercomputing

Abstract

Cloud storage has grown rapidly in recent years. Its "pay-as-you-go" model gives users highly reliable, highly available, low-cost, and secure storage services, and it is widely used by enterprises and individuals. Data replica management is an essential part of a cloud storage system because it improves cluster fault tolerance and availability, and it has therefore attracted considerable research attention. Replica management copies data blocks and places the copies on different nodes in the cluster, which makes the data more secure and reliable and raises the access rate while keeping the system load balanced. Replica technology spans the whole process from replica creation to consistency maintenance, and each stage affects the performance of the cloud storage system. This article focuses on dynamically deciding the number of data replicas in a cloud storage system. To address the shortcomings of the static replica strategy, it proposes a dynamic decision strategy for the number of replicas based on data popularity. The gray prediction model GM(1, 1) predicts the future access frequency of the data, and a Markov model corrects the prediction result. The data are then classified as hot or non-hot according to the predicted value, which determines the number of replicas. Finally, simulation experiments compare and analyze the static replica strategy and the proposed popularity-based dynamic replica number strategy.
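The prediction step described in the abstract can be illustrated with a minimal sketch of a textbook GM(1, 1) forecast followed by a simple threshold rule. This is not the authors' implementation: the Markov correction is omitted because the abstract does not give its details, and the names `gm11_forecast`, `replica_count`, `hot_threshold`, `base_replicas`, and `hot_replicas`, as well as the replica counts 3 and 5, are hypothetical choices made for illustration only.

```python
import math
from typing import List, Sequence


def gm11_forecast(history: Sequence[float], steps: int = 1) -> List[float]:
    """Forecast the next `steps` values of a series with a textbook GM(1, 1)."""
    n = len(history)
    if n < 4:
        raise ValueError("GM(1, 1) needs at least four observations")

    # Accumulated generating sequence: x1[k] = history[0] + ... + history[k].
    x1: List[float] = []
    total = 0.0
    for value in history:
        total += value
        x1.append(total)

    # Background values z (means of adjacent accumulated values) and targets y.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = list(history[1:])

    # Ordinary least squares for the grey equation y = -a * z + b.
    m = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(zv * yv for zv, yv in zip(z, y))
    det = m * szz - sz * sz
    if det == 0:
        raise ValueError("series is degenerate; cannot fit GM(1, 1)")
    slope = (m * szy - sz * sy) / det
    a = -slope                  # development coefficient
    b = (sy - slope * sz) / m   # grey input

    if abs(a) < 1e-12:
        return [b] * steps      # flat series: constant forecast

    def x1_hat(k: int) -> float:
        # Time response of the accumulated series (0-based index k).
        return (history[0] - b / a) * math.exp(-a * k) + b / a

    # Restore forecasts of the original series by differencing.
    return [x1_hat(k) - x1_hat(k - 1) for k in range(n, n + steps)]


def replica_count(history: Sequence[float], hot_threshold: float,
                  base_replicas: int = 3, hot_replicas: int = 5) -> int:
    """Hypothetical rule: hot data get more replicas than non-hot data."""
    predicted = gm11_forecast(history)[0]
    return hot_replicas if predicted >= hot_threshold else base_replicas
```

A caller would pass the recent per-period access counts of a data block, for example `replica_count([100, 120, 144, 172.8, 207.36], hot_threshold=200)`. In the strategy the abstract describes, the GM(1, 1) output would first be corrected by the Markov model before the hot or non-hot decision is made.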

Data Availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Acknowledgements

The authors would like to thank the anonymous referees for their invaluable suggestions and comments. This work is supported by the National Natural Science Foundation of China (61872284); the Industrial Field General Projects of the Science and Technology Department of Shaanxi Province (2023-YBGY-203); the Industrialization Project of the Shaanxi Provincial Department of Education (21JC017); the "Thirteenth Five-Year" National Key R&D Program Project (Project Number: 2019YFD1100901); the Yulin Science and Technology Project (2019-175); the Natural Science Foundation of Shaanxi Province, China (2014JM2-6127); and the project sponsored by the Scientific Research Foundation for Returned Overseas Chinese Scholars, SEM No. [2014] 1685.

Author information

Contributions

Qinlu He and Zhen Li contributed significantly to the algorithm design. Genqing Bian and Weiqi Zhang performed the performance analysis and wrote the manuscript. Fan Zhang and Chen Chen contributed to manuscript preparation.

Corresponding author

Correspondence to Qinlu He.

Ethics declarations

Competing interests

The authors declare no competing interests.

Conflict of interest

I declare that the authors have no competing interests as defined by Springer, or other interests that might be perceived to influence the results and/or discussion reported in this paper.

Ethics approval and consent to participate

The authors declare that they have no conflicts of interest.

Consent for publication

I have read and understood the publishing policy, and submit this manuscript in accordance with this policy. The results/data/figures in this manuscript have not been published elsewhere, nor are they under consideration (from you or one of your Contributing Authors) by another publisher.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

He, Q., Zhang, F., Bian, G. et al. Dynamic decision-making strategy of replica number based on data hot. J Supercomput 79, 9584–9603 (2023). https://doi.org/10.1007/s11227-022-05029-7

  • DOI: https://doi.org/10.1007/s11227-022-05029-7
