Modeling a Dynamic Data Replication Strategy to Increase System Availability in Cloud Computing Environments

  • Regular Paper
Journal of Computer Science and Technology

Abstract

Failures are normal rather than exceptional in cloud computing environments. To improve system availability, replicating popular data to multiple suitable locations is an advisable choice, as users can then access the data from a nearby site. This is, however, not the case for replicas that must have a fixed number of copies in several locations. Deciding on a reasonable number of replicas and the right locations for them has become a challenge in cloud computing. In this paper, a dynamic data replication strategy is put forward, together with a brief survey of replication strategies suitable for distributed computing environments. It includes: 1) analyzing and modeling the relationship between system availability and the number of replicas; 2) evaluating and identifying the popular data and triggering a replication operation when the popularity of the data passes a dynamic threshold; 3) calculating a suitable number of copies to meet a reasonable system byte effective rate requirement and placing replicas among data nodes in a balanced way; 4) designing the dynamic data replication algorithm in a cloud. Experimental results demonstrate the efficiency and effectiveness of the improved system brought about by the proposed strategy in a cloud.
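Items 1) and 2) of the abstract can be illustrated with a minimal sketch. It assumes a textbook independence model, in which each replica is available with the same probability p and failures are independent, so that r replicas give availability 1 - (1 - p)^r. This is an illustrative assumption and not necessarily the model derived in the paper. The function names, the mean-based threshold, and the `factor` parameter are likewise hypothetical.

```python
def availability(p: float, r: int) -> float:
    """Probability that at least one of r replicas is reachable.

    Assumes (for illustration only) that every replica is available with
    the same probability p and that replicas fail independently.
    """
    return 1.0 - (1.0 - p) ** r


def min_replicas(p: float, target: float) -> int:
    """Smallest replica count whose availability meets the target."""
    if not (0.0 < p < 1.0) or not (0.0 < target < 1.0):
        raise ValueError("p and target must lie strictly between 0 and 1")
    r = 1
    while availability(p, r) < target:
        r += 1
    return r


def should_replicate(access_count: int, mean_access_count: float,
                     factor: float = 1.5) -> bool:
    """Hypothetical popularity trigger.

    The threshold is dynamic in the sense that it tracks the current mean
    access count across files; the factor 1.5 is an arbitrary placeholder.
    """
    return access_count > factor * mean_access_count


if __name__ == "__main__":
    # With p = 0.9, two replicas give 0.99 and three give about 0.999.
    print(min_replicas(0.9, 0.995))
    print(should_replicate(10, 4.0))
```

Under this model each additional replica multiplies the unavailability by (1 - p), so the availability gain per replica shrinks quickly. This diminishing return is one reason to compute a suitable replica count rather than fix one in advance.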

Author information

Corresponding author

Correspondence to Xing-Wei Wang.

Additional information

Supported by the National Natural Science Foundation of China under Grant Nos. 61070162, 71071028 and 70931001, the Specialized Research Fund for the Doctoral Program of Higher Education of China under Grant Nos. 20110042110024 and 20100042110025, and the Fundamental Research Funds for the Central Universities of China under Grant Nos. N100604012, N090504003 and N090504006.

About this article

Cite this article

Sun, DW., Chang, GR., Gao, S. et al. Modeling a Dynamic Data Replication Strategy to Increase System Availability in Cloud Computing Environments. J. Comput. Sci. Technol. 27, 256–272 (2012). https://doi.org/10.1007/s11390-012-1221-4

  • DOI: https://doi.org/10.1007/s11390-012-1221-4
