An optimal replication strategy for data grid systems

  • Research Article
Frontiers of Computer Science in China

Abstract

Data access latency is an important measure of system performance in a data grid. An efficient replication strategy reduces the amount of data transferred over the wide area network and, in turn, the average data access latency. Our research is motivated by the optimized replica distribution problem in a data grid: under storage constraints, how many replicas should the system maintain for each data item so that the average data access latency is minimized? This paper proposes a model of replication strategy in a federated data grid and derives the optimized solution. Analytical and simulation results show that the proposed strategy outperforms the LRU caching, uniform replication, proportional replication, and square root replication strategies in both wide area network bandwidth requirement and average data access latency.
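For readers unfamiliar with the baseline strategies named in the abstract, the following is a minimal sketch of how uniform, proportional, and square root replication are commonly defined in the peer-to-peer replication literature: a fixed replica budget is split across data items equally, in proportion to request rate, or in proportion to the square root of request rate. It does not implement the optimized strategy proposed in the paper, which is not given in this preview. The function name `allocate`, the example popularity vector, and the budget are illustrative assumptions, and replica counts are left fractional for simplicity.

```python
import math

def allocate(popularity, budget, strategy):
    """Split a replica budget across items using a baseline rule.

    popularity: relative request rates of the items (hypothetical input).
    budget: total number of replica slots available (hypothetical input).
    strategy: "uniform", "proportional", or "square_root".
    Returns fractional replica counts that sum to `budget`.
    """
    if strategy == "uniform":
        weights = [1.0 for _ in popularity]          # every item weighted equally
    elif strategy == "proportional":
        weights = list(popularity)                   # weight = request rate
    elif strategy == "square_root":
        weights = [math.sqrt(q) for q in popularity] # weight = sqrt(request rate)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    total = sum(weights)
    return [budget * w / total for w in weights]

# Illustrative request rates for four items and a budget of 18 replicas.
popularity = [0.64, 0.16, 0.16, 0.04]
budget = 18

for name in ("uniform", "proportional", "square_root"):
    print(name, [round(r, 2) for r in allocate(popularity, budget, name)])
```

With these assumed inputs, square root replication gives less popular items more replicas than proportional replication does, while still favoring popular items more than uniform replication.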

Author information

Corresponding author

Correspondence to Jiang Jianjin.

About this article

Cite this article

Jiang, J., Yang, G. An optimal replication strategy for data grid systems. Front. Comput. Sc. China 1, 338–348 (2007). https://doi.org/10.1007/s11704-007-0033-0

  • DOI: https://doi.org/10.1007/s11704-007-0033-0
