Abstract
Data access latency is an important metric of system performance in a data grid. An efficient replication strategy reduces the amount of data transferred over the wide area network and thereby lowers the average data access latency. The motivation of our research is to solve the optimized replica distribution problem in a data grid; that is, the system should place multiple replicas of each data item, subject to storage constraints, so as to minimize the average data access latency. This paper proposes a model of replication strategy in a federated data grid and gives the optimized solution. Analysis and simulation results show that the optimized replication strategy proposed in this paper outperforms the LRU caching strategy, the uniform replication strategy, the proportional replication strategy, and the square root replication strategy in terms of both wide area network bandwidth requirement and average data access latency.
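To make the three baseline allocation rules named in the abstract concrete, the following is a minimal sketch of how a fixed replica budget might be split across data items under uniform, proportional, and square root replication. It does not reproduce the optimized strategy of this paper, whose model is not given in the abstract. The function names, the Zipf-like popularity assumption, the `alpha` parameter, and the use of fractional replica counts are all illustrative assumptions.

```python
from math import sqrt


def zipf_popularity(n_items, alpha=1.0):
    """Assumed Zipf-like request probabilities for items ranked 1..n_items."""
    weights = [1.0 / (rank ** alpha) for rank in range(1, n_items + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def allocate(popularity, budget, strategy):
    """Split a replica budget across items (fractional counts, for illustration).

    uniform:      every item gets the same share
    proportional: share is proportional to request probability
    square_root:  share is proportional to the square root of request probability
    """
    if strategy == "uniform":
        shares = [1.0] * len(popularity)
    elif strategy == "proportional":
        shares = list(popularity)
    elif strategy == "square_root":
        shares = [sqrt(p) for p in popularity]
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    total = sum(shares)
    return [budget * s / total for s in shares]


if __name__ == "__main__":
    popularity = zipf_popularity(n_items=5, alpha=1.0)
    for name in ("uniform", "proportional", "square_root"):
        counts = allocate(popularity, budget=100, strategy=name)
        print(name, [round(c, 1) for c in counts])
```

In this sketch, square root replication sits between the other two rules: popular items receive more replicas than under uniform replication, but fewer than under proportional replication.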
Cite this article
Jiang, J., Yang, G. An optimal replication strategy for data grid systems. Front. Comput. Sc. China 1, 338–348 (2007). https://doi.org/10.1007/s11704-007-0033-0
DOI: https://doi.org/10.1007/s11704-007-0033-0