Abstract
Data replication can reduce bandwidth consumption and access latency in distributed systems where users require remote access to large data objects. Based on the intrinsic characteristics of distributed storage systems, this paper proposes PPR, a prediction-based parallel replication algorithm. PPR uses the characteristics of spatial data to predict which data will be accessed next and prefetches it. During replication, it selects, according to the network state, the several replicas of a data object that have the lowest access cost, transfers different parts of the data object from these replicas, and assembles the parts into a new replica. Performance evaluation shows that PPR uses network bandwidth efficiently, achieves high replication efficiency and substantially better access efficiency, and avoids interference between different replications.
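The replication step described in the abstract (choose the lowest-cost replicas, then fetch different parts of the object from each) can be illustrated with a minimal sketch. The abstract does not give the cost model or the partitioning rule, so everything here is an assumption made for illustration: the `Replica` fields, the latency-plus-transfer-time cost in `access_cost`, and the bandwidth-proportional split in `plan_transfer` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    site: str
    bandwidth: float  # assumed metric: estimated bytes/second to the destination
    latency: float    # assumed metric: estimated round-trip time in seconds


def access_cost(replica, size):
    """Assumed cost model: time to fetch `size` bytes from a single replica."""
    return replica.latency + size / replica.bandwidth


def select_replicas(replicas, size, k):
    """Return the k replicas with the lowest estimated access cost."""
    return sorted(replicas, key=lambda r: access_cost(r, size))[:k]


def plan_transfer(replicas, size, k):
    """Split the byte range [0, size) across the k cheapest replicas.

    Each replica gets a contiguous range proportional to its bandwidth
    (an illustrative rule); the last range absorbs any rounding remainder.
    Returns a list of (site, start, end) tuples with `end` exclusive.
    """
    chosen = select_replicas(replicas, size, k)
    total_bandwidth = sum(r.bandwidth for r in chosen)
    plan = []
    start = 0
    for i, replica in enumerate(chosen):
        if i == len(chosen) - 1:
            end = size
        else:
            end = start + int(size * replica.bandwidth / total_bandwidth)
        plan.append((replica.site, start, end))
        start = end
    return plan


if __name__ == "__main__":
    candidates = [
        Replica("A", bandwidth=30.0, latency=0.1),
        Replica("B", bandwidth=10.0, latency=0.1),
        Replica("C", bandwidth=1.0, latency=0.5),
    ]
    for site, start, end in plan_transfer(candidates, size=1000, k=2):
        print(f"fetch bytes [{start}, {end}) from {site}")
```

In this sketch the ranges are contiguous and non-overlapping, so the destination can write each part at its offset and obtain a complete new replica once all transfers finish. A real system would issue the transfers concurrently and re-estimate bandwidth from the current network state.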
This work is supported by the National Grand Fundamental Research 973 Program of China (No.2002CB312105), A Foundation for the Author of National Excellent Doctoral Dissertation of PR China (No.200141), and the National Natural Science Foundation of China (No.69903011, No.69933030).
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, Y., Zhang, X. (2005). A Prediction-Based Parallel Replication Algorithm in Distributed Storage System. In: Zhuge, H., Fox, G.C. (eds) Grid and Cooperative Computing - GCC 2005. GCC 2005. Lecture Notes in Computer Science, vol 3795. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11590354_86
DOI: https://doi.org/10.1007/11590354_86
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-30510-1
Online ISBN: 978-3-540-32277-1
eBook Packages: Computer Science (R0)