Abstract
Data Grids integrate geographically distributed resources for data-intensive scientific applications. Dynamic data replication algorithms are becoming increasingly valuable for solving large-scale, realistic, difficult problems, and selecting a replica under multiple selection criteria (availability, security, and time) is one of these problems. Current algorithms offer neither balanced QoS levels nor a mechanism for rating QoS parameters. In this paper, we propose a new replica selection strategy that is based on response time and security. Replication should nevertheless be used wisely, because the storage size of each Data Grid site is limited, so a site must keep only the important replicas. We therefore also present a new replica replacement strategy based on the availability of the file, the last time the replica was requested, the number of accesses, and the size of the replica. We evaluate our algorithm using the OptorSim simulator and find that it performs better than other algorithms in terms of mean job execution time, effective network usage, storage element (SE) usage, replication frequency, and hit ratio.
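The abstract names the criteria used for selection and replacement but not the formulas that combine them. The following is only a minimal sketch of how such criteria could be combined, not the paper's method: the weighted sum, the normalisation, the direction in which each factor counts, and every name (`Replica`, `StoredReplica`, `select_replica`, `replacement_victim`, `w_time`, `w_security`) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    site: str
    response_time: float  # estimated seconds to fetch from this site (assumed input)
    security: float       # security level of the site in [0, 1] (assumed scale)


def select_replica(replicas, w_time=0.5, w_security=0.5):
    """Pick the replica with the best weighted score (hypothetical weights)."""
    slowest = max(r.response_time for r in replicas)

    def score(r):
        # Faster sites score closer to 1; this normalisation is an assumption.
        speed = 1.0 - r.response_time / slowest if slowest > 0 else 1.0
        return w_time * speed + w_security * r.security

    return max(replicas, key=score)


@dataclass
class StoredReplica:
    name: str
    availability: float  # how available the file is elsewhere, in (0, 1] (assumed)
    last_request: float  # timestamp of the most recent request
    accesses: int        # number of accesses so far
    size: float          # replica size, e.g. in MB


def replacement_victim(stored, now):
    """Return the stored replica judged least worth keeping (illustrative rule)."""
    def value(r):
        age = max(now - r.last_request, 1.0)
        # Assumed rule: frequently and recently used files are worth more;
        # large files and files widely available elsewhere are cheaper to evict.
        return r.accesses / (age * r.size * max(r.availability, 1e-9))

    return min(stored, key=value)
```

With equal weights, a site that is both reasonably fast and reasonably secure can beat the fastest or the most secure site alone, which is the kind of balance between QoS parameters the abstract argues for. The weights would need tuning for any real workload.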
Acknowledgments
The author has received research grants from the Iranian National Science Foundation (INSF).
Cite this article
Mansouri, N. QDR: a QoS-aware data replication algorithm for Data Grids considering security factors. Cluster Comput 19, 1071–1087 (2016). https://doi.org/10.1007/s10586-016-0576-7
DOI: https://doi.org/10.1007/s10586-016-0576-7