Adaptive popularity-driven replica placement in hierarchical data grids

The Journal of Supercomputing

Abstract

Data grids give large numbers of users access to widely distributed storage holding potentially many large files. Efficient access is hindered by the high latency of the Internet. To improve access time, files may be replicated at nearby sites. Replication also provides high availability, decreased bandwidth use, enhanced fault tolerance, and improved scalability. Resource availability, network latency, and user requests in a grid environment may vary with time, so any replica placement strategy must be able to adapt to such dynamic behavior. In this paper, we describe a new dynamic replica placement algorithm for hierarchical data grids, Popularity Based Replica Placement (PBRP), which is guided by file “popularity”. Our goal is to place replicas close to clients to reduce data access time while still using network and storage resources efficiently. The effectiveness of PBRP depends on the selection of a threshold value related to file popularity. We also present Adaptive-PBRP (APBRP), which determines this threshold dynamically based on data request arrival rates. We evaluate both algorithms using simulation. Results for a range of data access patterns show that our algorithms can shorten job execution time significantly and reduce bandwidth consumption compared to other dynamic replication methods.
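
The general idea of popularity-guided placement with a threshold can be illustrated with a small sketch. This is not the PBRP or APBRP algorithm from the paper, whose details the abstract does not give. It only assumes that request counts are aggregated up a tree of sites and that a replica goes to the deepest sites whose aggregated count meets a threshold. The names `Node`, `popularity`, `place_replicas`, and `adaptive_threshold`, and the `share` parameter, are hypothetical, as is the rule of setting the threshold to a fixed fraction of the requests observed in an interval.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    """A site in the grid hierarchy (hypothetical structure for illustration)."""
    name: str
    children: list[Node] = field(default_factory=list)
    hits: Counter = field(default_factory=Counter)  # file id -> requests seen at this site


def popularity(node: Node, file_id: str) -> int:
    """Requests for file_id issued anywhere in the subtree rooted at node."""
    return node.hits[file_id] + sum(popularity(c, file_id) for c in node.children)


def place_replicas(node: Node, file_id: str, threshold: float) -> list[str]:
    """Names of the deepest sites whose subtree popularity meets the threshold."""
    if popularity(node, file_id) < threshold:
        return []
    deeper = [n for c in node.children for n in place_replicas(c, file_id, threshold)]
    # Prefer sites closer to the clients; fall back to this site if no child qualifies.
    return deeper or [node.name]


def adaptive_threshold(root: Node, file_id: str, share: float) -> float:
    """Assumed rule: threshold is a fixed share of the requests seen in the interval."""
    return share * popularity(root, file_id)
```

With a fixed threshold, a busy interval pushes replicas toward the leaves and a quiet one places none at all. Tying the threshold to the observed request volume, as `adaptive_threshold` does under the stated assumption, keeps the placement depth roughly stable as load changes.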

Author information

Corresponding author

Correspondence to Mohammad Shorfuzzaman.

About this article

Cite this article

Shorfuzzaman, M., Graham, P. & Eskicioglu, R. Adaptive popularity-driven replica placement in hierarchical data grids. J Supercomput 51, 374–392 (2010). https://doi.org/10.1007/s11227-009-0371-9

  • DOI: https://doi.org/10.1007/s11227-009-0371-9
