Data Placement in P2P Data Grids Considering the Availability, Security, Access Performance and Load Balancing

Journal of Grid Computing

Abstract

Data dependability is an important issue in data Grids. Replication schemes have been widely used in distributed systems to ensure availability and improve access performance. Alternatively, data partitioning schemes (secret sharing, erasure coding with encryption) can provide availability and, in addition, confidentiality protection. In peer-to-peer data Grids, such confidentiality protection is essential because the nodes hosting the data shares may be untrustworthy or compromised. However, the difficulty of generating new shares and the security concerns raised by share reallocation make a pure data partitioning scheme hard to adapt to dynamic user access patterns. In this paper, we combine replication and data partitioning to assure data availability, confidentiality, load balance, and efficient access for data Grid applications. Data are partitioned and the shares are dispersed; shares may then be replicated to improve performance, load balance, and availability. We develop models for assessing confidentiality, availability, load balance, and communication cost and use them as the metrics that guide placement decisions. Because these goals conflict, we model the placement decision as a multi-objective optimization problem and use a genetic algorithm to find solutions that approximate the Pareto-optimal placements.
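
To make the notion of Pareto optimality in the abstract concrete, the following is a minimal sketch of a dominance test and non-dominated filter. It assumes each candidate placement has already been scored on four metrics that are all to be minimized; the tuple layout, the function names `dominates` and `pareto_front`, and the numbers are illustrative assumptions, not the models or data from the paper.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b on every objective and
    strictly better on at least one (all objectives are minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Sequence[float]]) -> List[int]:
    """Return the indices of points that no other point dominates."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Hypothetical placements scored as
# (communication cost, load imbalance, unavailability, confidentiality risk).
# Lower is better on every axis; the values are made up for illustration.
candidates = [
    (10.0, 0.30, 0.010, 0.20),
    (12.0, 0.25, 0.008, 0.20),
    (15.0, 0.40, 0.020, 0.30),  # worse than the first candidate on every axis
    (8.0, 0.50, 0.015, 0.10),
]

front = pareto_front(candidates)
print(front)
```

A genetic algorithm could apply a dominance test of this kind when ranking a population of candidate placements, keeping the non-dominated ones as its approximation of the Pareto front. The quadratic filter shown here is only suitable for small candidate sets.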

Author information

Corresponding author

Correspondence to Manghui Tu.

About this article

Cite this article

Tu, M., Ma, H., Xiao, L. et al. Data Placement in P2P Data Grids Considering the Availability, Security, Access Performance and Load Balancing. J Grid Computing 11, 103–127 (2013). https://doi.org/10.1007/s10723-012-9232-5

  • DOI: https://doi.org/10.1007/s10723-012-9232-5
