Abstract
Security-sensitive applications that access and generate large data sets are emerging in areas such as bioinformatics and high-energy physics. Data grids provide such data-intensive applications with a large virtual storage framework with unlimited power. However, conventional scheduling algorithms for data grids are unable to meet the security needs of data-intensive applications. In this paper we address the problem of scheduling data-intensive jobs on data grids subject to security constraints. We propose a dynamic scheduling strategy that uses a security- and data-aware technique to improve the quality of security for data-intensive applications running on data grids. To incorporate security into job scheduling, we introduce a new performance metric, degree of security deficiency, which quantitatively measures the quality of security provided by a data grid. Results based on a real-world trace confirm that the proposed scheduling strategy significantly improves security and performance over four existing scheduling algorithms, by up to 810% and 1478%, respectively.
Cite this article
Xie, T., Qin, X. Security-driven scheduling for data-intensive applications on grids. Cluster Comput 10, 145–153 (2007). https://doi.org/10.1007/s10586-007-0015-x