Service task partition and distribution in star topology computer grid subject to data security constraints

https://doi.org/10.1016/j.ress.2011.06.013

Abstract

The paper considers grid computing systems in which the resource management system (RMS) can divide service tasks into execution blocks (EBs) and send these blocks to different resources. To provide a desired level of service reliability, the RMS can assign the same blocks to several independent resources for parallel execution.

Data security is a crucial issue in distributed computing that affects the execution policy. By optimally partitioning the service task into EBs and distributing them among resources, one can achieve the greatest possible service reliability and/or expected performance subject to data security constraints. The paper suggests an algorithm for solving this optimization problem. The algorithm is based on the universal generating function technique and on the evolutionary optimization approach. Illustrative examples are presented.

Highlights

► Grid service with star topology is considered.
► An algorithm for evaluating service reliability and data security is presented.
► A tradeoff between the service reliability and data security is analyzed.
► A procedure for optimal service task partition and distribution is suggested.

Introduction

Cloud computing and grid computing are closely interconnected, recently emerged concepts of delivering information technology services as computing utilities [1]. These concepts are based on distributed internet-based computing where shared hardware resources, software, and information are provided to customers on demand [2].

The real and specific problem that underlies the grid concept is coordinated resource sharing in distributed dynamic and multi-institutional virtual organizations. The resource sharing is not primarily file exchange but rather direct access to computers, software, data, and other resources. This is required by a range of collaborative problem-solving and resource-brokering strategies emerging in industry, science, and engineering.

The sharing is controlled by a Resource Management System (RMS) [3]. When the RMS receives a service request from a user, the task can be divided into a set of execution blocks (EBs) that are executed in parallel. The RMS assigns those EBs to available resources for execution. After the resources finish the assigned jobs, they return the results to the RMS, which integrates them into the entire task output requested by the user.

The above grid service process can be approximated by a structure with star topology, as depicted in Fig. 1, where the center is the RMS directly connected with the resources through respective communication channels.

The performance of grid computing is of great concern. The usual measure of grid performance is the task execution time (service time). This index can be significantly improved by an RMS that divides a task into a set of EBs, which can be executed in parallel by multiple online resources. Many complicated and time-consuming tasks that could not be implemented before now run well in the grid computing environment.

The service time is a random variable affected by many factors [3]. First, many resources are available online, and they have different task processing speeds. Thus, the task execution time can vary depending on which resource is assigned to execute the EB. Second, some resources can fail when running the jobs, so the execution time is also affected by the resource reliability. Similarly, the communication links in grid service can fail during the data transmission. Thus, the communication reliability, like the data transmission speed in the communication channels, influences the service time.

The choice of the group of subtasks assigned to the same EB and running on the same resource can influence the total amount of data transmitted between the RMS and the resource, since different subtasks can use common input data blocks. This also affects the entire service time.
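The effect of grouping on the transmitted data volume can be illustrated with a minimal sketch. It assumes that each EB is sent to a single resource and that a data block shared by several subtasks of the same EB is transmitted only once; the subtask names, block names and block sizes are hypothetical and are not taken from the paper's tables.

```python
# Hypothetical sizes of input data blocks (arbitrary units).
block_size = {"d1": 40, "d2": 25, "d3": 60, "d4": 10}

# Hypothetical subtasks and the input data blocks each of them reads.
subtask_inputs = {
    "s1": {"d1", "d2"},
    "s2": {"d2", "d3"},
    "s3": {"d1", "d2"},
    "s4": {"d3", "d4"},
}

def eb_input_volume(eb):
    """Input data sent for one EB: every distinct block is counted once."""
    needed = set().union(*(subtask_inputs[s] for s in eb))
    return sum(block_size[b] for b in needed)

def partition_volume(partition):
    """Total input data sent for all EBs of a partition."""
    return sum(eb_input_volume(eb) for eb in partition)

# Two ways of splitting the same four subtasks into two EBs.
grouped_by_shared_data = [["s1", "s3"], ["s2", "s4"]]
grouped_arbitrarily = [["s1", "s4"], ["s2", "s3"]]

volume_shared = partition_volume(grouped_by_shared_data)   # 65 + 95
volume_arbitrary = partition_volume(grouped_arbitrarily)   # 135 + 125
```

Under these assumed numbers, placing subtasks with common inputs in the same EB requires 160 units of input data, whereas the arbitrary grouping requires 260 units.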

Data security is also a crucial issue in distributed computing. The greater the number of resources and communication channels that process and transmit a data block, the greater the chance of unauthorized access (UA) to that block. Different channels and resources can have different security levels. Therefore, the distribution of subtasks among the resources affects the data security during the service execution.

Most previous studies treated performance, reliability, and security as separate fields and examined them individually. In fact, performance, reliability, and security are closely related and affect each other, particularly when grid computing is implemented. For example, when a task is divided into n different EBs executed by n resources simultaneously, the performance is high but the reliability can be low, because the failure of any resource leaves the entire task incomplete. This forces a task restart, which in turn increases its execution time (i.e., reduces its performance). Therefore, it is worth having some redundant resources execute the same EB, especially when the resources are failure-prone. However, too much redundancy, even though it improves the reliability, can decrease the performance by not fully parallelizing the execution of different subtasks, and it considerably reduces data security, because multiple replicas of the same data are processed by different resources, which increases the chances of UA. Thus, performance, reliability, and security should be studied together in the grid service analysis.
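The tradeoff can be made concrete with a small numerical sketch. It is not the model of the paper: it simply assumes statistically independent resources, a hypothetical probability p_ok that a resource with its channel completes an EB, and a hypothetical probability p_ua that data sent to one resource is exposed to UA.

```python
from math import prod

def all_succeed(probabilities):
    """Probability that every independent event in the list occurs."""
    return prod(probabilities)

def at_least_one(probabilities):
    """Probability that at least one independent event in the list occurs."""
    return 1 - prod(1 - p for p in probabilities)

# Hypothetical values, chosen only for illustration.
p_ok = 0.9    # a resource (with its channel) completes the assigned EB
p_ua = 0.05   # data sent to one resource is exposed to unauthorized access
n = 4         # number of distinct EBs

# No redundancy: each EB runs on one resource, and all n EBs must finish.
reliability_plain = all_succeed([p_ok] * n)
ua_per_block_plain = at_least_one([p_ua])

# Each EB duplicated on two resources: an EB finishes if either copy finishes,
# but its data is now exposed on two resources instead of one.
reliability_dup = all_succeed([at_least_one([p_ok] * 2)] * n)
ua_per_block_dup = at_least_one([p_ua] * 2)
```

With these assumed values, duplication raises the service reliability from about 0.656 to about 0.961, while the UA probability for the data of each EB grows from 0.05 to 0.0975.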

The first algorithm for optimizing the division of a service task into EBs and distribution of these EBs among available grid resources was suggested in [4]. The algorithm considered performance and reliability of grid service without taking into account the data security issue. This paper considers the service performance optimization subject to reliability and data security constraints.

Section 2 of the paper presents the grid service reliability, security and performance model. Section 3 describes an algorithm to efficiently obtain the service time distribution using the universal generating function technique. Section 4 describes the optimization technique. Section 5 provides illustrative examples and Section 6 concludes.

Section snippets

Service execution by the grid system with star architecture

Different resources are distributed in the grid system. The considered service can use a given set of resources. All the resources and communication channels from this set are available at the time when the request for service arrives at the RMS (unavailable resources are detected by the RMS and, thus, not involved in the service). Each resource is directly connected to the RMS by a single communication channel, forming the star topology. Each resource with the corresponding channel is

Algorithm for determining the pmf of the service time

The procedure used for the evaluation of the service time distribution is based on the universal generating function (u-function) technique, which was introduced in [10] and has proved to be very effective for the reliability evaluation of different types of multi-state systems [11]. The main advantage of this technique is its high computational efficiency, which allows it to be used in optimization procedures where a large number of different solutions must be evaluated.
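The general idea of the u-function technique can be sketched as follows; this is an illustration under stated assumptions, not the procedure of the paper, which is not reproduced in this excerpt. A discrete random variable is stored as a map from values to probabilities, and two independent variables are combined by applying an operator to every pair of values and multiplying the corresponding probabilities. The sketch assumes hypothetical resources r1, r2 and r3, represents a failure as an infinite completion time, takes the minimum over replicas of the same EB and the maximum over distinct EBs. All numbers are invented.

```python
from collections import defaultdict
from math import inf

def compose(u1, u2, op):
    """Combine the pmfs of two independent variables under the operator op.

    Each pmf is a dict mapping a value to its probability; equal results
    of op are collected into a single term.
    """
    out = defaultdict(float)
    for x1, p1 in u1.items():
        for x2, p2 in u2.items():
            out[op(x1, x2)] += p1 * p2
    return dict(out)

# Hypothetical resources: completion time of the assigned EB with its
# probability; inf stands for a failure (the EB is never completed).
r1 = {10.0: 0.9, inf: 0.1}
r2 = {15.0: 0.8, inf: 0.2}
r3 = {12.0: 0.95, inf: 0.05}

# EB "a" is replicated on r1 and r2: it is done when the first copy is done.
eb_a = compose(r1, r2, min)

# EB "b" runs on r3 only; the task is done when both EBs are done.
task = compose(eb_a, r3, max)

# Probability that the task is completed at all.
reliability = 1 - task.get(inf, 0.0)
```

Under these assumptions the resulting pmf of the service time has three terms: 12 time units with probability 0.855, 15 time units with probability 0.076, and failure with probability 0.069, so the service reliability is 0.931.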

Optimization technique

Formulations (10), (11) define a complicated NP-complete partitioning/allocation problem. An exhaustive examination of all possible solutions is not realistic within reasonable time limits. As in most combinatorial optimization problems, the quality of a given solution is the only information available during the search for the optimal solution. Therefore, a heuristic search algorithm is needed which uses only estimates of solution quality and which does not require derivative

Numerical examples

Consider a grid service that uses six resources distributed in the grid system. Parameters of grid resources and the corresponding communication channels (processing/transmission speeds and failure rates) are presented in Table 1. The entire service task can be divided into eight independent subtasks. The computational complexity, the amount of output data and the list of input data blocks for each subtask are presented in Table 2. The amount of data in each input data block is presented in

Conclusions

Cloud and grid computing are newly developed concepts of delivering information technology services in large-scale distributed systems. These concepts allow effective distribution of computational tasks among different resources present in the grid. The resource management system (RMS) can divide a service task into subtasks and send the subtasks to different resources for parallel execution. In order to provide a desired level of service reliability the RMS can assign the same subtasks to

Acknowledgement

Yanping Xiang acknowledges support from the National Natural Science Foundation of China (No. 60974089).

References (16)

  • G. Levitin et al. Optimal service task partition and distribution in grid system with star topology. Reliability Engineering and System Safety (2008)
  • J. Rubinovitz et al. Genetic algorithm for assembly line balancing. International Journal of Production Economics (1995)
  • G. Levitin et al. Genetic algorithm for open loop distribution system design. Electric Power Systems Research (1995)
  • Foster I, Yong Zhao, Raicu I, Lu S. Cloud Computing and Grid Computing 360-Degree Compared. Grid Computing Environments...
  • K. Krauter et al. A taxonomy and survey of grid resource management systems for distributed computing. Software: Practice and Experience (2002)
  • England D, Weissman JB. A stochastic control model for the deployment of dynamic grid services. In: Proceedings of the...
  • O. Alhazmi et al. Application of vulnerability discovery models to major operating systems. IEEE Transactions on Reliability (2008)
  • Alhazmi O, Malaiya Y. Quantitative vulnerability assessment of systems software. In: Proceedings of the Annual...
There are more references available in the full text version of this article.
