
Computers & Electrical Engineering

Volume 50, February 2016, Pages 166-179

Novel heuristic speculative execution strategies in heterogeneous distributed environments

https://doi.org/10.1016/j.compeleceng.2015.06.013

Abstract

MapReduce is a promising distributed computing platform for large-scale data processing applications. Hadoop MapReduce is one of the most widely used open-source implementations of the MapReduce framework because of its flexible customization and convenient usage. Despite these advantages, relatively slow running tasks, called stragglers, impede job progress. In this study, two novel speculative strategies, namely, Estimate Remaining time Using Linear relationship model (ERUL) and extensional Maximum Cost Performance (exMCP), are developed to improve the estimation of the remaining time of a task. ERUL is a dynamic, system-load-aware strategy that overcomes some drawbacks of the Longest Approximate Time to End (LATE) strategy, which misleads speculative execution in some cases. exMCP takes the different values of slots into account. Extensive experiments show that ERUL and exMCP accurately estimate the remaining execution times of running tasks and reduce the running time of a job.


Highlights

  • A novel speculative strategy of remaining time estimation is presented.

  • An extensional maximum cost performance is developed.

  • The system load is considered while estimating the remaining time.

  • The proposed Hadoop-ERUL estimates remaining time more precisely and detects stragglers more rapidly.

Introduction

Distributed computing platforms, such as MapReduce [1] and Dryad [2], have become mainstream platforms for data processing, data mining, web indexing, and e-business. Built from commodity computers interconnected through networks, these platforms execute tasks in parallel with high reliability at low cost and can easily add or remove nodes. Hadoop is an open-source implementation of the MapReduce framework. It is written entirely in Java and provides programming interfaces for languages such as C++, Python, Perl, and shell. Large companies such as Yahoo!, Aliyun, and Facebook replace expensive computers with Hadoop for large-scale computing because Hadoop can be easily customized and used. Hadoop MapReduce consists of computing nodes (called TaskTrackers) and storage nodes (called DataNodes); in general, a computing node can serve as a storage node at the same time. The data blocks of MapReduce are stored in the Hadoop Distributed File System (HDFS), a distributed file system designed to run on commodity hardware [3]. The JobTracker divides a MapReduce job into multiple map tasks and reduce tasks and then assigns these tasks to TaskTrackers for execution. A map task processes a data block with a user-customized mapper operator and delivers the corresponding output to reduce tasks. Reduce tasks fetch their input from the map output over the network and process it with a user-customized reducer operator. Because a computing node can also be a storage node, a map task can run on the node that stores its input data block; this property is known as data locality, and it clearly reduces the execution time of map tasks. Map tasks and reduce tasks can be executed in parallel because their data are independent.
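The map/shuffle/reduce flow described above can be sketched in a few lines of Python. The mapper and reducer here are illustrative user-defined operators (a word count over text blocks), not Hadoop's actual Java API:

```python
from collections import defaultdict

def mapper(block):
    # Emit (word, 1) pairs for each word in an input data block.
    return [(word, 1) for word in block.split()]

def reducer(word, counts):
    # Aggregate all values emitted for one key.
    return (word, sum(counts))

def run_job(blocks):
    # "Map phase": each block is processed independently, so all map
    # tasks could run in parallel on different nodes.
    intermediate = defaultdict(list)
    for block in blocks:
        for key, value in mapper(block):
            intermediate[key].append(value)
    # "Shuffle" groups values by key; "reduce phase" aggregates each group.
    return dict(reducer(k, v) for k, v in intermediate.items())

print(run_job(["a b a", "b c"]))  # {'a': 2, 'b': 2, 'c': 1}
```

In a real cluster, each `mapper` call runs where its block is stored (data locality), and the shuffle moves intermediate pairs over the network to the reducers.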

If a task of a job requires an abnormally long execution time, the total completion time of the job is affected. Such a task is called a straggler. A speculative copy of this task (also called a backup task) is run on another, faster node so that the work finishes earlier than it would on the original node [4]. This mechanism is called speculative execution. In heterogeneous distributed environments, computing nodes differ in computing capability and network bandwidth, and a specific job may contain bugs. These problems can produce stragglers, which in turn lengthen the completion time of a job. Moreover, schedulers cannot acquire accurate execution information about nodes and tasks during task assignment, which degrades the performance of any scheduling strategy. As a fault-tolerance technique, speculative execution can correct the wrong decisions of schedulers to some extent, thereby improving computing efficiency.

The original implementation of speculative execution (referred to as Hadoop-Naive) [5] appeared in Hadoop-0.20 to enhance performance. However, this strategy does not work well in heterogeneous MapReduce systems. To schedule tasks in heterogeneous systems, Li [6] analyzed the performance of heuristic power-allocation and scheduling algorithms for precedence-constrained parallel tasks. Zhang et al. [7] developed algorithms to enhance task reliability when a dynamic voltage scaling (DVS) technique is applied to achieve low power consumption in heterogeneous computing systems. Xu et al. [8] proposed a genetic algorithm that performs parallel task scheduling in heterogeneous distributed environments by utilizing multiple priority queues. Shen et al. [9] presented several methods to schedule the computations needed to trace the vasculature in retinal images. Tian et al. [10] explored an adaptive data collection strategy that uses different communication radii for nodes in different locations to balance energy consumption in heterogeneous wireless sensor networks. For task management in Hadoop systems, Longest Approximate Time to End (LATE) was proposed as a strategy (Hadoop-LATE) adapted to heterogeneous environments [11], [12]. LATE yields some improvements but may misjudge which tasks are stragglers, and the strategy fails when such misjudgment occurs. It also suffers from other drawbacks, including inaccurate time estimates and wasted system resources.
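The core heuristic of LATE can be illustrated as follows: each task's remaining time is estimated from its progress rate (progress score divided by elapsed time), and the task with the longest estimated time to end becomes the speculation candidate. This is a simplified sketch only; real LATE also applies slow-task and slow-node thresholds and caps the number of concurrent backups, which are omitted here, and the names are illustrative rather than Hadoop identifiers:

```python
def time_to_end(progress, elapsed):
    # progress is a score in (0, 1]; progress rate = progress / elapsed.
    # Estimated remaining time = remaining work / observed rate.
    rate = progress / elapsed
    return (1.0 - progress) / rate

def pick_straggler(tasks):
    # tasks: {task_id: (progress_score, elapsed_seconds)}
    # Speculate on the task estimated to finish farthest in the future.
    estimates = {tid: time_to_end(p, e) for tid, (p, e) in tasks.items()}
    return max(estimates, key=estimates.get)

tasks = {"t1": (0.9, 90.0), "t2": (0.2, 80.0), "t3": (0.8, 40.0)}
print(pick_straggler(tasks))  # t2: slowest rate, longest estimated time to end
```

A key weakness, noted above, is that the rate is a historical average over the whole task lifetime, so a change in node load is reflected slowly or not at all.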

Several other components built on MapReduce and HDFS, such as HBase [13], ZooKeeper [14], Pig [15], and Hive [16], constitute the Hadoop system. The nodes must process MapReduce tasks and the tasks of these other components at the same time, which makes the system load of computing nodes unstable and thereby affects the remaining execution time of tasks. Neither Hadoop-LATE nor Maximum Cost Performance (MCP; referred to as Hadoop-MCP), described in a previous study [17], considers the effect of system load when estimating the remaining execution time. As a result, the estimated remaining time becomes inaccurate, which degrades the quality of speculation. In addition, MCP ignores the different values of slots when evaluating the value of cluster resources.

To solve these problems, we devise a heuristic speculative execution strategy called Estimate Remaining time Using Linear relationship model (Hadoop-ERUL). Targeting heterogeneous computing systems, Hadoop-ERUL takes the system load into account when estimating the remaining time of tasks. Hadoop-ERUL overcomes some deficiencies of Hadoop-LATE and is more concise and efficient than Hadoop-MCP: it estimates the remaining time more accurately, detects stragglers more rapidly and more accurately, and runs backup tasks on suitable nodes. Compared with Hadoop-LATE, Hadoop-ERUL can reduce the execution time of a job by 26%. We first presented this work in the paper entitled "A Heuristic Speculative Execution Strategy in Heterogeneous Distributed Environments" in the Proceedings of the Sixth International Symposium on Parallel Architectures, Algorithms, and Programming 2014 (PAAP-2014) [18], in which a heuristic speculative execution strategy was implemented with ERUL in heterogeneous environments. In the present work, we also devise another novel strategy, called extensional Maximum Cost Performance (exMCP), that fully exploits the different slot values ignored by MCP.
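Although the details of ERUL appear in Section 3, the load-aware idea can be sketched under the stated assumption of a linear relationship between system load and execution time: fit time ≈ a·load + b from observed samples, then predict the remaining time at the node's current load rather than from the historical average rate. The interface, sample values, and coefficients below are illustrative assumptions, not the paper's actual model:

```python
def fit_linear(loads, times):
    # Least-squares fit of times ≈ a*load + b over observed samples.
    n = len(loads)
    mean_l = sum(loads) / n
    mean_t = sum(times) / n
    a = sum((l - mean_l) * (t - mean_t) for l, t in zip(loads, times)) \
        / sum((l - mean_l) ** 2 for l in loads)
    b = mean_t - a * mean_l
    return a, b

def remaining_time(progress, current_load, a, b):
    # Predicted time per unit of work at the *current* load, scaled by
    # the fraction of work still to do.
    unit_time = a * current_load + b
    return (1.0 - progress) * unit_time

# Hypothetical samples of (system load, seconds per unit of work).
a, b = fit_linear([0.2, 0.5, 0.8], [20.0, 35.0, 50.0])
print(round(remaining_time(0.6, 0.8, a, b), 1))  # 20.0 under high load
```

The contrast with LATE is that a load spike changes `current_load` immediately, so the estimate reacts at once instead of being diluted by the task's long-run average progress rate.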

This study extends our earlier work on the factors affecting the efficiency of the algorithm. In particular, the conference paper is significantly extended with more than 30% new content, including new algorithms, discussions, and solid experimental results that do not appear in the conference version. The major contributions of this study are as follows:

  • We propose a heuristic speculative execution strategy with Hadoop-ERUL on heterogeneous environments.

  • We implement Hadoop-exMCP to overcome the drawback of ignoring the different values of the slots in heterogeneous computing systems.

  • We consider the system load when estimating the remaining time of tasks. This strategy overcomes some defects of Hadoop-LATE and is more concise and efficient than Hadoop-MCP.

  • We demonstrate that our proposed Hadoop-ERUL can be used not only to estimate the remaining time more precisely but also to detect stragglers more rapidly and more accurately, as revealed by the experimental results of a set of randomly generated task graphs and graphs of real-world problems with various characteristics.
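To illustrate the cost-performance idea that exMCP extends, the sketch below weighs a backup task's expected profit (time saved) against the cost of the slot it would occupy, with slots valued differently rather than treated as identical. All names, weights, and the threshold here are illustrative assumptions; the paper's actual formulation appears in Section 3:

```python
def cost_performance(est_remaining, est_backup_time, slot_value):
    # Profit: time the backup is expected to save versus letting the
    # original task finish. Cost: slot time consumed, weighted by the
    # value of the slot (a scarce slot on a fast node costs more).
    profit = est_remaining - est_backup_time
    cost = est_backup_time * slot_value
    return profit / cost

def should_speculate(est_remaining, est_backup_time, slot_value, threshold=1.0):
    if est_backup_time >= est_remaining:
        return False  # the backup would not finish before the original
    return cost_performance(est_remaining, est_backup_time, slot_value) > threshold

# A backup on a cheap slot is approved; the same backup on a highly
# valued slot is rejected because the weighted cost outweighs the profit.
print(should_speculate(100.0, 30.0, 0.5))  # True
print(should_speculate(100.0, 30.0, 3.0))  # False
```

Treating all slots as equal, as MCP does, corresponds to fixing `slot_value` at 1.0 for every node; exMCP's extension is precisely to let that weight vary across the heterogeneous cluster.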

The remainder of this paper is organized as follows: Several related works on speculative execution in the MapReduce framework and the relationship between system load and the execution time of tasks are discussed in Section 2. Several drawbacks of previous studies are analyzed in Section 2.3. Our new strategies, ERUL and exMCP, are presented in detail in Section 3. The performance of ERUL and exMCP is evaluated in Section 4. The conclusions and future work are presented in Section 5.

Section snippets

Related studies

In this section, previous studies on speculative execution strategies and their drawbacks are discussed, along with the effect of system load on the execution time of a task. At the end of this section, the motivation for a novel strategy to enhance Hadoop performance is presented.

Model and algorithm

The model and algorithms of this study are presented in detail in this section.

Experiments and evaluation

Several important performance metrics are introduced in this section to evaluate the proposed strategies. Hadoop-NONE, Hadoop-NAIVE, and Hadoop-LATE are compared with both Hadoop-ERUL and Hadoop-exMCP.

Conclusion

The impact of system load is important when designing a speculative strategy for a MapReduce system. In speculative execution strategies, straggler tasks are backed up on other nodes in order to reduce the completion time of a job. Considering the linear relationship between the system load and the execution time of tasks, we presented the ERUL model, which estimates the remaining running time of reduce tasks more accurately than LATE. Based on this

Acknowledgements

The research was partially funded by the Key Program of the National Natural Science Foundation of China (Grant Nos. 61133005 and 61432005), the National Natural Science Foundation of China (Grant Nos. 61370095 and 61472124), and the Graduate Innovative Fund of Hunan Province (Grant No. CX2011B127).

Xin Huang received his M.S. in computer science from the Hunan University, Changsha, in 2010. He is currently pursuing his Ph.D. degree at the Hunan University in computer sciences. His research interests include parallel high-performance computing, cloud computing and automotive distributed systems.

References (22)

  • Y. Xu et al. A genetic algorithm for task scheduling on heterogeneous computing systems using multiple priority queues. Inf Sci (2014)

  • J. Dean et al. MapReduce: simplified data processing on large clusters. Commun ACM (2008)

  • M. Isard et al. Dryad: distributed data-parallel programs from sequential building blocks. SIGOPS Oper Syst Rev (2007)

  • Hadoop HDFS architecture guide. <http://hadoop.apache.org/docs/r1.2.1/hdfs_design.html> [accessed on...

  • S. Sakr et al. The family of MapReduce and large-scale data processing systems. ACM Comput Surv (2013)

  • Schwerz A, Liberato R, Wiese I, Steinmacher I, Gerosa M, Ferreira J. Prediction of developer participation in issues of...

  • K. Li. Scheduling precedence constrained tasks with reduced processor energy on multiprocessor computers. IEEE Trans Comput (2012)

  • L. Zhang et al. Maximizing reliability with energy conservation for parallel task scheduling in a heterogeneous cluster. Inf Sci (2015)

  • H. Shen et al. Optimal scheduling of tracing computations for real-time vascular landmark extraction from retinal fundus images. IEEE Trans Inf Technol Biomed (2001)

  • H. Tian et al. Maximizing network lifetime in wireless sensor networks with regular topologies. J Supercomput (2014)

  • M. Zaharia et al. Improving MapReduce performance in heterogeneous environments

Longxin Zhang is working towards the Ph.D. degree at the College of Computer Science and Electronic Engineering, Hunan University, China. His research interests include real-time systems, power-aware computing and fault-tolerant systems, modeling and scheduling for distributed computing systems, distributed system reliability, parallel algorithms, cloud computing, and big data computing.

Renfa Li received his B.S., M.S., and Ph.D. in computer science from Tianjin University, Tianjin. He has been a professor and Ph.D. supervisor at Hunan University since 2000. His research interests include embedded systems, cyber-physical systems, and wireless sensor networks.

Lanjun Wan received his M.S. degree in computer science from the Hunan University of Technology in 2009. He is currently pursuing his Ph.D. degree at the College of Computer Science and Electronic Engineering, Hunan University. His research interests include high-performance parallel computing, parallel algorithm design and implementation, and hybrid CPU-GPU computing.

Keqin Li is a SUNY Distinguished Professor of computer science. His current research interests include parallel computing and high-performance computing, energy-efficient computing and communication, heterogeneous computing systems, cloud computing, big data computing, CPU-GPU hybrid and cooperative computing, storage and file systems, wireless communication networks, sensor networks, peer-to-peer file sharing systems, mobile computing, and service computing. He is an IEEE Fellow.

Reviews processed and recommended for publication to the Editor-in-Chief by Guest Editor Dr. Hui Tian.
