SLA-Based Scheduling of Spark Jobs in Hybrid Cloud Computing Environments


Abstract:

Big data frameworks such as Apache Spark are becoming prominent for performing large-scale data analytics jobs in various domains. However, local or on-premise computing resources are limited and often insufficient to run these jobs. Public cloud resources can therefore be rented from cloud service providers on a pay-per-use basis to deploy a Spark cluster entirely in the cloud. Nevertheless, using only cloud resources can be costly. Hence, local and cloud resources are now used together to deploy a hybrid cloud computing cluster. However, scheduling jobs in a cluster deployed on hybrid clouds is challenging in the presence of various Service-Level Agreement (SLA) demands such as cost minimization and job deadline guarantees. Most existing works consider either a public or a locally deployed cluster and mainly focus on improving job performance in the cluster. In this article, we propose efficient scheduling algorithms that exploit differences in Virtual Machine (VM) instance pricing in a hybrid cloud deployed cluster to optimize the VM usage cost for both local and cloud resources and to maximize the percentage of jobs that meet their deadlines. We have conducted extensive simulation-based experiments to compare our proposed algorithms with the baseline approaches. In addition, we have developed a prototype system on top of the Apache Mesos cluster manager and performed real experiments to evaluate the applicability of our proposed approaches on a real platform with benchmark applications. The results show that our proposed algorithms are highly scalable and reduce the VM usage cost of a hybrid cluster by up to 20 percent.
Published in: IEEE Transactions on Computers (Volume: 71, Issue: 5, 01 May 2022)
Page(s): 1117 - 1132
Date of Publication: 27 April 2021
