Design and implementation of an analytical framework for interference aware job scheduling on Apache Spark platform

Wang, Kewen; Khan, Mohammad Maifi Hasan; Nguyen, Nhan; Gokhale, Swapna

doi:10.1007/s10586-017-1466-3

Published: 23 December 2017

Cluster Computing, Volume 22, pages 2223–2237 (2019)

Abstract

Apache Spark is a recently popularized open-source platform that is increasingly used for large-scale data analytics applications. Performance prediction in such systems is important for efficient job scheduling and resource allocation, but interference among multiple Apache Spark jobs running concurrently in a virtualized environment makes prediction extremely difficult. This paper addresses that problem. First, we develop data-driven analytical models to estimate the effect of interference among multiple Apache Spark jobs on job execution time in virtualized cloud environments. Next, we present the design of an interference-aware job scheduling algorithm that leverages the developed analytical framework. We evaluated the accuracy of our models using four real-life applications (PageRank, K-means, logistic regression, and word count) on a 6-node cluster while running up to four jobs concurrently. Our experimental results show that the scheduling algorithm significantly reduces both the average execution time of individual jobs and the total execution time; the reduction ranges between 47 and 26% for individual jobs and 2–13% for total execution time.
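The abstract does not spell out the analytical models or the scheduling algorithm, so the following is only a toy sketch of the general idea of interference-aware scheduling, not the authors' method. It assumes a hypothetical table of stand-alone execution times (SOLO) and hypothetical pairwise slowdown factors (SLOWDOWN); every number and every function name is invented for illustration. Given four jobs, it enumerates the ways to split them into two sequential batches of two co-running jobs and picks the split with the smallest predicted total execution time.

```python
# Assumed stand-alone execution times in seconds (illustrative, not measured).
SOLO = {"pagerank": 100.0, "kmeans": 80.0, "logreg": 60.0, "wordcount": 40.0}

# Assumed symmetric pairwise slowdown factors (illustrative only): the
# multiplicative increase in execution time when the two jobs co-run.
SLOWDOWN = {
    frozenset({"pagerank", "kmeans"}): 1.5,
    frozenset({"pagerank", "logreg"}): 1.2,
    frozenset({"pagerank", "wordcount"}): 1.1,
    frozenset({"kmeans", "logreg"}): 1.4,
    frozenset({"kmeans", "wordcount"}): 1.2,
    frozenset({"logreg", "wordcount"}): 1.3,
}


def predicted_time(job, batch):
    """Stand-alone time scaled by the assumed slowdown from each co-runner."""
    t = SOLO[job]
    for other in batch:
        if other != job:
            t *= SLOWDOWN[frozenset({job, other})]
    return t


def batch_makespan(batch):
    """A batch finishes when its slowest job finishes."""
    return max(predicted_time(job, batch) for job in batch)


def best_pairing(jobs):
    """Try every split of four jobs into two sequential batches of two.

    Returns (predicted total time, [first batch, second batch]).
    """
    jobs = sorted(jobs)
    first = jobs[0]
    best = None
    # Fixing the first job's partner enumerates each split exactly once.
    for partner in jobs[1:]:
        a = (first, partner)
        b = tuple(j for j in jobs if j not in a)
        total = batch_makespan(a) + batch_makespan(b)
        if best is None or total < best[0]:
            best = (total, [a, b])
    return best


if __name__ == "__main__":
    total, batches = best_pairing(SOLO)
    print(f"predicted total time: {total:.1f} s, batches: {batches}")
```

Exhaustive enumeration is only practical for a handful of jobs; it is used here to keep the sketch short, and a real scheduler would need a measured interference model and a scalable search strategy.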

Notes

http://spark.apache.org/faq.html.

References

Barbierato, E., Gribaudo, M., Iacono, M.: Performance evaluation of NoSQL big-data applications using multi-formalism models. Fut. Gen. Comput. Syst. 37, 345–353 (2014)
Brun, C., Artés, T., Margalef, T., Cortés, A.: Coupling wind dynamics into a DDDAS forest fire propagation prediction system. Procedia Comput. Sci. 9, 1110–1118 (2012)
Bu, X., Rao, J., Xu, C.Z.: Interference and locality-aware task scheduling for MapReduce applications in virtual clusters. In: Proceedings of the 22nd International Symposium on High-Performance Parallel and Distributed Computing, pp. 227–238. ACM, New York (2013)
Chaisiri, S., Lee, B.S., Niyato, D.: Optimization of resource provisioning cost in cloud computing. IEEE Trans. Serv. Comput. 5(2), 164–177 (2012)
Chen, X., Rupprecht, L., Osman, R., Pietzuch, P., Franciosi, F., Knottenbelt, W.: CloudScope: diagnosing and managing performance interference in multi-tenant clouds. In: 2015 IEEE 23rd International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS), pp. 164–173. IEEE (2015)
Cheng, D., Rao, J., Jiang, C., Zhou, X.: Resource and deadline-aware job scheduling in dynamic Hadoop clusters. In: 2015 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp. 956–965. IEEE (2015)
Chiang, R.C., Hwang, J., Huang, H.H., Wood, T.: Matrix: achieving predictable virtual machine performance in the clouds. In: 11th International Conference on Autonomic Computing (ICAC 14) (2014)
Delimitrou, C., Kozyrakis, C.: Quasar: resource-efficient and QoS-aware cluster management. ACM SIGPLAN Not. 49(4), 127–144 (2014)
Didona, D., Quaglia, F., Romano, P., Torre, E.: Enhancing performance prediction robustness by combining analytical modeling and machine learning. In: Proceedings of the International Conference on Performance Engineering (ICPE). ACM, New York (2015)
Dstat: Versatile resource statistics tool. http://dag.wiee.rs/home-made/dstat/
Fujimoto, R., Guin, A., Hunter, M., Park, H., Kanitkar, G., Kannan, R., Milholen, M., Neal, S., Pecher, P.: A dynamic data driven application system for vehicle tracking. Procedia Comput. Sci. 29, 1203–1215 (2014)
Herodotou, H., Lim, H., Luo, G., Borisov, N., Dong, L., Cetin, F.B., Babu, S.: Starfish: a self-tuning system for big data analytics. CIDR 11, 261–272 (2011)
Apache Hadoop. https://hadoop.apache.org/
Khan, M., Jin, Y., Li, M., Xiang, Y., Jiang, C.: Hadoop performance modeling for job estimation and resource provisioning. IEEE Trans. Parallel Distrib. Syst. 27(2), 441–454 (2016)
Lai, C.A., Wang, Q., Kimball, J., Li, J., Park, J., Pu, C.: IO Performance interference among consolidated n-tier applications: sharing is better than isolation for disks. In: 2014 IEEE 7th International Conference on Cloud Computing (CLOUD), pp. 24–31. IEEE (2014)
Li, S., Da Xu, L., Zhao, S.: The internet of things: a survey. Inf. Syst. Front. 17(2), 243–259 (2015)
Mozafari, B., Curino, C., Jindal, A., Madden, S.: Performance and resource modeling in highly-concurrent OLTP workloads. In: Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, pp. 301–312. ACM, New York (2013)
Mozafari, B., Curino, C., Madden, S.: DBSeer: resource and performance prediction for building a next generation database cloud. In: CIDR (2013)
Noorshams, Q., Busch, A., Rentschler, A., Bruhn, D., Kounev, S., Tuma, P., Reussner, R.: Automated modeling of I/O performance and interference effects in virtualized storage systems. In: 2014 IEEE 34th International Conference on Distributed Computing Systems Workshops (ICDCSW), pp. 88–93. IEEE (2014)
Ousterhout, K., Rasti, R., Ratnasamy, S., Shenker, S., Chun, B.G.: Making sense of performance in data analytics frameworks. In: NSDI, vol. 15, pp. 293–307 (2015)
Patel, P., Bansal, D., Yuan, L., Murthy, A., Greenberg, A., Maltz, D.A., Kern, R., Kumar, H., Zikos, M., Wu, H., et al.: Ananta: cloud scale load balancing. In: ACM SIGCOMM Computer Communication Review, vol. 43, pp. 207–218. ACM, New York (2013)
Patra, A., Bursik, M., Dehn, J., Jones, M., Pavolonis, M., Pitman, E.B., Singh, T., Singla, P., Webley, P.: A DDDAS framework for volcanic ash propagation and hazard analysis. Procedia Comput. Sci. 9, 1090–1099 (2012)
Popescu, A.D., Balmin, A., Ercegovac, V., Ailamaki, A.: PREDIcT: towards predicting the runtime of large scale iterative analytics. Proc. VLDB Endow. 6(14), 1678–1689 (2013)
Prudencio, E.E., Bauman, P.T., Williams, S., Faghihi, D., Ravi-Chandar, K., Oden, J.T.: A dynamic data driven application system for real-time monitoring of stochastic damage. Procedia Comput. Sci. 18, 2056–2065 (2013)
Sharma, B.P., Wood, T., Das, C.R.: HybridMR: a hierarchical MapReduce scheduler for hybrid data centers. In: 2013 IEEE 33rd International Conference on Distributed Computing Systems (ICDCS), pp. 102–111. IEEE (2013)
Sloan Digital Sky Survey. http://www.sdss.org/
Stanford SNAP. http://snap.stanford.edu/
Tan, Y., Nguyen, H., Shen, Z., Gu, X., Venkatramani, C., Rajan, D.: Prepare: predictive performance anomaly prevention for virtualized cloud systems. In: 2012 IEEE 32nd International Conference on Distributed Computing Systems (ICDCS), pp. 285–294. IEEE (2012)
Vodacek, A., Kerekes, J.P., Hoffman, M.J.: Adaptive optical sensing in an object tracking DDDAS. Procedia Comput. Sci. 9, 1159–1166 (2012)
Wang, K., Khan, M.M.H.: Performance prediction for Apache Spark platform. In: 2015 IEEE 17th International Conference on High Performance Computing and Communications (HPCC), pp. 166–173. IEEE (2015)
Wang, K., Khan, M.M.H., Gokhale, S.: Modeling interference for Apache Spark jobs. In: Proceedings of IEEE International Conference on Cloud Computing (CLOUD). San Francisco, USA (2016)
Xen Project. http://www.xenproject.org/
Zaharia, M., Chowdhury, M., Franklin, M.J., Shenker, S., Stoica, I.: Spark: cluster computing with working sets. In: Proceedings of the 2nd USENIX Conference on Hot Topics in Cloud Computing (2010)
Zaharia, M., Chowdhury, M., Das, T., Dave, A., Ma, J., McCauley, M., Franklin, M.J., Shenker, S., Stoica, I.: Resilient distributed datasets: a fault-tolerant abstraction for in-memory cluster computing. In: Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation, pp. 2–2. USENIX Association, Berkeley (2012)
Zhang, W., Rajasekaran, S., Wood, T., Zhu, M.: MIMP: deadline and interference aware scheduling of Hadoop virtual machines. In: 2014 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), pp. 394–403. IEEE (2014)
Zhang, Z., Cherkasova, L., Loo, B.T.: Performance modeling of MapReduce jobs in heterogeneous cloud environments. In: Proceedings of the 2013 IEEE Sixth International Conference on Cloud Computing, pp. 839–846. IEEE Computer Society (2013)
Zhu, Q., Tung, T.: A performance interference model for managing consolidated workloads in QoS-aware clouds. In: 2012 IEEE 5th International Conference on Cloud Computing (CLOUD). IEEE (2012)

Acknowledgements

This material is based upon work supported by the Air Force Office of Scientific Research Award No. FA9550-15-1-0184 under the DDDAS program. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the funding agency.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, University of Connecticut, Storrs, CT, USA
Kewen Wang, Mohammad Maifi Hasan Khan, Nhan Nguyen & Swapna Gokhale

Corresponding author

Correspondence to Mohammad Maifi Hasan Khan.

Additional information

This paper is a significantly extended version of the authors’ prior work [30, 31] and includes the design and evaluation of interference-aware job scheduling algorithms, which were not presented in that prior work.

About this article

Cite this article

Wang, K., Khan, M.M.H., Nguyen, N. et al. Design and implementation of an analytical framework for interference aware job scheduling on Apache Spark platform. Cluster Comput 22 (Suppl 1), 2223–2237 (2019). https://doi.org/10.1007/s10586-017-1466-3

Received: 27 November 2016
Revised: 25 May 2017
Accepted: 29 November 2017
Published: 23 December 2017
Issue Date: 16 January 2019
DOI: https://doi.org/10.1007/s10586-017-1466-3
