ABSTRACT
A monotask is a unit of work that uses only a single type of resource (e.g., CPU, network, or disk I/O). While monotasks were primarily introduced as a means to reason about job performance, in this paper we show that this fine-grained, resource-oriented abstraction can also be leveraged by job schedulers to maximize cluster resource utilization. Although recent cluster schedulers have significantly improved resource allocation, the utilization of the allocated resources is often low because resource requests are inaccurate. In particular, we show that existing scheduling mechanisms are ineffective at handling jobs with dynamic resource usage, which is present in common workloads, and we propose a resource negotiation mechanism between job schedulers and executors that makes use of monotasks. We design a new framework, called Ursa, which enables the scheduler to capture accurate resource demands dynamically from the execution runtime and to provide timely, fine-grained resource allocation based on monotasks. Ursa also enables the execution runtime to achieve high utilization of the allocated resources. Our experiments show that Ursa improves cluster resource utilization, which translates to improved makespan and average job completion time (JCT).