ABSTRACT
Software-as-a-service science gateways provide user interfaces and middleware for accessing scientific software deployed on remote high-performance computing resources and clusters. Selecting the resource for a particular job submission may be left to the user, who may lack the information needed to choose well among several options. To address this problem, we have designed and developed an extensible, scalable metascheduling system that provides automated scheduling based on resource availability and other characteristics. We develop a system model based on queuing theory to guide our implementation and provide a basis for analysis. In particular, we derive an efficiency metric from this model. We implement the metascheduling system within the open-source Apache Airavata framework for science gateways as a supplemental service that guides job submission. We measure efficiency in representative scenarios, observing efficiencies greater than 70% even in scenarios with high input rates and low job acceptance rates.