DOI: 10.1145/3344341.3368798
research-article

Selecting Efficient Cloud Resources for HPC Workloads

Published: 02 December 2019

ABSTRACT

Constant advances in CPU, storage, and network virtualization are enabling high-performance computing (HPC) applications to run efficiently on cloud computing systems. In this computing model, users pay only for what they use, with no need to acquire or maintain expensive computing infrastructure. Moreover, users have multiple kinds of computing resources at their disposal and can assemble computing infrastructures that fit the application's needs. Nonetheless, the available computing resources vary in price and performance, and selecting the proper resources to execute an application is of utmost importance for optimizing cost and performance. In this work, we discuss the performance and cost implications of selecting different kinds of cloud resources to execute HPC workloads and show that the best resources for executing a given application depend not only on the application itself but also on the input dataset being processed. We also propose a methodology to support the selection of efficient cloud resources for these applications and show that it was able to select the best of 11 different cloud infrastructure configurations to execute 8 different benchmarks by executing just a few seconds of each application on each of the configurations.
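
The abstract does not spell out the selection methodology, so the following Python code is only a minimal sketch of the general idea of choosing a configuration from short probe runs. It assumes, purely for illustration, that each probe completes a known fraction of the total work and that the runtime can be extrapolated linearly from it. The configuration names, prices, timings, and the names Probe, estimated_cost, and select_cheapest are all hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Probe:
    """Result of a short trial run on one cloud configuration (all values hypothetical)."""
    config: str            # illustrative configuration name
    price_per_hour: float  # illustrative price, in USD per hour
    probe_seconds: float   # wall-clock time of the short probe run
    work_fraction: float   # fraction of the total work the probe completed


def estimated_cost(p: Probe) -> float:
    """Extrapolate total runtime linearly from the probe (an assumption), then price it."""
    est_total_seconds = p.probe_seconds / p.work_fraction
    return est_total_seconds / 3600.0 * p.price_per_hour


def select_cheapest(probes: Iterable[Probe]) -> Probe:
    """Return the probe whose configuration has the lowest estimated total cost."""
    return min(probes, key=estimated_cost)


# Hypothetical probe results: B is the fastest, but C is the cheapest overall.
probes = [
    Probe("A", price_per_hour=0.10, probe_seconds=10.0, work_fraction=0.001),
    Probe("B", price_per_hour=0.40, probe_seconds=4.0, work_fraction=0.001),
    Probe("C", price_per_hour=0.20, probe_seconds=4.5, work_fraction=0.001),
]
best = select_cheapest(probes)
```

In this made-up data the fastest configuration is not the cheapest one, which is the kind of price and performance trade-off the abstract describes. A different input dataset would change the probe timings and could change the selection.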

          Published in

            UCC'19: Proceedings of the 12th IEEE/ACM International Conference on Utility and Cloud Computing
            December 2019
            307 pages
            ISBN: 9781450368940
            DOI: 10.1145/3344341

            Copyright © 2019 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

            Publisher

            Association for Computing Machinery

            New York, NY, United States


            Acceptance Rates

            Overall Acceptance Rate: 38 of 125 submissions, 30%
