DOI: 10.1145/3547276.3548522
research-article

Runtime Techniques for Automatic Process Virtualization

Published: 13 January 2023

ABSTRACT

Asynchronous many-task runtimes look promising for the next generation of high performance computing systems. But these runtimes are usually based on new programming models, requiring extensive programmer effort to port existing applications to them. An alternative approach is to reimagine the execution model of widely used programming APIs, such as MPI, in order to execute them more asynchronously. Virtualization is a powerful technique that can be used to execute a bulk synchronous parallel program in an asynchronous manner. Moreover, if the virtualized entities can be migrated between address spaces, the runtime can optimize execution with dynamic load balancing, fault tolerance, and other adaptive techniques.

Previous work on automating process virtualization has explored compiler approaches, source-to-source refactoring tools, and runtime methods. These approaches achieve virtualization with different tradeoffs in portability (across architectures, operating systems, compilers, and linkers), in the programmer effort required, and in the ability to handle all kinds of global state and programming languages. We implement support for three related runtime methods, discuss their shortcomings and their applicability to user-level virtualized process migration, and compare their performance to existing approaches. Compared to existing approaches, one of our new methods achieves what we consider the best overall functionality in terms of portability, automation, support for migration, and runtime performance.
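
To make the global-state problem concrete, the sketch below simulates several virtualized ranks sharing one address space. It is an illustration only and not one of the methods evaluated in the paper: the ranks are modeled as Python generators scheduled round-robin, and every name in it (`my_rank_global`, `rank_body_shared`, `rank_body_privatized`, `run`) is hypothetical. The `yield` stands in for a blocking call at which the runtime switches to another rank.

```python
"""Sketch (assumed model): why global state must be privatized per virtual rank."""

NUM_VIRTUAL_RANKS = 4  # hypothetical number of ranks sharing one address space

# A "global variable" as an unmodified program might declare it.
my_rank_global = None


def rank_body_shared(rank):
    """Rank code that keeps its identity in a global shared by all ranks."""
    global my_rank_global
    my_rank_global = rank      # write the global
    yield                      # stand-in for a blocking call: another rank runs
    return my_rank_global      # read it back after being resumed


def rank_body_privatized(rank, private):
    """The same logic, but each virtual rank owns a private copy of the global."""
    private["my_rank"] = rank
    yield
    return private["my_rank"]


def run(make_rank):
    """Round-robin scheduler: resume every rank in turn until all have finished."""
    ranks = {r: make_rank(r) for r in range(NUM_VIRTUAL_RANKS)}
    results = {}
    while ranks:
        for r in list(ranks):
            try:
                next(ranks[r])
            except StopIteration as done:
                results[r] = done.value  # value returned by the rank body
                del ranks[r]
    return results


shared = run(rank_body_shared)
privatized = run(lambda r: rank_body_privatized(r, {}))
print("shared global:    ", shared)
print("privatized global:", privatized)
```

With the shared global, every rank reads back the value written by the last rank scheduled before it resumed, so all four report rank 3. With a private copy per rank, each reads back its own value. Automatic privatization aims to give unmodified programs the second behavior without manual refactoring.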


          • Published in

            ICPP Workshops '22: Workshop Proceedings of the 51st International Conference on Parallel Processing
            August 2022
            233 pages
            ISBN:9781450394451
            DOI:10.1145/3547276

            Copyright © 2022 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Qualifiers

            • research-article
            • Research
            • Refereed limited

            Acceptance Rates

            Overall Acceptance Rate: 91 of 313 submissions, 29%