DOI: 10.1145/3144769.3144774
short-paper

In Situ Workflows at Exascale: System Software to the Rescue

Published: 12 November 2017

ABSTRACT

Implementing an in situ workflow involves several challenges related to data placement, task scheduling, efficient communications, scalability, and reliability. Most of the current implementations provide reasonably performant solutions to these issues by focusing on high-performance communications and low-overhead execution models at the cost of reliability and flexibility.

One of the key design choices in such infrastructures is between providing a single-program, integrated environment or a multiple-program, connected environment, both solutions having their own strengths and weaknesses. While these approaches might be appropriate for current production systems, the expected characteristics of exascale machines will shift current priorities.

After a survey of the trade-offs and challenges of integrated and connected in situ workflow solutions available today, we discuss in this paper how exascale systems will impact those designs. In particular, we identify missing features of current system-level software required for the evolution of in situ workflows toward exascale and how system software innovations from the Argo Exascale Computing Project can help address those challenges.


          Published in

          ISAV'17: Proceedings of the In Situ Infrastructures on Enabling Extreme-Scale Analysis and Visualization
          November 2017
          53 pages
          ISBN: 9781450351393
          DOI: 10.1145/3144769

          Copyright © 2017 ACM

          Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Qualifiers

          • short-paper
          • Research
          • Refereed limited

          Acceptance Rates

          ISAV'17 Paper Acceptance Rate: 9 of 28 submissions, 32%
          Overall Acceptance Rate: 23 of 63 submissions, 37%
