Orchestration of materials science workflows for heterogeneous resources at large scale
- Univ. of Tennessee, Knoxville, TN (United States)
- Univ. of Utah, Salt Lake City, UT (United States)
- Idaho National Laboratory (INL), Idaho Falls, ID (United States)
- MicroTesting Solutions LLC, Hilliard, OH (United States)
- Johns Hopkins Univ., Laurel, MD (United States). Applied Physics Lab.
In the era of big data, materials science workflows need to handle large-scale data distribution, storage, and computation. Any of these areas can become a performance bottleneck. We present a framework for analyzing internal material structures (e.g., cracks) that mitigates these bottlenecks. We demonstrate the effectiveness of our framework for a workflow performing synchrotron X-ray computed tomography reconstruction and segmentation of a silica-based structure. Our framework provides a cloud-based solution to challenges such as growing intermediate and output data and heavy resource demands during image reconstruction and segmentation. Specifically, our framework efficiently manages data storage and scales up compute resources on the cloud. The software structure of our framework has three layers. A top layer uses Jupyter notebooks and serves as the user interface. A middle layer uses Ansible for resource deployment and for managing the execution environment. A lower layer provides resource management and job scheduling on heterogeneous nodes (i.e., GPU and CPU). At the core of this layer, Kubernetes supports resource management, and Dask enables large-scale job scheduling for heterogeneous resources. The broader impact of our work is four-fold: through our framework, we hide the complexity of the cloud’s software stack from the user, who would otherwise need expertise in cloud technologies; we manage job scheduling efficiently and in a scalable manner; we enable resource elasticity and workflow orchestration at a large scale; and we facilitate moving the study of nonporous structures, which has wide applications in engineering and scientific fields, to the cloud. While we demonstrate the capability of our framework for a specific materials science application, it can be adapted for other applications and domains because of its modular, multi-layer architecture.
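The idea of scheduling jobs on heterogeneous nodes described in the abstract can be illustrated with a minimal sketch in plain Python. This is not the framework's implementation and it does not use the Kubernetes or Dask APIs; the `Node`, `Task`, and `schedule` names are hypothetical, and the mapping of reconstruction to GPU nodes and segmentation to CPU nodes is an assumption made only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical compute node; kind is either "cpu" or "gpu"."""
    name: str
    kind: str
    assigned: list = field(default_factory=list)


@dataclass
class Task:
    """A hypothetical workflow task that requires one kind of node."""
    name: str
    needs: str


def schedule(tasks, nodes):
    """Place each task on the least-loaded node of the kind it requires."""
    placement = {}
    for task in tasks:
        candidates = [n for n in nodes if n.kind == task.needs]
        if not candidates:
            raise ValueError(f"no {task.needs} node available for {task.name}")
        # min() returns the first candidate on ties, so placement is deterministic.
        target = min(candidates, key=lambda n: len(n.assigned))
        target.assigned.append(task.name)
        placement[task.name] = target.name
    return placement


# Assumed for illustration: reconstruction wants a GPU, segmentation a CPU.
nodes = [Node("gpu-0", "gpu"), Node("cpu-0", "cpu"), Node("cpu-1", "cpu")]
tasks = [
    Task("reconstruct-0", "gpu"),
    Task("reconstruct-1", "gpu"),
    Task("segment-0", "cpu"),
    Task("segment-1", "cpu"),
    Task("segment-2", "cpu"),
]
placement = schedule(tasks, nodes)
for task_name, node_name in placement.items():
    print(f"{task_name} -> {node_name}")
```

In the framework itself, this matching of tasks to node types is delegated to Dask on top of Kubernetes-managed resources; the sketch only shows the constraint being enforced, namely that a task runs on a node offering the resource it needs.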
- Research Organization:
- Idaho National Laboratory (INL), Idaho Falls, ID (United States)
- Sponsoring Organization:
- USDOE; National Science Foundation (NSF)
- Grant/Contract Number:
- AC07-05ID14517; 1841758; 2028923; 2103845; 2138811
- OSTI ID:
- 1986538
- Report Number(s):
- INL/JOU-23-71771-Rev000; TRN: US2402761
- Journal Information:
- International Journal of High Performance Computing Applications, Vol. 37, Issue 3-4; ISSN 1094-3420
- Publisher:
- SAGE
- Country of Publication:
- United States
- Language:
- English