ABSTRACT
The High-Performance Computing (HPC) community has adopted operating-system-level virtualization, also known as containerization, for its near-native performance relative to bare-metal environments. Despite several existing container solutions for HPC, users remain unconvinced that container orchestration is suitable for their extreme-scale applications, owing to the lack of thorough performance assessment of recent advances. A key concern in current-generation containerized HPC is the performance degradation caused by resource interference from co-hosted applications. This paper proposes an analytical model for a data-locality and memory-bandwidth contention-aware container placement strategy within our containerized High-Performance Computing environment (cHPCe). Performance is evaluated and compared against LXD, Docker Swarm, Kubernetes, and Singularity using the HPC Challenge and NAS Parallel Benchmarks. To the best of our knowledge, no comparative analysis with insight into data-locality and memory-bandwidth contention-aware container placement has been reported. The experimental results show that data-locality and memory-bandwidth contention awareness reduces the overall benchmark execution time in cHPCe by 51.41% in the best case compared to Docker Swarm.
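The placement idea the abstract describes can be illustrated with a minimal sketch: score each candidate node by predicted memory-bandwidth contention plus the fraction of input data that would have to be moved, then pick the lowest score. All names, weights, and the scoring formula below are illustrative assumptions for exposition, not the paper's actual analytical model.

```python
# Hypothetical sketch of a data-locality and memory-bandwidth
# contention-aware placement score. The node fields, the linear
# weighting (alpha), and the contention estimate are all assumed
# for illustration; they do not reproduce the paper's model.
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    bw_capacity: float    # peak memory bandwidth (GB/s), assumed known
    bw_used: float        # bandwidth consumed by co-hosted containers
    local_data_gb: float  # input data already resident on this node

def placement_score(node: Node, demand_bw: float, data_gb: float,
                    alpha: float = 0.5) -> float:
    """Lower is better: combines predicted bandwidth oversubscription
    with the fraction of input data that must be fetched remotely."""
    contention = max(0.0, (node.bw_used + demand_bw) - node.bw_capacity)
    remote_fraction = 1.0 - min(node.local_data_gb, data_gb) / data_gb
    return alpha * contention + (1.0 - alpha) * remote_fraction

def place(nodes, demand_bw, data_gb):
    # Greedy choice: the node with the lowest combined score.
    return min(nodes, key=lambda n: placement_score(n, demand_bw, data_gb))

nodes = [
    Node("n1", bw_capacity=100.0, bw_used=90.0, local_data_gb=8.0),
    Node("n2", bw_capacity=100.0, bw_used=20.0, local_data_gb=2.0),
]
# n1 holds all the data but is bandwidth-saturated; n2 wins despite
# having to fetch most of the input remotely.
best = place(nodes, demand_bw=30.0, data_gb=8.0)
print(best.name)  # → n2
```

A real scheduler would refine both terms, e.g. estimating per-container bandwidth demand online and accounting for NUMA locality within a node, but the trade-off between contention and data movement is the same one the abstract's model captures.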
Index Terms
- cHPCe: Data Locality and Memory Bandwidth Contention-aware Containerized HPC