Topology-Aware Optimization of Communications for Parallel Matrix Multiplication on Hierarchical Heterogeneous HPC Platform | IEEE Conference Publication | IEEE Xplore