A Roofline-Based Performance Estimator for Distributed Matrix-Multiply on Intel CnC | IEEE Conference Publication | IEEE Xplore