Offloaded GPU Collectives Using CORE-Direct and CUDA Capabilities on InfiniBand Clusters | IEEE Conference Publication | IEEE Xplore