Hierarchical Distributed-Memory Multi-Leader MPI-Allreduce for Deep Learning Workloads | IEEE Conference Publication | IEEE Xplore