Efficient Multi-GPU Shared Memory via Automatic Optimization of Fine-Grained Transfers (IEEE Conference Publication)