Examining Failures and Repairs on Supercomputers with Multi-GPU Compute Nodes | IEEE Conference Publication | IEEE Xplore