Design and Implementation of an Integrated Fault-Supervising System for Large HPCs (IEEE Conference Publication)