An application-level failure detection algorithm based on a robust and efficient torus-tree for HPC | IEEE Conference Publication