Abstract:
Distributed and compact thermal models underpin thermal-aware design and on-line optimization of the cooling effort in future High-Performance Computing (HPC) systems. These models can be extracted directly from the target device's thermal response by means of system identification techniques. This paper proposes a novel thermal identification approach for real-life production HPC systems. The approach extracts multiple-input single-output (MISO) thermal models from a supercomputing node in a production deployment, where the temperature measurements are affected by quantization noise and the node operates in free cooling with variable ambient temperature. It is based on an identification algorithm that combines the Frisch scheme with the instrumental variable approach. The effectiveness of the proposed methodology has been tested on a node of the CINECA Galileo Tier-1 supercomputer system.
Date of Conference: 23-26 October 2016
Date Added to IEEE Xplore: 22 December 2016