Abstract:
Data centers, like all complex systems, are prone to faults, and the cost of those faults can be very high. This paper focuses on detecting faults in cooling systems, in particular at the level of local fans. A hybrid approach is proposed in which a model substitutes for the real system to generate a dataset containing records of both normal and faulty operation. A machine learning algorithm, or an ensemble of algorithms, is then selected and trained on the generated data to detect the faults. To demonstrate the approach, a rack model of a real data center is created and its reliability is shown. Using the model, a dataset with both normal and abnormal records is generated. To detect faults of local fans, simple classifiers are built for every pair of a local fan and a processor unit. The classifiers are trained on one part of the generated data (training data), and their accuracy is then estimated on another part (test data). A real-time fault detection system is built on the classifiers, and the rack model is used as a substitute for the real plant to check the operability of the system.
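The workflow described above (generate labeled data from a model, train one simple classifier per fan-processor pair, estimate accuracy on held-out data) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy thermal formula standing in for the rack model, the 40% airflow of a faulty fan, the known inlet temperature `AMBIENT_C`, the pair names in `PAIRS`, and the choice of a single learned threshold as the "simple classifier".

```python
import random

AMBIENT_C = 25.0  # assumed inlet air temperature, treated as known


def simulate_record(rng, fan_faulty):
    """Toy stand-in for the rack model: returns (cpu_load, cpu_temp)."""
    load = rng.uniform(0.2, 1.0)             # CPU utilisation fraction
    airflow = 0.4 if fan_faulty else 1.0     # assumed airflow of a faulty fan
    temp = AMBIENT_C + 30.0 * load / airflow + rng.gauss(0.0, 1.0)
    return load, temp


def make_dataset(rng, n):
    """Generate n labeled records; label 1 means the fan is faulty."""
    records = []
    for _ in range(n):
        faulty = rng.random() < 0.5
        records.append((simulate_record(rng, faulty), int(faulty)))
    return records


def feature(x):
    """Temperature rise per unit of CPU load."""
    load, temp = x
    return (temp - AMBIENT_C) / load


def predict(threshold, x):
    return int(feature(x) > threshold)


def accuracy(threshold, data):
    return sum(predict(threshold, x) == y for x, y in data) / len(data)


def train_threshold(train):
    """Pick the candidate threshold with the highest training accuracy."""
    candidates = sorted(feature(x) for x, _ in train)
    return max(candidates, key=lambda thr: accuracy(thr, train))


# Hypothetical fan-processor pairs; one classifier is trained per pair.
PAIRS = [("fan1", "cpu1"), ("fan1", "cpu2"), ("fan2", "cpu3"), ("fan2", "cpu4")]

results = {}
for seed, pair in enumerate(PAIRS):
    rng = random.Random(seed)
    data = make_dataset(rng, 400)
    train, test = data[:280], data[280:]     # training data / test data
    thr = train_threshold(train)
    results[pair] = (thr, accuracy(thr, test))

for pair, (thr, acc) in results.items():
    print(pair, f"threshold={thr:.1f}", f"test accuracy={acc:.2f}")
```

A real-time detector along these lines would call `predict` with the stored threshold for each pair on every new sensor reading. The threshold rule is used here only because it is the simplest classifier that fits the description; any other classifier or ensemble could be swapped in.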
Date of Conference: 22-25 July 2019
Date Added to IEEE Xplore: 30 January 2020