
SiaDFP: A Disk Failure Prediction Framework Based on Siamese Neural Network in Large-Scale Data Center


Abstract:

With the rapid development of cloud services, service providers increasingly rely on dependable storage systems equipped with large-capacity disks to ensure data availability. The primary source of unreliability in such storage systems is disk failure. In recent years, proactive methods based on machine learning models have emerged that aim to predict impending disk failures by leveraging the SMART attributes of disks. These methods enable service providers to back up stored data in time. While these methods prove more effective and efficient in disk failure prediction, they still face challenges, such as inadequate mining of abnormal information and imbalanced classification. In this paper, we analyze the change of data distribution in hard disks. From the data analysis, we observe that the data distribution of a failed disk changes markedly during the period before the failure, while that of a healthy disk changes insignificantly during operation. Motivated by this observation, we propose a novel framework named SiaDFP, based on a Siamese neural network, designed to predict impending disk failures by capturing the distribution changes in failed disks. Additionally, by analyzing disk data trends, we observe that failed disks exhibit change points as an abnormal feature. To fully mine the abnormal information inherent in failed disks, we propose the CP-MAP mechanism and the 2D-Attention mechanism. Furthermore, we present a subsampling approach named Region Balanced Sampling to address the challenge of imbalanced classification. Experiments on the real-world Backblaze and Baidu datasets demonstrate that SiaDFP achieves outstanding performance in the task of disk failure prediction.
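The core idea of comparing an earlier and a more recent window of a disk's SMART readings through one shared encoder can be sketched roughly as follows. This is only an illustration of the Siamese-style comparison, not the SiaDFP architecture: the names `embed` and `change_score`, the mean/standard-deviation summary, the untrained orthogonal weights, and the synthetic data are all assumptions made for this example, and CP-MAP, 2D-Attention, and Region Balanced Sampling are not modeled.

```python
import numpy as np

def embed(window, weights):
    """Shared encoder (hypothetical): summarize a (time, attributes) window
    by per-attribute mean and standard deviation, then project linearly."""
    stats = np.concatenate([window.mean(axis=0), window.std(axis=0)])
    return weights @ stats

def change_score(early, recent, weights):
    """Euclidean distance between the embeddings of two windows of one disk.
    Both windows pass through the same weights, as in a Siamese network."""
    return float(np.linalg.norm(embed(early, weights) - embed(recent, weights)))

rng = np.random.default_rng(0)
n_attrs = 4  # number of SMART attributes, chosen arbitrarily here

# Untrained stand-in for learned weights: a random orthogonal matrix,
# which preserves distances between the summary vectors.
weights, _ = np.linalg.qr(rng.normal(size=(2 * n_attrs, 2 * n_attrs)))

# Synthetic windows: a healthy disk keeps its distribution, while a
# failing disk drifts before the failure (assumed shift, illustration only).
early = rng.normal(0.0, 1.0, size=(30, n_attrs))
healthy_recent = rng.normal(0.0, 1.0, size=(30, n_attrs))
failing_recent = rng.normal(3.0, 2.0, size=(30, n_attrs))

healthy_score = change_score(early, healthy_recent, weights)
failing_score = change_score(early, failing_recent, weights)
print(f"healthy: {healthy_score:.3f}  failing: {failing_score:.3f}")
```

In a trained model the encoder weights would presumably be learned so that window pairs from failing disks are pushed apart and pairs from healthy disks are pulled together; here the score only reflects the raw shift in the summary statistics.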
Published in: IEEE Transactions on Services Computing (Volume: 17, Issue: 5, Sept.-Oct. 2024)
Page(s): 2890 - 2903
Date of Publication: 29 April 2024
