Abstract
Detecting faults in hard disk drives (HDDs) can bring significant benefits to HDD manufacturers, users, and storage system providers. As a consequence, several works have focused on developing fault detection algorithms for HDDs. Recently, promising results were achieved by methods that combine SMART (Self-Monitoring, Analysis and Reporting Technology) features with anomaly detection algorithms. In this work, we propose a method for fault detection in HDDs that uses a Gaussian mixture model (GMM) to model the behavior of healthy HDDs. After computing the dissimilarity between a given HDD and this statistical model, an anomaly is detected when a statistical estimator computed over these dissimilarities exceeds a threshold. In addition to the proposed method, we also conducted an extensive evaluation of different statistical estimators. The proposed method, named Fault Detection of HDDs based on GMM and statistical estimators (FDGE), was compared to state-of-the-art fault detection methods and achieved promising results.
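The pipeline described above (fit a Gaussian mixture on healthy drives, score a drive by its dissimilarity to that model, and compare a statistical estimator of the scores against a threshold) can be illustrated with a minimal sketch. This is not the FDGE implementation. It assumes a single synthetic feature instead of real SMART attributes, the negative log-likelihood as the dissimilarity, the median as the estimator, and the largest dissimilarity seen on healthy data as the threshold. All function names and numeric values are hypothetical.

```python
import math
import random
from statistics import median


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data, k=2, iters=100):
    """Fit a 1-D Gaussian mixture with plain EM (illustrative, not robust)."""
    n = len(data)
    ordered = sorted(data)
    # Initialise means at evenly spaced quantiles, with a shared variance
    # and equal mixing weights.
    means = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    overall_mean = sum(data) / n
    overall_var = max(sum((x - overall_mean) ** 2 for x in data) / n, 1e-6)
    variances = [overall_var] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            p = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300  # guard against division by zero
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, 1e-6)  # floor avoids a collapsed component
    return weights, means, variances


def dissimilarity(x, model):
    """Assumed dissimilarity: negative log-likelihood under the mixture."""
    weights, means, variances = model
    density = sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
    # The floor avoids log(0) for readings far from every component.
    return -math.log(max(density, 1e-300))


def is_faulty(window, model, threshold):
    """Flag a drive when the median dissimilarity of its window exceeds the threshold."""
    scores = [dissimilarity(x, model) for x in window]
    return median(scores) > threshold


# Synthetic "healthy" readings drawn from two operating regimes (hypothetical values).
random.seed(0)
healthy = [random.gauss(30.0, 1.0) for _ in range(100)]
healthy += [random.gauss(45.0, 1.0) for _ in range(100)]

model = fit_gmm_1d(healthy, k=2)
# Assumed threshold: the largest dissimilarity observed on healthy data.
threshold = max(dissimilarity(x, model) for x in healthy)

healthy_window = healthy[:20]
drifting_window = [80.0 + 0.5 * i for i in range(20)]
print(is_faulty(healthy_window, model, threshold))
print(is_faulty(drifting_window, model, threshold))
```

The median is only one possible estimator. Replacing it inside `is_faulty` with another statistic, such as the mean or a trimmed mean, is enough to compare how different estimators behave on the same dissimilarity scores.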










Acknowledgements
This research was partially supported by Funcap, Capes/Brazil, LSBD and LENOVO (under the Lei de informatica funding program).
About this article
Cite this article
Queiroz, L.P., Gomes, J.P.P., Rodrigues, F.C.M. et al. Fault Detection in Hard Disk Drives Based on a Semi Parametric Model and Statistical Estimators. New Gener. Comput. 36, 5–19 (2018). https://doi.org/10.1007/s00354-017-0016-0
DOI: https://doi.org/10.1007/s00354-017-0016-0