Fault Detection in Hard Disk Drives Based on a Semi Parametric Model and Statistical Estimators

  • Special Feature
  • New Generation Computing

Abstract

Detecting faults in hard disk drives (HDDs) can bring significant benefits to HDD manufacturers, users, and storage system providers. As a consequence, several works have focused on developing fault detection algorithms for HDDs. Recently, promising results were achieved by methods that combine SMART (Self-Monitoring, Analysis and Reporting Technology) features with anomaly detection algorithms. In this work, we propose a method for fault detection in HDDs that uses a Gaussian mixture model (GMM) to describe the behavior of healthy HDDs. After measuring the dissimilarity between a given HDD and this statistical model, an anomaly is detected when a statistical estimator computed over these dissimilarity values exceeds a threshold. In addition to proposing the method, we conducted an extensive evaluation of different statistical estimators. The proposed method, named Fault Detection of HDDs based on GMM and statistical estimators (FDGE), was compared to state-of-the-art fault detection methods and achieved promising results.
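The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which dissimilarity measure, estimator, or threshold rule FDGE uses, so the negative log-likelihood, the median, and the 99th-percentile threshold below are all assumptions, as are the function names and the synthetic two-feature data standing in for SMART attributes. The sketch relies on scikit-learn's `GaussianMixture`.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def fit_healthy_model(healthy_samples, n_components=2, seed=0):
    """Fit a GMM to feature vectors collected from healthy drives only."""
    model = GaussianMixture(n_components=n_components, random_state=seed)
    return model.fit(healthy_samples)


def dissimilarity(model, samples):
    """Assumed dissimilarity: negative log-likelihood of each sample under the GMM."""
    return -model.score_samples(samples)


def drive_score(model, samples, estimator=np.median):
    """Summarize one drive's dissimilarities with a statistical estimator.

    The median is an illustrative choice; any callable mapping an array
    to a scalar (np.mean, np.max, ...) can be passed instead.
    """
    return float(estimator(dissimilarity(model, samples)))


def is_faulty(model, samples, threshold, estimator=np.median):
    """Flag a drive when its summarized dissimilarity exceeds the threshold."""
    return drive_score(model, samples, estimator) > threshold


# Synthetic stand-in for SMART features: healthy drives form two clusters.
rng = np.random.default_rng(0)
healthy = np.vstack([
    rng.normal(0.0, 1.0, size=(300, 2)),
    rng.normal(5.0, 1.0, size=(300, 2)),
])
model = fit_healthy_model(healthy)

# Assumed threshold rule: 99th percentile of healthy per-sample dissimilarities.
threshold = float(np.percentile(dissimilarity(model, healthy), 99))

# Two hypothetical drives, each observed 20 times.
healthy_drive = rng.normal(0.0, 1.0, size=(20, 2))
degraded_drive = rng.normal(12.0, 1.0, size=(20, 2))

print("healthy drive flagged: ", is_faulty(model, healthy_drive, threshold))
print("degraded drive flagged:", is_faulty(model, degraded_drive, threshold))
```

Passing the estimator as a parameter mirrors the abstract's point that different statistical estimators can be swapped in and compared; in practice the threshold would be tuned on held-out data rather than fixed at a percentile.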

Acknowledgements

This research was partially supported by Funcap, Capes/Brazil, LSBD and LENOVO (under the Lei de informatica funding program).

Author information

Corresponding author

Correspondence to Joao Paulo P. Gomes.

About this article

Cite this article

Queiroz, L.P., Gomes, J.P.P., Rodrigues, F.C.M. et al. Fault Detection in Hard Disk Drives Based on a Semi Parametric Model and Statistical Estimators. New Gener. Comput. 36, 5–19 (2018). https://doi.org/10.1007/s00354-017-0016-0

  • DOI: https://doi.org/10.1007/s00354-017-0016-0
