DOI: 10.1145/3373419.3373437
research-article

Hard Disk Drive Failure Prediction Challenges in Machine Learning for Multi-variate Time Series

Published: 24 January 2020

Abstract

Hard disk drive failure prediction (HDDFP) is an active area of machine learning applications. While recent work based on SMART attributes reports very promising results, with high failure recall (95%) and precision, challenges remain that call for improvements in the machine learning pipeline. This paper begins with an introduction to the topic and a summary of recent work. Some challenges that apply to existing solutions are then illustrated with an example using the Backblaze dataset and its HDDFP rule. A main result of the paper is a rigorous formulation of the HDDFP problem as a multiple-input multiple-output (MIMO) dynamic system problem to tackle these challenges. It is also shown that the general formulation can help the existing classification method by enhancing the prediction lead time requirement. Although presented in the context of the HDDFP problem, the findings and thought process are applicable to failure prediction for other dynamic systems and, to some degree, to IoT and time-series-based analytics in general.
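To make the terms "failure recall", "precision", and "prediction lead time" concrete, the following is a minimal sketch of how a classification-style setup might label daily samples of one drive and score predictions. It is not the paper's formulation: the function names, the convention that a sample is positive when the failure falls within `lead_time` days after it, and the toy predictions are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def label_with_lead_time(n_days: int, failure_day: Optional[int], lead_time: int) -> List[int]:
    """Label each daily sample of one drive (hypothetical helper).

    A sample on day d is labeled 1 if the drive fails on day d or within
    the next `lead_time` days, else 0. `failure_day` is a 0-based day
    index, or None for a drive that never failed.
    """
    if failure_day is None:
        return [0] * n_days
    return [1 if 0 <= failure_day - day <= lead_time else 0 for day in range(n_days)]


def precision_recall(y_true: List[int], y_pred: List[int]) -> Tuple[float, float]:
    """Compute precision and recall for binary labels (1 = failure)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy example: one drive observed for 6 days that fails on day 5,
# with a 2-day lead time requirement.
labels = label_with_lead_time(n_days=6, failure_day=5, lead_time=2)
predictions = [0, 1, 0, 1, 1, 0]  # made-up classifier output
precision, recall = precision_recall(labels, predictions)
print(labels, precision, recall)
```

Under this labeling convention, a longer `lead_time` marks more pre-failure samples as positive, so the same classifier can score differently depending on how early a warning is required.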


Published In

ICAIP '19: Proceedings of the 2019 3rd International Conference on Advances in Image Processing
November 2019
232 pages
ISBN: 9781450376754
DOI: 10.1145/3373419

In-Cooperation

  • Southwest Jiaotong University

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Failure prediction
  2. IoT
  3. SMART
  4. big data
  5. dynamic system
  6. hard disk
  7. machine learning
  8. multi-variate
  9. time series

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICAIP 2019
