Abstract
With system size and complexity growing rapidly, traditional passive fault tolerance can no longer guarantee system reliability because of its high overhead and poor scalability. Active fault tolerance is believed to be the most important fault-tolerance approach for exascale systems. Targeting system failure prediction, this paper proposes a system log pre-processing method based on sparse representation classification (SRCP). Adopting the idea of vectorization, SRCP removes the details of each log record and generates a corresponding vector. It uses the TF-IDF (term frequency-inverse document frequency) method to weight each keyword, which reveals more precise information about the correlation between log records. To improve the accuracy and flexibility of the pre-processing method, the log vectors are then processed by sparse representation classification. For generality, SRCP does not rely on any expert system or domain knowledge. Experimental results show that SRCP achieves both outstanding precision and F-measure while also providing a satisfactory compression ratio.
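The pipeline outlined in the abstract (strip record details, weight keywords with TF-IDF, classify by sparse representation) can be illustrated with a minimal sketch. This is not the authors' implementation: the tokenization rule, the plain log(N/df) form of IDF, the use of greedy matching pursuit as the sparse solver, the example log records, and the class labels "hardware" and "software" are all assumptions made here for illustration. The paper's reference list points to l1-minimization and orthogonal matching pursuit, which would normally replace the simple solver used below.

```python
import math
import re
from collections import Counter


def tokenize(record):
    """Split a record into keywords, dropping tokens that are not purely alphabetic."""
    return [t for t in re.split(r"[^a-z0-9]+", record.lower()) if t.isalpha()]


def fit_idf(records):
    """Build the vocabulary and IDF weights, idf(t) = log(N / df(t))."""
    docs = [set(tokenize(r)) for r in records]
    vocab = sorted(set().union(*docs))
    df = Counter(t for d in docs for t in d)
    idf = {t: math.log(len(docs) / df[t]) for t in vocab}
    return vocab, idf


def vectorize(record, vocab, idf):
    """Map a record to an L2-normalized TF-IDF vector over the fitted vocabulary."""
    tf = Counter(tokenize(record))
    v = [tf[t] * idf[t] for t in vocab]
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matching_pursuit(y, atoms, n_iter=5):
    """Greedy sparse coding of y over unit-norm atoms (assumed stand-in solver)."""
    coef = [0.0] * len(atoms)
    residual = list(y)
    for _ in range(n_iter):
        scores = [dot(residual, a) for a in atoms]
        k = max(range(len(atoms)), key=lambda i: abs(scores[i]))
        if abs(scores[k]) < 1e-12:
            break  # no atom explains the remaining residual
        coef[k] += scores[k]
        residual = [r - scores[k] * a for r, a in zip(residual, atoms[k])]
    return coef


def classify(y, atoms, labels, n_iter=5):
    """Pick the class whose atoms alone reconstruct y with the smallest residual."""
    coef = matching_pursuit(y, atoms, n_iter)
    best_label, best_err = None, float("inf")
    for label in sorted(set(labels)):
        recon = [0.0] * len(y)
        for c, a, lab in zip(coef, atoms, labels):
            if lab == label and c != 0.0:
                recon = [r + c * x for r, x in zip(recon, a)]
        err = math.sqrt(sum((p - q) ** 2 for p, q in zip(y, recon)))
        if err < best_err:
            best_label, best_err = label, err
    return best_label


# Hypothetical labeled records; not taken from the paper's data set.
TRAIN = [
    ("machine check interrupt on node 12", "hardware"),
    ("ddr memory error corrected at 0x1f00", "hardware"),
    ("job 4411 terminated by user request", "software"),
    ("job 7 killed by scheduler timeout", "software"),
]
vocab, idf = fit_idf([r for r, _ in TRAIN])
atoms = [vectorize(r, vocab, idf) for r, _ in TRAIN]
labels = [c for _, c in TRAIN]

query = vectorize("uncorrectable ddr memory error at 0x2b40", vocab, idf)
print(classify(query, atoms, labels))
```

Dropping tokens that contain digits is one simple way to remove record-specific details such as node numbers and addresses, so that records describing the same kind of event map to similar vectors. Keywords that appear in every training record receive zero weight under this IDF form, which matches the intent of emphasizing discriminative keywords.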
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhu, L., Gu, J., Zhao, T., Wang, Y. (2013). Research on Log Pre-processing for Exascale System Using Sparse Representation. In: Park, J.J.(.H., Arabnia, H.R., Kim, C., Shi, W., Gil, JM. (eds) Grid and Pervasive Computing. GPC 2013. Lecture Notes in Computer Science, vol 7861. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38027-3_36
DOI: https://doi.org/10.1007/978-3-642-38027-3_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38026-6
Online ISBN: 978-3-642-38027-3
eBook Packages: Computer Science, Computer Science (R0)