Abstract
Monitoring events in communication and computing systems is becoming increasingly challenging as these systems grow in complexity and diversity. Several tools have been created to help system administrators monitor the enormous number of events generated daily. The main function of these tools is to filter out as many events as possible and present highly suspicious events to the administrators for fault analysis, detection, and reporting. As these suspicious events appear regularly on large and complex systems, such as cloud computing systems, analyzing them consumes much time and effort. In this study, we propose an approach for evaluating the severity level of events using a classification decision tree. The approach exploits existing fault datasets and features, such as bug reports and log events, to construct a decision tree that can be used to classify the severity level of other events. Administrators can refer to the classification result to determine proper actions for suspicious events with a high severity level. We have implemented the approach and evaluated it on various bug report and log event datasets. The experimental results reveal that the accuracy of classifying severity levels using the decision trees is above 80%; some detailed analyses are also provided.
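To make the idea concrete, the following is a minimal sketch of a classification decision tree that assigns severity levels to events. It is not the authors' implementation: the three features (whether a report mentions a crash, the number of affected users, whether a workaround exists), the toy training rows, the severity labels, and the Gini-based splitting rule are all assumptions chosen for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    """Build a tree recursively; a leaf is the majority severity label."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
            build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth))

def classify(tree, row):
    """Walk from the root to a leaf and return its severity label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

# Hypothetical features: [mentions_crash, affected_users, has_workaround]
TRAIN_ROWS = [
    [1, 120, 0], [1, 80, 0], [1, 15, 1], [0, 40, 0],
    [0, 30, 1], [0, 5, 1], [0, 2, 0], [1, 60, 1],
]
# Hypothetical severity labels, one per training row
TRAIN_LABELS = [
    "critical", "critical", "major", "major",
    "minor", "minor", "minor", "major",
]

tree = build_tree(TRAIN_ROWS, TRAIN_LABELS)
new_event = [1, 200, 0]  # an unseen event: crash, many users, no workaround
print(classify(tree, new_event))  # prints "critical"
```

On real bug reports or log events, textual fields would first have to be turned into numeric or categorical features, and accuracy would be measured on held-out data rather than on the training rows.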
Acknowledgements
This research activity is funded by Vietnam National University in Ho Chi Minh City (VNU-HCM) under the grant number B2017-28-01 (the type-B project "Augmenting fault detection services on large and complex network systems using context-aware data analysis").
Copyright information
© 2017 Springer-Verlag GmbH Germany
About this paper
Cite this paper
Tran, H.M., Van Nguyen, S., Le, S.T., Vu, Q.T. (2017). Applying Data Analytic Techniques for Fault Detection. In: Hameurlain, A., Küng, J., Wagner, R., Dang, T., Thoai, N. (eds) Transactions on Large-Scale Data- and Knowledge-Centered Systems XXXI. Lecture Notes in Computer Science, vol 10140. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-54173-9_2
DOI: https://doi.org/10.1007/978-3-662-54173-9_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-54172-2
Online ISBN: 978-3-662-54173-9
eBook Packages: Computer Science, Computer Science (R0)