Abstract
To diagnose the “health” of a computer network (CN), we propose applying data mining methods to historical data: the past behavior of the network is studied and formalized as sequential patterns. Service level objectives (SLO) and service level agreements (SLA) serve as the system indicators that characterize the health of the network. An algorithm for forecasting anomalous states of the CN from sequential patterns, the pattern predictor (PP), has been developed and implemented, together with its structure. The PP algorithm is shown to outperform the compact prediction tree (CPT) algorithm by a factor of more than 1000 in the computational complexity of training and forecasting, and the CPT+ algorithm by a factor of more than 100. On average, the PP algorithm made 40% and 43% fewer type II errors than the CPT+ and CPT algorithms, respectively.
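The type II error comparison can be illustrated with a minimal sketch. It assumes the usual reading in which a type II error is a missed anomaly: the network was actually in an anomalous state, but the predictor forecast a normal one. The function names and the sample labels below are hypothetical and are not taken from the paper; they only show how such a relative reduction is computed.

```python
from typing import Sequence


def type_ii_errors(actual: Sequence[bool], predicted: Sequence[bool]) -> int:
    """Count missed anomalies: actual is anomalous (True), forecast is normal (False)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(1 for a, p in zip(actual, predicted) if a and not p)


def relative_reduction(baseline_errors: int, candidate_errors: int) -> float:
    """Fraction by which the candidate reduces the baseline's error count."""
    if baseline_errors <= 0:
        raise ValueError("baseline must have at least one error")
    return (baseline_errors - candidate_errors) / baseline_errors


# Hypothetical labels (True = anomalous state), invented for illustration only.
actual = [True, True, False, True, True, False, True]
baseline_forecast = [False, True, False, False, True, False, False]
candidate_forecast = [True, True, False, False, True, True, True]

baseline_misses = type_ii_errors(actual, baseline_forecast)
candidate_misses = type_ii_errors(actual, candidate_forecast)
reduction = relative_reduction(baseline_misses, candidate_misses)
print(f"baseline misses: {baseline_misses}, candidate misses: {candidate_misses}")
print(f"relative reduction in type II errors: {reduction:.0%}")
```

A false alarm (forecasting an anomaly that does not occur) is a type I error and is deliberately not counted here, so a predictor can lower its type II count while raising its false alarm count.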
Ethics declarations
The authors declare that they have no conflicts of interest.
About this article
Cite this article
Sheluhin, O.I., Kostin, D.V. & Polkovnikov, M.V. Forecasting of Computer Network Anomalous States Based on Sequential Pattern Analysis of “Historical Data”. Aut. Control Comp. Sci. 55, 522–533 (2021). https://doi.org/10.3103/S0146411621060067
DOI: https://doi.org/10.3103/S0146411621060067