
Forecasting of Computer Network Anomalous States Based on Sequential Pattern Analysis of “Historical Data”

Automatic Control and Computer Sciences

Abstract

To diagnose the “health” of a computer network (CN), it is proposed to apply intelligent analysis (data mining) to historical data: the past behavior of the network is studied and formalized as sequential patterns. Service level objectives (SLOs) and service level agreements (SLAs) serve as the system indicators that characterize the health of the network. An algorithm that forecasts the current anomalous state of the CN from sequential patterns, the pattern predictor (PP), has been developed and implemented, and the structure of the implementation is presented. In terms of the computational complexity of learning and forecasting, the proposed PP algorithm outperforms the compact prediction tree (CPT) algorithm by a factor of more than 1000 and the CPT+ algorithm by a factor of more than 100. On average, the PP algorithm made 40% fewer type II errors than CPT+ and 43% fewer than CPT.
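The abstract does not give the details of the PP algorithm, so the following is only a generic illustration of forecasting from sequential patterns, not the authors' method. It is a minimal sketch that counts which state followed each short context in historical sequences and predicts from the longest matching context. The state names, the `max_len` parameter, and the function names `learn_patterns` and `forecast` are all hypothetical.

```python
from collections import Counter, defaultdict


def learn_patterns(history, max_len=3):
    """Count which state followed each context of up to max_len states."""
    followers = defaultdict(Counter)
    for seq in history:
        for i in range(1, len(seq)):
            for k in range(1, max_len + 1):
                if i - k < 0:
                    break
                context = tuple(seq[i - k:i])
                followers[context][seq[i]] += 1
    return followers


def forecast(followers, recent, max_len=3):
    """Predict the next state from the longest matching recent context."""
    for k in range(min(max_len, len(recent)), 0, -1):
        context = tuple(recent[-k:])
        if context in followers:
            # Most frequent follower of this context in the historical data.
            return followers[context].most_common(1)[0][0]
    return None  # No historical pattern matches the recent states.


# Hypothetical historical sequences of network states (illustrative only).
history = [
    ["ok", "high_latency", "packet_loss", "sla_violation"],
    ["ok", "ok", "high_latency", "packet_loss", "sla_violation"],
    ["ok", "high_latency", "ok"],
]
model = learn_patterns(history)
prediction = forecast(model, ["ok", "high_latency", "packet_loss"])
print(prediction)
```

In this toy data the context `ok, high_latency, packet_loss` was always followed by an SLA violation, so that state is forecast. A context never seen in the history yields `None` rather than a guess.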




Correspondence to O. I. Sheluhin, D. V. Kostin or M. V. Polkovnikov.

Ethics declarations

The authors declare that they have no conflicts of interest.


Cite this article

Sheluhin, O.I., Kostin, D.V. & Polkovnikov, M.V. Forecasting of Computer Network Anomalous States Based on Sequential Pattern Analysis of “Historical Data”. Aut. Control Comp. Sci. 55, 522–533 (2021). https://doi.org/10.3103/S0146411621060067


  • DOI: https://doi.org/10.3103/S0146411621060067
