Large-scale IP network behavior anomaly detection and identification using substructure-based approach and multivariate time series mining

Telecommunication Systems

Abstract

This paper proposes a substructure-based network behavior anomaly detection approach, called WFS (Weighted Frequent Subgraphs), to detect anomalies in large-scale IP networks. WFS examines an entire graph and reports its unusual substructures. Because the graph carries additional information, anomalies can be detected more accurately. Multivariate time series motif association rules mining (MTSMARM) is then used to obtain the patterns of abnormal traffic behavior. To verify these proposals, experiments were conducted on NetFlow data from a backbone network (Internet2), and they show some positive results.
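The abstract does not give the details of WFS, so the following is only a minimal sketch of the general idea of reporting unusual substructures. It is not the authors' algorithm. It assumes that each flow record is reduced to a single-edge pattern (source, destination, port) and that a pattern is "unusual" when it occurs fewer than `min_support` times; the function name, the record layout, and the threshold are all hypothetical.

```python
from collections import Counter


def rare_substructures(flows, min_support=2):
    """Return edge patterns seen fewer than min_support times.

    Each flow is a hypothetical (src, dst, port) tuple, treated here as a
    one-edge substructure of the traffic graph. This is a simplification:
    it does not weight patterns or grow larger subgraphs.
    """
    counts = Counter(flows)  # frequency of each one-edge pattern
    return sorted(p for p, c in counts.items() if c < min_support)


# Toy flow records (made up for illustration, not from the paper's data).
flows = [
    ("10.0.0.1", "10.0.0.2", 80),
    ("10.0.0.1", "10.0.0.2", 80),
    ("10.0.0.3", "10.0.0.2", 80),
    ("10.0.0.3", "10.0.0.2", 80),
    ("10.0.0.9", "10.0.0.2", 6667),
]

print(rare_substructures(flows))
```

In this toy input, only the single flow to port 6667 falls below the support threshold, so it is the one pattern reported. A real system would presumably work on much larger subgraphs and weighted frequencies, which this sketch does not attempt.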

Author information

Correspondence to Weisong He or Guangmin Hu.

About this article

Cite this article

He, W., Hu, G. & Zhou, Y. Large-scale IP network behavior anomaly detection and identification using substructure-based approach and multivariate time series mining. Telecommun Syst 50, 1–13 (2012). https://doi.org/10.1007/s11235-010-9384-1

  • DOI: https://doi.org/10.1007/s11235-010-9384-1
