Abstract
This paper proposes WFS (Weighted Frequent Subgraphs), a substructure-based approach to network behavior anomaly detection for large-scale IP networks. WFS examines an entire graph and reports its unusual substructures. Because the graph carries additional information, anomalies can be detected more accurately. Multivariate time series motif association rules mining (MTSMARM) is used to obtain the patterns of abnormal traffic behavior. Experiments on NetFlow data from the Internet2 backbone network yield some positive results in support of both proposals.
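As a rough illustration of the substructure idea only (the abstract does not say how WFS weights or mines subgraphs), the sketch below treats each directed edge between hosts as the smallest possible substructure. It counts how many time windows contain each edge and reports the edges of a window whose support falls below a threshold. The function names, the `min_support` parameter, and the toy flow tuples are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def edge_support(windows):
    """Count, for each directed edge, the number of windows that contain it."""
    support = Counter()
    for flows in windows:
        # A window contributes at most once per distinct edge.
        support.update(set(flows))
    return support


def unusual_edges(window, support, n_windows, min_support=0.5):
    """Return the edges of `window` seen in fewer than `min_support` of all windows."""
    return sorted(e for e in set(window) if support[e] / n_windows < min_support)


# Toy data (hypothetical): each window is a list of (source, destination) flows.
windows = [
    [("a", "b"), ("b", "c")],
    [("a", "b"), ("b", "c")],
    [("a", "b"), ("b", "c"), ("d", "a")],
    [("a", "b"), ("b", "c")],
]

support = edge_support(windows)
flagged = unusual_edges(windows[2], support, len(windows))
print(flagged)
```

A single edge is a deliberately minimal stand-in. A frequent-subgraph method would consider larger connected patterns, and the abstract suggests WFS also attaches weights, neither of which is modelled here.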
Cite this article
He, W., Hu, G. & Zhou, Y. Large-scale IP network behavior anomaly detection and identification using substructure-based approach and multivariate time series mining. Telecommun Syst 50, 1–13 (2012). https://doi.org/10.1007/s11235-010-9384-1
DOI: https://doi.org/10.1007/s11235-010-9384-1