Elsevier

Applied Soft Computing

Volume 36, November 2015, Pages 408-418

An uncertainty-managing batch relevance-based approach to network anomaly detection

https://doi.org/10.1016/j.asoc.2015.07.029

Highlights

  • Adaptive network anomaly detection strategy based on a batch relevance-based fuzzified learning algorithm.

  • Couples the capability of inferring decisional structures from incomplete observations with the flexibility of a fuzzy-based uncertainty management strategy.

  • Infers the laws and rules governing normal or abnormal network traffic, in order to model its operating dynamics.

  • Being based on a rule-based detection strategy, it is more effective against previously unknown phenomena and robust against obfuscation mechanisms.

Abstract

The main aim of network anomaly detection is to spot hostile events effectively within the traffic patterns associated with network operations, by distinguishing them from normal activities. This can only be accomplished by acquiring a priori knowledge about every kind of hostile behavior that can potentially affect the network (which is practically impossible) or, more easily, by building a model that is general enough to describe normal network behavior and detect deviations from it. Earlier detection frameworks were only able to recognize already known phenomena within traffic data, using pre-trained models that match specific events against pre-classified chains of traffic patterns. Alternatively, more recent statistics-based approaches detect outliers with respect to a statistical idealization of normal network behavior. Clearly, while the former approach cannot detect previously unknown phenomena (zero-day attacks), the latter has limited effectiveness, since it is unaware of anomalous behaviors that do not generate significant changes in traffic volumes. Machine learning allows the development of adaptive, non-parametric detection strategies that are based on “understanding” the network dynamics: through a proper training phase they acquire more precise knowledge about normal or anomalous phenomena, in order to classify and handle more effectively any kind of behavior that can be observed on the network. Accordingly, we present a new anomaly detection strategy based on supervised machine learning, and more precisely on a batch relevance-based fuzzified learning algorithm known as U-BRAIN, which aims at understanding, through inductive inference, the specific laws and rules governing normal or abnormal network traffic, in order to reliably model its operating dynamics. The inferred rules can be applied in real time to online network traffic.
This proposal appears to be promising both in terms of identification accuracy and robustness/flexibility when coping with uncertainty in the detection/classification process, as verified through extensive evaluation experiments.

Introduction

Together with the astonishing deployment of network technologies and the consequent increase in traffic volumes, the importance of network misuse detection and prevention frameworks is growing proportionally in almost all modern organizations, in order to protect the most strategic resources from both external and internal threats. In this scenario, the task of identifying and categorizing network anomalies essentially consists in determining all the circumstances in which the network traffic pattern deviates from its normal behavior, which in turn depends on multiple elements and considerations associated with the activities taking place every day on the network.

However, the main obstacle to truly effective detection is the continuous evolution of anomalous phenomena, due to the emergence of new, previously unknown attacks, so that achieving a precise, stable and exhaustive definition of anomalous behavior, encompassing all the possible hostile events that can occur on a real network, is practically impossible. Nevertheless, detection systems must not be limited by a priori knowledge of a specific set of anomalous traffic templates, or be conditioned by a large number of complex operating parameters (e.g., traffic statistical distributions and alarm thresholds); they have to be able to recognize and directly classify any previously unknown phenomenon that can be experienced on the network. As a consequence, the ultimate goal of modern anomaly detection systems is to behave adaptively, flagging in “real time” all the deviations from a model that is built dynamically and incrementally by capturing the concept of normality in network operations according to a learning-by-example strategy. These new systems, overcoming the known limitations of the more traditional ones based on pattern detection and statistical analysis, are empowered by flexible machine learning techniques.

Accordingly, we propose a novel anomaly detection strategy, particularly suitable for IP networks, based on supervised machine learning, and more specifically on a batch relevance-based fuzzified learning algorithm known as U-BRAIN.

This strategy aims at understanding the processes that originate the traffic data, by deriving the specific laws and rules governing it, in order to reliably model its underlying dynamics. This is accomplished by performing inductive inference (or, better, generalization) on traffic observations, based on empirical pre-classified “experiential” (training) data that represent incomplete information about the occurrence of specific phenomena describing normal or anomalous network activities. In addition, the adopted learning scheme allows a certain degree of uncertainty in the whole detection process, making the resulting framework more solid and flexible in managing the large variety and complexity of real traffic phenomena. The inferred rules can then be applied in real time to online network traffic.
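
As a rough illustration of how inferred rules could be applied to incoming traffic under uncertainty, the following Python sketch evaluates a small rule set with fuzzy degrees of truth. It is not the U-BRAIN algorithm, which learns the rules from training data; the rules here are hand-written. The rule form (a disjunction of conjunctions), the feature names, the 0.5 default for unobserved features and the 0.5 decision threshold are all assumptions made for the example. Conjunction and disjunction are computed with the standard min/max fuzzy operators.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: a literal is (feature_name, expected_truth),
# a rule is a conjunction of literals, a rule set is a disjunction of rules.
Literal = Tuple[str, bool]
Rule = List[Literal]


def literal_degree(record: Dict[str, float], literal: Literal) -> float:
    """Degree in [0, 1] to which a record satisfies one literal.

    Feature values are truth degrees in [0, 1]; an unobserved feature
    is treated as 0.5 (maximal uncertainty), an assumption of this sketch.
    """
    name, expected = literal
    value = record.get(name, 0.5)
    return value if expected else 1.0 - value


def rule_degree(record: Dict[str, float], rule: Rule) -> float:
    """Fuzzy AND (minimum) over the literals of one rule."""
    return min((literal_degree(record, lit) for lit in rule), default=1.0)


def anomaly_degree(record: Dict[str, float], rules: List[Rule]) -> float:
    """Fuzzy OR (maximum) over all rules."""
    return max((rule_degree(record, rule) for rule in rules), default=0.0)


def is_anomalous(record: Dict[str, float], rules: List[Rule],
                 threshold: float = 0.5) -> bool:
    """Flag a record when its anomaly degree exceeds the threshold."""
    return anomaly_degree(record, rules) > threshold


# Hand-written example rules with invented feature names.
RULES: List[Rule] = [
    [("syn_only", True), ("many_dst_ports", True)],
    [("payload_empty", True), ("known_service", False)],
]

# A fully observed record and a record with most features unobserved.
scan_like = {"syn_only": 1.0, "many_dst_ports": 0.8,
             "payload_empty": 0.0, "known_service": 1.0}
mostly_unknown = {"syn_only": 1.0}

print(anomaly_degree(scan_like, RULES), is_anomalous(scan_like, RULES))
print(anomaly_degree(mostly_unknown, RULES), is_anomalous(mostly_unknown, RULES))
```

In this sketch the first record is flagged because one rule is satisfied to a high degree, while the second stays at the uncertainty level of 0.5 and is not flagged. How unobserved values should actually be weighted is a design choice that the text above does not specify.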

We evaluated the effectiveness of the presented detection framework within a widely known test case scenario, in order to make the achieved results comparable with those of other proposals available in the literature. These results demonstrated a quite satisfactory identification accuracy, placing our strategy among the most promising state-of-the-art proposals.

Section snippets

Background and related work

Network anomaly detection has gained great attention in security research, with about 40 years of experience available in the literature. The first approach to automatic detection was proposed in [1], followed by a large number of contributions exploring many other solutions and proposals [2], [3], [4].

The earliest and most traditional detection approaches, mainly aiming at spotting intrusion activities, work by matching specific traffic patterns, gathered from the packets under observation,

A fuzzy rule-based detection strategy

The basic idea is to build a formal model that expresses the relations between all the fundamental variables involved in the traffic dynamics, and hence “understands” the notions of normal and anomalous behavior from the available experience, by learning the characteristics of the corresponding traffic classes and expressing them as laws and rules that are general enough to determine whether any unseen instance belongs to one class or the other. Obviously, the overall detection quality strongly

Performance evaluation

In evaluating the performance of the proposed detection strategy, our main aim was to make our results comparable with those of alternative approaches already available in the literature. Unfortunately, this is not straightforward, for lack of a generally recognized benchmark for assessing and validating anomaly detection solutions. In fact, most of the publicly available data sets and taxonomies that can be used for benchmarking anomaly detection systems are generally known to be error-prone and of limited

Conclusions

Identifying anomalous events is one of the best ways to discover many existing malfunctions and to handle most of the security and performance problems that may occur in modern networks. Hence, the availability of reliable detection devices and strategies becomes a fundamental prerequisite for next-generation network-empowered infrastructures. We presented a new supervised machine learning approach to anomaly detection, whose goal is to understand the dynamics and behaviors characterizing

References (51)

  • S. Staniford et al. Practical automated detection of stealthy portscans. J. Comput. Secur. (2002)

  • T. Lane et al. An application of machine learning to anomaly detection

  • A.K. Ghosh et al. A study in using neural networks for anomaly and misuse detection

  • V.N. Dao et al. A performance comparison of different back propagation neural networks methods in computer network intrusion detection. Differ. Equ. Dyn. Syst. (2002)

  • S. Mukkamala et al. Intrusion detection using neural networks and support vector machines

  • W. Lee et al. Mining in a data-flow environment: experience in network intrusion detection

  • W. Lee et al. A data mining framework for building intrusion detection models

  • C. Warrender et al. Detecting intrusions using system calls: alternative data models

  • P.K. Chan et al. Learning rules and clusters for anomaly detection in network traffic. Managing Cyber Threats (2005)

  • N. Duffield et al. Rule-based anomaly detection on IP flows

  • K. Burbeck et al. ADWICE-anomaly detection with real-time incremental clustering

  • S.R. Gaddam et al. K-means+ id3: a novel method for supervised anomaly detection by cascading k-means clustering and id3 decision tree learning methods. IEEE Trans. Knowl. Data Eng. (2007)

  • L. Khan et al. A new intrusion detection system using support vector machines and hierarchical clustering. VLDB J. – Int. J. Very Large Data Bases (2007)

  • D. peng Chen et al. Internet anomaly detection with weighted fuzzy matching over frequent episode rules

  • J.E. Dickerson et al. Fuzzy network profiling for intrusion detection
