Elsevier

Computers & Security

Volume 73, March 2018, Pages 474-491
Process mining and hierarchical clustering to help intrusion alert visualization

https://doi.org/10.1016/j.cose.2017.11.021

Abstract

Intrusion Detection Systems (IDS) are extensively used as one of the lines of defense of a network to prevent and mitigate the risks caused by security breaches. IDS provide information about the intrusive activities on a network through alerts, which security analysts manually evaluate to execute an intrusion response plan. However, one of the downsides of IDS is the large number of alerts they raise, which makes the manual investigation of alerts a burdensome and error-prone task. In this work, we propose an approach to facilitate the investigation of huge numbers of intrusion alerts. The approach applies process mining techniques to alerts to extract information regarding the attackers' behavior and the multi-stage attack strategies they adopted. The strategies are presented to the network administrator in friendly high-level visual models. Large and visually complex models that are difficult to understand are clustered into smaller, simpler, and more intuitive models using hierarchical clustering techniques. To evaluate the proposed approach, a real dataset of alerts from a large public university in the United States was used. We find that security visualization models created with process mining and hierarchical clustering are able to condense a huge number of alerts and provide insightful information for network/IDS administrators. For instance, by analyzing the models generated during the case study, network administrators could find out important details about the attack strategies, such as attack frequencies and targeted network services.

Introduction

In recent years, with the rapid growth and improvement of computer networks, new services and applications increasingly reliant on networking have been developed (Ning and Xu, 2003; Ramaki et al., 2015). Along with this growth, the importance of cybersecurity has increased, and measures to mitigate the consequences of security events have become imperative (Lee et al., 2006). Intrusion Detection Systems (IDS) have been extensively used for this purpose.

IDS are devices that play an important role in the set of security policies in information systems. IDS monitor the network and system activities for any security violations. When an IDS detects a security violation, it raises an alert to a network administrator, who manually analyzes the alert to support a response plan. Unfortunately, IDS sensors overwhelm network administrators by raising large numbers of alerts on a daily basis, which makes the manual investigation of the alerts a burdensome and error-prone activity.

Typically, traditional IDS raise low-level alerts for each attack step, and they are not able to detect logical connections and causal relationships between the alerts (Ning et al., 2002; Sadoddin and Ghorbani, 2009). As a result, finding the related alerts in a multistage attack scenario, and the underlying attack strategies, in a huge number of alerts becomes a real challenge.

To deal with the unmanageable amount of alerts and improve their representation to facilitate intrusion analysis, alert preprocessing and alert correlation techniques have been proposed. The main purpose of alert preprocessing techniques is to reduce false positive alerts by identifying and removing their predominant root causes (Julisch and Dacier, 2002). On the other hand, correlation techniques aim to analyze similarity and causal relationships between low-level intrusion alerts to provide a high-level and informative description of the network state to network administrators (Zhang et al., 2009).

In this paper, we propose an alert correlation approach with emphasis on visual models to assist network administrators in the investigation of multistage attack strategies. The models provide an intuitive way to understand the strategies attackers have employed to compromise the network, helping network administrators to determine response actions and preventive measures against the intrusions.

This paper extends our previous work on discovering attack strategies using process mining (Alvarenga et al., 2015). Unlike our previous work, which relied on the Heuristic Mining algorithm to generate attack models (Weijters and Ribeiro, 2011), in this approach the attack models are generated using a novel algorithm. It represents the attack strategies in a weighted directed graph-based model, which combines visual features with quantitative measures that can help the network administrator identify the steps of multistage attacks, as well as the frequency of the attacks against the network. The main contributions of this paper are threefold:

  • First, we propose a new method to extract attack models from intrusion alerts using a process mining-based approach. The attack models generated by our method provide an understandable and intuitive way to interpret attack strategies that would hardly be obtained by performing the manual analysis of the alerts.

  • Second, due to the huge numbers of alerts generated by IDS, some attack models may be very large, making their visualization difficult. To deal with this issue, we propose a novel algorithm based on hierarchical clustering for automatically clustering large and complex models into smaller and simpler attack models.

  • Third, we propose a method to evaluate the attack models and determine whether they need to be clustered. The method includes a metric to evaluate the simplicity of the models, which guides our algorithm to determine the number of clusters required for each attack model.
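
To make the idea of a weighted directed graph-based attack model concrete, the following is a minimal sketch and not the novel algorithm proposed in the paper, whose details are not given in this excerpt. It assumes that each alert is a (timestamp, source IP, signature) tuple, that alerts from the same source IP form one attacker trace, and that an edge weight counts how often one signature directly follows another. The function name build_attack_model, the field layout, and the sample signatures are all hypothetical.

```python
from collections import Counter, defaultdict


def build_attack_model(alerts):
    """Return a Counter mapping (signature_a, signature_b) -> frequency.

    `alerts` is an iterable of (timestamp, source_ip, signature) tuples.
    Grouping by source IP and ordering by timestamp are assumptions made
    for this sketch; they are not taken from the paper.
    """
    # Build one time-ordered trace of signatures per attacker (source IP).
    traces = defaultdict(list)
    for timestamp, source_ip, signature in sorted(alerts):
        traces[source_ip].append(signature)

    # Count how often each signature is directly followed by another one.
    edges = Counter()
    for signatures in traces.values():
        for current, following in zip(signatures, signatures[1:]):
            edges[(current, following)] += 1
    return edges


# Hypothetical alerts, invented purely for illustration.
alerts = [
    (1, "10.0.0.5", "PORT_SCAN"),
    (2, "10.0.0.9", "PORT_SCAN"),
    (3, "10.0.0.5", "SSH_BRUTE_FORCE"),
    (4, "10.0.0.9", "SSH_BRUTE_FORCE"),
    (5, "10.0.0.5", "SUSPICIOUS_LOGIN"),
]

model = build_attack_model(alerts)
for (source, target), weight in sorted(model.items()):
    print(f"{source} -> {target} [{weight}]")
```

In this toy model, the edge weights play the role of the quantitative measures mentioned above: an edge traversed by many attackers stands out as a frequent step in a multistage attack.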

The rest of the paper is organized as follows. Section 2 reviews the related work. In Section 3, we present the main concepts related to IDS, process mining and cluster analysis that are used throughout the paper. Section 4 introduces the architecture of the proposed approach and explores its operations. Section 5 evaluates the proposed approach and presents the obtained results. Finally, Section 6 contains the concluding remarks and future work possibilities.

Related work

In this section, we review the related work on approaches that help the network administrator manage and analyze huge numbers of low-level intrusion alerts. Generally, the approaches may be classified into two categories, namely alert preprocessing and alert correlation.

Alert preprocessing techniques aim to reduce the influence of false positives, i.e., alerts that were raised for regular events that were incorrectly identified as malicious. Therefore, the techniques are often applied in

Background information

In this section, an overview of the main concepts related to process mining and hierarchical clustering that are used throughout the paper is presented.

Proposed approach

In this section, we introduce our approach, which aims to mine intrusion alerts to acquire knowledge regarding the attackers' activities. The approach makes use of process mining and hierarchical clustering techniques to extract information about the attackers' behavior and discover which strategies they are using in attempts to compromise the network. We present the discovered strategies to the network administrator in friendly high-level visual models.
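
As a rough illustration of how hierarchical clustering can split a large model into smaller ones, the sketch below groups attacker traces with average-linkage agglomerative clustering. It is an assumption-laden example and not the clustering algorithm of the paper: the Jaccard distance over signature sets, the function names, and the sample traces are hypothetical, and the number of clusters is passed in by hand, whereas the paper derives it from a simplicity metric that is not reproduced in this excerpt.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Distance between two traces, based on the sets of signatures they use."""
    a, b = set(a), set(b)
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def agglomerative_clusters(traces, num_clusters):
    """Average-linkage agglomerative clustering; returns lists of trace indices."""
    # Start with every trace in its own cluster.
    clusters = [[i] for i in range(len(traces))]

    def linkage(c1, c2):
        # Average pairwise distance between the members of two clusters.
        pairs = [(i, j) for i in c1 for j in c2]
        return sum(jaccard_distance(traces[i], traces[j]) for i, j in pairs) / len(pairs)

    # Repeatedly merge the two closest clusters until the target count is reached.
    while len(clusters) > max(num_clusters, 1):
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # safe because x < y
    return [sorted(c) for c in clusters]


# Hypothetical attacker traces, invented purely for illustration.
traces = [
    ["PORT_SCAN", "SSH_BRUTE_FORCE"],
    ["PORT_SCAN", "SSH_BRUTE_FORCE", "SUSPICIOUS_LOGIN"],
    ["SQL_INJECTION", "WEB_SHELL"],
    ["SQL_INJECTION", "WEB_SHELL", "DATA_EXFILTRATION"],
]

clusters = agglomerative_clusters(traces, 2)
print(clusters)
```

Each resulting group of traces could then be mined into its own, smaller attack model, so that the SSH-oriented and web-oriented strategies in this toy data are drawn as two separate graphs instead of one crowded graph.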

Experiments and results

This section presents a case study that aims to evaluate the approach proposed in this paper. The case study was organized around factors that may contribute to generating complex attack models, such as the number of alerts, distinct attackers, and distinct signatures. First, the dataset used in the case study is presented. Then, the proposed approach is executed considering these factors, and finally the results are discussed.

Conclusion and future work

This paper addresses the problem of visualizing huge numbers of IDS alerts by proposing a solution composed of four steps. The approach makes use of process mining techniques to extract information about the attackers' behavior and discover the attack strategies they are using in an attempt to compromise the network. The strategies are then presented to the network administrator in an attack model, a high-level visual representation of the low-level alerts, through which the analysis and

Acknowledgment

This work was partially supported by Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (1290203) (Brazil) through the grant of a scholarship to Sean Carlisto de Alvarenga. Also, the authors would like to thank Gerry Sneeringer and the Division of Information Technology at the University of Maryland for allowing and supporting the described research.

References (24)

  • B.S. Everitt et al., Hierarchical clustering (2011)
  • A. Hätälä et al., Event data exchange and intrusion alert correlation in heterogeneous networks

    Sean Carlisto de Alvarenga received his B.S. and M.Sc. degrees in Computer Science from State University of Londrina, Brazil, in 2013 and 2016, respectively. His research interests are mainly focused on intrusion detection systems, and more precisely on mining intrusion alerts for attack strategies discovery using process mining techniques.

    Sylvio Barbon Jr. received his B.S. degree in Computer Science from Centro Universitário do Norte Paulista (2005), the master degree in Computational Physics from University of Sao Paulo (2007), the B.S. degree in Computational Engineering from Centro Universitário de Votuporanga (2008) and the Ph.D. degree (2011) from IFSC/USP. He is a Sun Certified Programmer for Java. He is currently a professor at State University of Londrina (UEL), Brazil, in postgraduate and graduate programs. His research interests include Digital Signal Processing, Pattern Recognition, Machine Learning and Games.

    Rodrigo Sanches Miani received his B.S. degree in Mathematics from the Federal University of São Carlos, Brazil, the M.Sc. and Ph.D. degrees in Electrical Engineering from the University of Campinas, Brazil. In 2013, he joined the School of Computer Science of the Federal University of Uberlandia. His research interests include cyber security, security quantification models and evaluation of intrusion detection systems.

    Michel Cukier is an associate professor of Reliability Engineering with a joint appointment in the Department of Mechanical Engineering at the University of Maryland–College Park. He is also the Director of the Advanced Cybersecurity Experience for Students (ACES) and the Associate Director for Education of the Maryland Cybersecurity Center. His research interests include system dependability and security issues.

    Bruno Bogaz Zarpelão received his B.S. degree in Computer Science from State University of Londrina, Brazil, and the Ph.D. degree in Electrical Engineering from University of Campinas, Brazil. He is currently a professor at the Computer Science Department of the State University of Londrina (UEL), Brazil, which he joined in 2012. His research interests include security analytics, intrusion detection and Internet of Things.
