Anomaly-based intrusion detection: privacy concerns and other problems

doi:10.1016/S1389-1286(00)00134-1

Computer Networks

Volume 34, Issue 4, October 2000, Pages 623-640

https://doi.org/10.1016/S1389-1286(00)00134-1

Abstract

This paper addresses the specific advantages and disadvantages of anomaly-based intrusion detection. One important disadvantage is its impact on user privacy. A great deal of potentially sensitive information is recorded and analyzed in ways that threaten personal integrity. A solution for this may be to pseudonymize the sensitive information in the log files, i.e., exchange user names, etc., for pseudonyms. This paper shows how this can be done. We have carried out a number of experiments using an anomaly detection tool on pseudonymized data collected from a proxy firewall. The experiments revealed most of the known problems of anomaly detection and also some problems originating from the use of intrusion detection in combination with pseudonymization. This paper focuses on these problems and discusses how they can be remedied or circumvented. Also discussed is the extent to which these problems apply to tools based on misuse detection.

Introduction

A large number of intrusion detection tools have been designed and implemented during the past two decades. The first ones were anomaly detection tools, but today misuse detection tools dominate the market together with some hybrid versions. Some early systems are IDES [9], Haystack [16] and MIDAS [17]. More recent systems are, e.g., Emerald [14] and Bro [13].

Anomaly detection means establishing a “normal” behavior pattern for the users of the system and then looking for deviations from this behavior. An early anomaly detection model is described by Denning [3]. Research efforts on anomaly detection have resulted in systems such as NIDES [7] and W&S [18].
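The idea of a "normal" profile and deviations from it can be illustrated with a minimal sketch. This is not the statistical method of NIDES or of the tool described in this paper. It assumes a single numeric feature per user (for example, requests per hour), and the function names and the threshold of three standard deviations are hypothetical choices made for illustration.

```python
from statistics import mean, stdev

def build_profile(history):
    """Summarize past observations as (mean, sample standard deviation)."""
    return mean(history), stdev(history)

def is_anomalous(profile, observed, threshold=3.0):
    """Flag an observation more than `threshold` standard deviations from the mean."""
    mu, sigma = profile
    if sigma == 0:
        # A perfectly constant history: any change counts as a deviation.
        return observed != mu
    return abs(observed - mu) / sigma > threshold

# Hypothetical activity counts for one user during a training period.
history = [12, 15, 11, 14, 13, 12, 16]
profile = build_profile(history)

print(is_anomalous(profile, 14))  # close to past behavior
print(is_anomalous(profile, 90))  # far outside past behavior
```

Even in this toy form, the threshold controls the trade-off between missed deviations and false alarms, which is the central practical difficulty of the approach.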

What makes anomaly detection attractive is its ability to detect previously unknown intrusions. Its most apparent drawback is the high false alarm rate. The question is whether this is an unsolvable problem that will render anomaly detection useless.

Misuse detection means looking for known malicious or unwanted behavior. Examples of misuse detection systems are Haystack [16] and Bro [13].
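By contrast with profile-based detection, looking for known unwanted behavior amounts to matching events against a list of signatures. The sketch below assumes text log lines and two made-up signatures. It does not reproduce the rule language of Haystack, Bro, or any other real system.

```python
import re

# Hypothetical signatures: each name maps to a pattern for one known bad behavior.
SIGNATURES = {
    "path-traversal": re.compile(r"\.\./"),
    "passwd-access": re.compile(r"/etc/passwd"),
}

def match_signatures(line, signatures=SIGNATURES):
    """Return the names of all signatures that match the given log line."""
    return [name for name, pattern in signatures.items() if pattern.search(line)]

print(match_signatures("GET /../../etc/passwd"))
print(match_signatures("GET /index.html"))
```

A request that matches no signature is never reported, which is why such a tool cannot detect intrusions that nobody has described yet.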

The main features of misuse detection are its efficiency and comparatively low false alarm rate, qualities that are important to customers. The problem is that new intrusions appear frequently, so there is a race between upgrading the intrusion detection system (IDS) and attackers finding new ways of getting into the systems. This is a race that cannot be won by the system owners.

Many different intrusion detection prototypes have been studied and tested over the years. Most studies focus on the advantages of the method, and few address its limitations and practical problems.

Many different characteristics must be taken into consideration when studying IDSs. The detection rate, i.e., the percentage of "all possible" intrusions that will be detected, is of course one of them. The false alarm rate may be even more important, especially since it seems to be a limiting factor in what methods can be used in practice [1]. Other characteristics include timeliness, which data sources are used, whether the detection is distributed, privacy aspects, etc. Such characteristics have been surveyed and classified in a number of papers, e.g., [2], [4].
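Why the false alarm rate can be the limiting factor becomes clearer with a small calculation based on Bayes' rule. The sketch below is an illustration only: the function name and all the numbers are assumptions, not figures taken from [1] or from our experiments.

```python
def alarm_precision(base_rate, detection_rate, false_alarm_rate):
    """Probability that an alarm corresponds to a real intrusion (Bayes' rule).

    base_rate:        fraction of all events that are intrusive
    detection_rate:   P(alarm | intrusion)
    false_alarm_rate: P(alarm | no intrusion)
    """
    true_alarms = base_rate * detection_rate
    false_alarms = (1.0 - base_rate) * false_alarm_rate
    return true_alarms / (true_alarms + false_alarms)

# Assumed numbers: 1 event in 10,000 is intrusive, 90% of intrusions are
# detected, and 1% of benign events raise an alarm.
p = alarm_precision(base_rate=0.0001, detection_rate=0.9, false_alarm_rate=0.01)
print(f"Share of alarms that are real intrusions: {p:.2%}")
```

With these assumed numbers, fewer than one alarm in a hundred points to a real intrusion, even though the detection rate looks high. Because intrusions are rare, lowering the false alarm rate improves this share far more than raising the detection rate does.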

We designed and implemented an anomaly detection tool in a recent industrial project. Since we were interested in the features of anomaly detection, we wanted to make a practical evaluation of the usefulness of the method. The goal of our work was to describe the problems experienced and to suggest possible solutions to them. We realized at an early stage that user privacy would be one important problem. We therefore developed a pseudonymization tool that removes sensitive information from log files and uses pseudonyms for user names, host names, etc. Anomaly detection experiments were then performed on pseudonymized data collected from a proxy firewall [11].

In the following, Section 2 discusses background and related work within anomaly detection and privacy in computer systems. Section 3 discusses sensitive information and how to protect privacy with pseudonymizers. In Section 4, the implementation of our pseudonymizer is described. The implementation of the anomaly detection tool is described in Section 5, and the experiments in Section 6. Experienced problems and possible remedies are discussed in Section 7. Section 8 presents a short discussion of how these problems apply to misuse detection tools and Section 9 some concluding remarks.


Known problems with anomaly detection

One of the largest and probably most successful research efforts on anomaly detection is the NIDES project at SRI. Lunt describes the main concepts and ideas in [12]. NIDES consists of a statistical anomaly detection component, described in detail in [7], and an expert system for misuse detection. In [12], the problems of anomaly detection are discussed briefly. It is mentioned that there are obvious difficulties with attempting to detect intrusions solely on the basis of departures from

Sensitive information and pseudonymizers

This section discusses what information can be considered sensitive. It describes how a pseudonymizer can be used to enhance privacy and discusses the problems that this causes.

A pseudonymizer tool

In a recent industrial project, we developed a pseudonymizer tool. This was done to obtain log files in which the sensitive information was removed. The company intended to distribute the tool to their customers in order to obtain pseudonymized data to use for debugging purposes and for further intrusion detection experiments.
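The general principle of replacing user names with consistent pseudonyms can be sketched as follows. This is not the implementation of the tool described here, whose details are not given in this excerpt. The log format (a `user=` field), the key, and the use of a keyed hash (HMAC-SHA256) are all assumptions made for the sake of a runnable example.

```python
import hashlib
import hmac
import re

# Hypothetical secret key; without it, pseudonyms cannot be recomputed from guessed names.
SECRET_KEY = b"replace-with-a-secret-key"

# Hypothetical log format in which the user name appears as "user=<name>".
USER_FIELD = re.compile(r"\buser=(\S+)")

def pseudonym(value, key=SECRET_KEY):
    """Map a name to a stable pseudonym using a keyed hash."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "u_" + digest[:12]

def pseudonymize_line(line, key=SECRET_KEY):
    """Replace every user name in a log line with its pseudonym."""
    return USER_FIELD.sub(lambda m: "user=" + pseudonym(m.group(1), key), line)

print(pseudonymize_line("10:02:11 login ok user=alice host=gw1"))
print(pseudonymize_line("10:05:43 ftp get user=alice file=report.txt"))
```

Because the same name always maps to the same pseudonym, per-user behavior profiles can still be built from the pseudonymized log. In this sketch, other fields such as host names and file names are left untouched and may still reveal who a user is.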

Anomaly detection tool description

We also implemented an intrusion detection tool using statistical methods to detect anomalies. The tool was specially adapted for analyzing log files from the proxy firewall described in Section 4.1.

The primary intention was for the program to be able to detect masqueraders and, to some extent, insiders. A masquerader can be defined as a person, either external or internal, who uses an account on the system for which he is not authorized. An insider is a legitimate user who misuses the system.

Tool evaluation and data analysis

In order to evaluate the tool, a number of experiments were carried out. The objective of the experiments was to evaluate the detection method and our tool and to investigate the problems of the method. We also wanted to determine whether statistical anomaly detection was a practicable method for analyzing log files generated by the system described in Section 4.1.

The experiments were carried out on authentic log files gathered over a ten-month period. The log files came from two different

Experienced problems and possible remedies

We experienced a variety of problems during the implementation of, and experiments with, our anomaly detection tool [11]. These are described and discussed in this section. Possible remedies are suggested for several of the problems.

Comparison to misuse detection

This section discusses whether the problems we experienced with our anomaly detection tool also apply to misuse detection.

Misuse detection is a method rather different from anomaly detection. A misuse detection tool looks for acts that are known to be intrusive, which gives more exact detection. This produces fewer false alarms and it is not necessary to visualize users' behavior to find out what has happened. The disadvantage is that the tool must be updated regularly, in order not to

Conclusions

There are many problems to be aware of when implementing or using an anomaly detection tool, some of which, but not all, can be remedied or reduced. The problem that poses the greatest threat to the applicability of this method is the many false alarms and the amount of time it takes to investigate them.

The privacy of the users in the system was also found to be threatened when our tool was used. Pseudonymization of user names reduces privacy problems considerably, but it is not a flawless

Emilie Lundin is a Ph.D. student at the Department of Computer Engineering at Chalmers University of Technology, Goteborg, Sweden. She is part of the Computer Security group and her main research interest is intrusion detection. Emilie Lundin received an MS in Computer Science and Engineering from Chalmers.

References (19)

H. Debar et al., Towards a taxonomy of intrusion-detection systems, Comput. Networks (1999)
J. Hochberg et al., NADIR: An automated system for detecting network intrusions and misuse, Comput. & Security (1993)
S. Axelsson, The base-rate fallacy and its implications for intrusion detection, in: Proceedings of the Sixth ACM...
D.E. Denning, An intrusion-detection model, IEEE Trans. Software Eng. (1987)
L.R. Halme, R.K. Bauer, AINT misbehaving – a taxonomy of anti-intrusion techniques, in: Proceedings of the 18th...
P. Helman, G. Liepins, Statistical foundations of audit trail analysis for the detection of computer misuse, IEEE...
H.S. Javitz, A. Valdes, The NIDES statistical component: Description and justification, Technical Report, SRI Computer...
T. Lane, C.E. Brodley, Temporal sequence learning and data reduction for anomaly detection, in: Proceedings of the...
T.F. Lunt, R. Jagannathan, A prototype real-time expert system, in: Proceedings of the IEEE Symposium on Security and...



Erland Jonsson is Professor of Computer Security and head of the Department of Computer Engineering at Chalmers University of Technology, Goteborg, Sweden. Prior to taking up his present post, he worked in industry for almost 20 years with hardware and software design and quality assurance for telecommunications and space applications. His research interests include intrusion detection, security mechanisms and quantitative security modelling. Jonsson received an MS in Electrical Engineering and a Ph.D. in Computer Engineering from Chalmers. He was a board member of the Special Interest Group for Security of the Swedish Information Processing Society. He is a member of the IEEE Computer Society and ACM.

This paper is a revised and extended version of “Privacy vs intrusion detection analysis”, Proceedings of the Second International Workshop on Recent Advances in Intrusion Detection (RAID'99) [10].
