ABSTRACT
Evaluating anomaly detectors is a crucial task in traffic monitoring, made particularly difficult by the lack of ground truth. The goal of the present article is to assist researchers in evaluating detectors by providing them with labeled anomalous traffic traces. We aim to automatically find anomalies in the MAWI archive using a new methodology that combines different and independent detectors. A key challenge is to compare the alarms raised by these detectors, even though they operate at different traffic granularities. The main contribution is a reliable graph-based methodology that combines the outputs of any anomaly detectors. We evaluated four unsupervised combination strategies; the best is based on dimensionality reduction. The synergy between anomaly detectors makes it possible to detect twice as many anomalies as the most accurate detector, and to reject numerous false positive alarms reported by the detectors. Significant anomalous traffic features are extracted from the reported alarms, hence the labels assigned to the MAWI archive are concise. The results on the MAWI traffic are publicly available and updated daily. This approach also allows the results of upcoming anomaly detectors to be included, so as to improve the quality and variety of the labels over time.
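The idea of comparing alarms from detectors that work at different granularities can be illustrated with a small sketch. It assumes each alarm has already been mapped to the set of traffic units (for example packet or flow identifiers) it designates, links alarms whose traffic overlaps, and then votes on each group of linked alarms. The function names, the Jaccard threshold, the detector labels, and the use of connected components with a simple majority vote are all assumptions made for illustration; they stand in for, and are not, the graph-based grouping and the four combination strategies evaluated in the article.

```python
from itertools import combinations


def jaccard(a, b):
    """Overlap between two sets of traffic units (e.g. packet or flow IDs)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_alarms(alarms, min_similarity=0.1):
    """Group alarms whose associated traffic overlaps.

    alarms: list of (detector_name, set_of_traffic_ids) tuples.
    Returns a list of groups, each a sorted list of alarm indices.
    min_similarity is a hypothetical threshold chosen for this example.
    """
    # Similarity graph as an adjacency list: one node per alarm.
    neighbors = {i: set() for i in range(len(alarms))}
    for i, j in combinations(range(len(alarms)), 2):
        if jaccard(alarms[i][1], alarms[j][1]) >= min_similarity:
            neighbors[i].add(j)
            neighbors[j].add(i)

    # Connected components via depth-first search (a stand-in for
    # whatever grouping the article actually applies to its graph).
    seen, groups = set(), []
    for start in neighbors:
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for nxt in neighbors[node] - seen:
                seen.add(nxt)
                stack.append(nxt)
        groups.append(sorted(component))
    return groups


def vote(alarms, group, n_detectors):
    """Accept a group if more than half of the detectors contributed to it."""
    detectors = {alarms[i][0] for i in group}
    return len(detectors) > n_detectors / 2


# Hypothetical alarms from three detectors with made-up labels.
alarms = [
    ("pca", {1, 2, 3, 4}),
    ("hough", {3, 4, 5}),
    ("kl", {4, 5, 6}),
    ("pca", {100, 101}),
]
groups = group_alarms(alarms)
decisions = [vote(alarms, g, n_detectors=3) for g in groups]
```

In this toy input the first three alarms share traffic and end up in one group supported by all three detectors, while the last alarm is isolated and supported by a single detector, so a majority vote would keep the former and reject the latter.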
Index Terms
- MAWILab: combining diverse anomaly detectors for automated anomaly labeling and performance benchmarking