Abstract
While anomaly detection systems typically work on single server, most commercial web sites operate cluster environments, and user queries trigger transactions scattered through multiple servers. For this reason, anomaly detectors in a same server farm should communicate with each other to integrate their partial profile. In this paper, we describe a real-time distributed anomaly detection system that can deal with over one billion transactions per day. In our system, base on Google MapReduce algorithm, an anomaly detector in each node shares profiles of user behaviors and propagates intruder information to reduce false alarms. We evaluated our system using web log data from www.microsoft.com. The web log data, about 250GB in size, contains over one billion transactions recorded in a day.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Dean, J., Ghemawat, S.: Mapreduce: Simplified data processing on large clusters. Operating Systems Design and Implementation, 137–149 (2004)
Ranger, C., Raghuraman, R., Penmetsa, A., Bradski, G., Kozyrakis, C.: Evaluating MapReduce for Multi-core and Multiprocessor Systems. In: Proceedings of the 13th Intl. Symposium on HPCA, Phoenix, AZ (February 2007)
Author information
Authors and Affiliations
Editor information
Rights and permissions
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lee, J., Cha, S. (2008). Page-Based Anomaly Detection in Large Scale Web Clusters Using Adaptive MapReduce (Extended Abstract). In: Lippmann, R., Kirda, E., Trachtenberg, A. (eds) Recent Advances in Intrusion Detection. RAID 2008. Lecture Notes in Computer Science, vol 5230. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87403-4_28
Download citation
DOI: https://doi.org/10.1007/978-3-540-87403-4_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-87402-7
Online ISBN: 978-3-540-87403-4
eBook Packages: Computer ScienceComputer Science (R0)