Abstract
Detecting malicious connection attempts and attacks against web-based applications is one of many approaches to protect the World Wide Web and its users.
In this paper, we present a generic method for detecting anomalous and potentially malicious web requests from the network’s point of view, without prior knowledge of the web-based application and without training data for it. The algorithm assumes that a legitimate request is an ordered sequence of semantic entities. A malicious request orders its entities differently or contains entities that deviate from the structure of the majority of requests. Our method learns a variable-order Markov model from legitimate sequences of semantic entities. If a sequence’s probability deviates from those of previously seen sequences, it is reported as anomalous.
Experiments were conducted on logs from a social networking web site. The results indicate that the proposed method achieves good detection rates at acceptable false-alarm rates.
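The idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the tokenization into entities (`entities`, `value_class`), the simplified escape-based back-off estimator (`BackoffMarkov`, without symbol exclusion), the maximum order of 2, the toy request lines, and the mean-minus-three-standard-deviations threshold are all assumptions chosen for illustration.

```python
import math
import statistics
from collections import defaultdict
from urllib.parse import urlsplit, parse_qsl


def value_class(value):
    """Coarse, hypothetical abstraction of a parameter value."""
    if value.isdigit():
        return "digits"
    if value.isalnum():
        return "alnum"
    return "other"


def entities(request_line):
    """Turn 'METHOD URL' into an ordered sequence of entity labels."""
    method, url = request_line.split(" ", 1)
    parts = urlsplit(url)
    seq = ["method:" + method]
    seq += ["path:" + seg for seg in parts.path.split("/") if seg]
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        seq += ["param:" + key, "value:" + value_class(value)]
    return seq + ["<end>"]


class BackoffMarkov:
    """Variable-order Markov model with a simplified escape back-off."""

    def __init__(self, max_order=2):
        self.max_order = max_order
        self.counts = defaultdict(lambda: defaultdict(int))
        self.vocab = set()

    def update(self, seq):
        # Count each symbol under every context of length 0..max_order.
        for i, sym in enumerate(seq):
            self.vocab.add(sym)
            for k in range(min(i, self.max_order) + 1):
                self.counts[tuple(seq[i - k:i])][sym] += 1

    def prob(self, ctx, sym):
        # Try the longest context first; on a miss, pay an escape
        # probability and retry with a shorter context.
        p_escape = 1.0
        while True:
            table = self.counts.get(ctx)
            if table:
                total, distinct = sum(table.values()), len(table)
                if sym in table:
                    return p_escape * table[sym] / (total + distinct)
                p_escape *= distinct / (total + distinct)
            if not ctx:
                # Never-seen symbol: spread the remaining mass uniformly.
                return p_escape / (len(self.vocab) + 1)
            ctx = ctx[1:]

    def score(self, seq):
        # Average log-probability per entity, so that sequences of
        # different lengths are comparable.
        logp = 0.0
        for i, sym in enumerate(seq):
            ctx = tuple(seq[max(0, i - self.max_order):i])
            logp += math.log(self.prob(ctx, sym))
        return logp / len(seq)


# Toy "legitimate" traffic (made up for this example).
training = [
    "GET /profile/view?id=17",
    "GET /profile/view?id=2048",
    "GET /profile/view?id=5",
    "GET /friends/list?id=17&page=2",
    "GET /friends/list?id=99&page=1",
]
model = BackoffMarkov(max_order=2)
for line in training:
    model.update(entities(line))

# Assumed decision rule: flag scores far below those seen so far.
baseline = [model.score(entities(line)) for line in training]
threshold = statistics.mean(baseline) - 3 * statistics.pstdev(baseline)


def is_anomalous(request_line):
    return model.score(entities(request_line)) < threshold


print(is_anomalous("GET /profile/view?id=33"))
print(is_anomalous("GET /profile/view?id=1%27%20OR%20%271%27=%271"))
```

In this sketch the model is fitted in one pass over a fixed list; an on-line variant would call `update` on each request judged legitimate and maintain the score statistics incrementally.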
Copyright information
© 2012 ICST Institute for Computer Science, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Lampesberger, H., Winter, P., Zeilinger, M., Hermann, E. (2012). An On-Line Learning Statistical Model to Detect Malicious Web Requests. In: Rajarajan, M., Piper, F., Wang, H., Kesidis, G. (eds) Security and Privacy in Communication Networks. SecureComm 2011. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 96. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31909-9_2
DOI: https://doi.org/10.1007/978-3-642-31909-9_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31908-2
Online ISBN: 978-3-642-31909-9
eBook Packages: Computer Science (R0)