Abstract
This paper presents a novel approach to pattern mining and outlier detection in Web usage logs. The approach combines kernel methods with fuzzy clustering. Web log records are treated as vectors with numeric and nominal attributes. A special kernel maps these vectors into a high-dimensional feature space, where a possibilistic clustering method computes a measure of “typicalness” for each vector. If the value of this measure for a record falls below a specified threshold, the record is labeled an outlier. Records with high “typicalness” are treated as access patterns of user activity. The performance of the approach is demonstrated experimentally.
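The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the kernel or the clustering details, so everything below is an assumption made for illustration. The hypothetical `mixed_kernel` is a Gaussian kernel over the squared distance between numeric attributes plus the number of mismatching nominal attributes, and the hypothetical `typicalness` function fits a single possibilistic cluster in feature space using a membership of the form 1 / (1 + d² / η), with η fixed from the first pass. The records, the `gamma` value, and the 0.5 threshold are made up.

```python
import math

def mixed_kernel(a, b, gamma=0.5):
    """Assumed kernel for records of the form (numeric_values, nominal_values).

    Gaussian kernel on squared numeric distance plus the count of
    mismatching nominal attributes. Not the kernel used in the paper.
    """
    sq_dist = sum((x - y) ** 2 for x, y in zip(a[0], b[0]))
    mismatches = sum(x != y for x, y in zip(a[1], b[1]))
    return math.exp(-gamma * (sq_dist + mismatches))

def typicalness(records, kernel=mixed_kernel, n_iter=20):
    """Typicalness of each record with respect to one cluster in feature space."""
    n = len(records)
    K = [[kernel(a, b) for b in records] for a in records]
    u = [1.0] * n          # start with every record fully typical
    eta = None             # distance scale, fixed after the first pass (assumption)
    for _ in range(n_iter):
        total = sum(u)
        w = [ui / total for ui in u]
        # Squared feature-space distance to the weighted center sum_j w_j * phi(x_j):
        # d2_i = K_ii - 2 * sum_j w_j K_ij + sum_jk w_j w_k K_jk
        Kw = [sum(K[i][j] * w[j] for j in range(n)) for i in range(n)]
        wKw = sum(w[i] * Kw[i] for i in range(n))
        d2 = [max(K[i][i] - 2.0 * Kw[i] + wKw, 0.0) for i in range(n)]
        if eta is None:
            eta = max(sum(d2) / n, 1e-12)
        u = [1.0 / (1.0 + d / eta) for d in d2]
    return u

# Made-up records: (scaled numeric attributes, nominal attributes).
records = [
    ((0.10, 0.20), ("GET", "200")),
    ((0.15, 0.25), ("GET", "200")),
    ((0.12, 0.18), ("GET", "200")),
    ((0.20, 0.20), ("GET", "200")),
    ((0.11, 0.22), ("GET", "200")),
    ((3.00, 2.50), ("POST", "404")),
]

scores = typicalness(records)
threshold = 0.5  # illustrative value, not taken from the paper
outliers = [i for i, s in enumerate(scores) if s < threshold]
patterns = [i for i, s in enumerate(scores) if s >= threshold]
print("outliers:", outliers)
print("patterns:", patterns)
```

In this sketch the center of the cluster is never computed explicitly; all distances are expressed through kernel values, which is what allows the clustering to run in the feature space.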
© 2003 Springer-Verlag Berlin Heidelberg
Cite this paper
Petrovskiy, M. (2003). A Hybrid Method for Patterns Mining and Outliers Detection in the Web Usage Log. In: Menasalvas, E., Segovia, J., Szczepaniak, P.S. (eds) Advances in Web Intelligence. AWIC 2003. Lecture Notes in Computer Science, vol 2663. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44831-4_33
DOI: https://doi.org/10.1007/3-540-44831-4_33
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40124-7
Online ISBN: 978-3-540-44831-0
eBook Packages: Springer Book Archive