Abstract
Traditional Web usage mining techniques aim to discover usage patterns from Web data at the page level, while little work has been done at higher levels. In this paper, we propose a novel approach to characterizing Internet users' preferences and interests at the domain name level. We summarize users' domain name access behavior as co-occurrences of users and the domain names they target, and introduce an aspect model that classifies users and domain names into groups according to these co-occurrences. Each group is then characterized by extracting the properties of its characteristic users and domain names. Experimental results on real-world data sets show that the approach is effective: it identifies some meaningful groups. The approach could therefore be used to detect unusual behavior on the Internet at the domain name level, which reduces the work of searching the joint space of users and domain names.
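To make the idea of an aspect model over user and domain name co-occurrences concrete, the following is a minimal sketch of the standard symmetric PLSA formulation, P(u, d) = Σ_z P(z) P(u|z) P(d|z), fitted with EM. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not give the exact parameterization, initialization, or stopping rule, and the function names (`plsa`, `top_items`), the toy count matrix, and the `example.*` domain names are all hypothetical.

```python
import random


def plsa(counts, n_groups, n_iter=50, seed=0):
    """Fit P(u, d) = sum_z P(z) P(u|z) P(d|z) by EM.

    counts[u][d] is the number of times user u queried domain name d.
    Returns (P(z), P(u|z), P(d|z)) as nested lists.
    """
    rng = random.Random(seed)
    n_users, n_domains = len(counts), len(counts[0])

    def random_dist(n):
        # Random positive weights, normalized to sum to 1.
        w = [rng.random() + 1e-3 for _ in range(n)]
        s = sum(w)
        return [x / s for x in w]

    p_z = [1.0 / n_groups] * n_groups
    p_u_z = [random_dist(n_users) for _ in range(n_groups)]
    p_d_z = [random_dist(n_domains) for _ in range(n_groups)]

    for _ in range(n_iter):
        new_z = [0.0] * n_groups
        new_u = [[0.0] * n_users for _ in range(n_groups)]
        new_d = [[0.0] * n_domains for _ in range(n_groups)]
        for u in range(n_users):
            for d in range(n_domains):
                n = counts[u][d]
                if n == 0:
                    continue
                # E-step: posterior P(z | u, d) for this observed pair.
                joint = [p_z[z] * p_u_z[z][u] * p_d_z[z][d]
                         for z in range(n_groups)]
                total = sum(joint)
                if total == 0.0:
                    continue
                for z in range(n_groups):
                    r = n * joint[z] / total
                    new_z[z] += r
                    new_u[z][u] += r
                    new_d[z][d] += r
        # M-step: renormalize the expected counts.
        grand = sum(new_z)
        for z in range(n_groups):
            if new_z[z] > 0:
                p_u_z[z] = [x / new_z[z] for x in new_u[z]]
                p_d_z[z] = [x / new_z[z] for x in new_d[z]]
            p_z[z] = new_z[z] / grand
    return p_z, p_u_z, p_d_z


def top_items(dist, names, k=2):
    """Return the k names with the highest probability in dist."""
    ranked = sorted(range(len(dist)), key=lambda i: dist[i], reverse=True)
    return [names[i] for i in ranked[:k]]


# Toy data (hypothetical): 4 users x 4 domain names.
domains = ["example.com", "example.org", "example.net", "example.edu"]
counts = [
    [5, 4, 0, 0],
    [6, 3, 0, 0],
    [0, 0, 4, 5],
    [0, 0, 3, 6],
]
p_z, p_u_z, p_d_z = plsa(counts, n_groups=2)
for z in range(2):
    print(z, round(p_z[z], 3), top_items(p_d_z[z], domains))
```

Ranking users by P(u|z) and domain names by P(d|z) within each group is one plausible reading of "characteristic users and domain names". EM only finds a local optimum, so results on real data can depend on the random seed.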
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yuchi, X., Lee, X., Jin, J., Yan, B. (2010). Modeling DNS Activities Based on Probabilistic Latent Semantic Analysis. In: Cao, L., Zhong, J., Feng, Y. (eds) Advanced Data Mining and Applications. ADMA 2010. Lecture Notes in Computer Science, vol 6441. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17313-4_29
DOI: https://doi.org/10.1007/978-3-642-17313-4_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17312-7
Online ISBN: 978-3-642-17313-4
eBook Packages: Computer Science (R0)