Abstract
Online review analysis has recently become a hot research topic. Most existing work focuses on review summarization, aspect identification, or opinion mining from an item’s point of view, such as the quality and popularity of products. Because the authors of review texts may pay different amounts of attention to different domain-based product aspects according to their own interests, in this paper we aim to learn K user groups with specific aspect interests, as indicated by their review writing. Identifying such K user groups facilitates a better understanding of customers’ interests, which is crucial for applications such as product improvement through customer-oriented design or diverse marketing strategies. Instead of using a traditional text clustering approach, we treat the clusterId as a hidden variable and use a permutation-based structural topic model called KMM. Through this model, we infer the distribution of the K groups by discovering not only how frequently reviewers mention each product aspect, but also the priority order in which the respective aspects occur. Experiments on several real-world review datasets show that the approach is competitive.
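To make the two signals named in the abstract concrete, the following minimal Python sketch computes, for a single review, the frequency of each aspect and the order in which aspects first occur, and then compares two orderings by counting discordant pairs (a Kendall tau style distance). This is only an illustration of the kind of input such a model might consume, not the KMM model itself; the function names, the aspect labels, and the assumption that each sentence already carries one aspect label are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def aspect_profile(aspect_sequence):
    """Return (frequency, priority order) for one review.

    aspect_sequence: list of aspect labels, one per sentence, in reading
    order (assumed to be given; aspect extraction is out of scope here).
    """
    frequency = Counter(aspect_sequence)
    # dict.fromkeys keeps only the first occurrence of each label, in order.
    priority = list(dict.fromkeys(aspect_sequence))
    return frequency, priority


def kendall_tau_distance(order_a, order_b):
    """Count aspect pairs that the two orderings rank in opposite order.

    Only aspects that appear in both orderings are compared.
    """
    rank_b = {aspect: i for i, aspect in enumerate(order_b)}
    shared = [aspect for aspect in order_a if aspect in rank_b]
    return sum(1 for x, y in combinations(shared, 2) if rank_b[x] > rank_b[y])


# Hypothetical reviews, already reduced to one aspect label per sentence.
review_1 = ["price", "battery", "price", "screen"]
review_2 = ["screen", "price", "battery", "battery"]

freq_1, order_1 = aspect_profile(review_1)
freq_2, order_2 = aspect_profile(review_2)
distance = kendall_tau_distance(order_1, order_2)
```

Two reviewers with similar aspect frequencies can still differ in priority order, which is why the sketch keeps the two signals separate.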
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Si, J., Li, Q., Qian, T., Deng, X. (2012). Discovering K Web User Groups with Specific Aspect Interests. In: Perner, P. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2012. Lecture Notes in Computer Science, vol 7376. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31537-4_25
DOI: https://doi.org/10.1007/978-3-642-31537-4_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31536-7
Online ISBN: 978-3-642-31537-4
eBook Packages: Computer Science (R0)