DOI: 10.1145/2661829.2662062

Latent Aspect Mining via Exploring Sparsity and Intrinsic Information

Published: 03 November 2014

Abstract

We investigate the latent aspect mining problem, which aims to automatically discover aspect information from a collection of review texts in a domain, in an unsupervised manner. One goal is to discover a set of aspects that are previously unknown for the domain and to predict the user's rating on each aspect for each review. Another goal is to detect key terms for each aspect. Existing work on predicting aspect ratings fails to handle the aspect sparsity problem in review texts, which leads to unreliable predictions. We propose a new generative model to tackle the latent aspect mining problem in an unsupervised manner. By considering the user and item side information of review texts, we introduce two latent variables, namely user intrinsic aspect interest and item intrinsic aspect quality, which facilitate better modeling of aspect generation and improve the accuracy and reliability of predicted aspect ratings. Furthermore, we provide an analytical investigation of the Maximum A Posteriori (MAP) optimization problem used in our proposed model and develop a new block coordinate gradient descent algorithm that solves it efficiently with closed-form updating formulas. We also analyze its convergence. Experimental results on two real-world product review corpora demonstrate that our proposed model outperforms existing state-of-the-art models.
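
The abstract does not give the model's objective or its update formulas, so the following is only a generic illustration of the idea of coordinate-wise descent with closed-form updates on a sparsity-inducing MAP problem. It assumes a stand-in objective (least squares with an L1 penalty, i.e. a Gaussian likelihood with a Laplace prior), not the paper's generative model; the names `soft_threshold`, `objective`, and `coordinate_descent` are hypothetical.

```python
def soft_threshold(z, t):
    """Closed-form minimizer over x of 0.5 * (x - z)**2 + t * |x|."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def objective(X, y, w, lam):
    """Stand-in negative log-posterior: 0.5 * ||y - Xw||^2 + lam * ||w||_1."""
    resid = [yi - sum(xij * wj for xij, wj in zip(row, w))
             for row, yi in zip(X, y)]
    return 0.5 * sum(r * r for r in resid) + lam * sum(abs(wj) for wj in w)


def coordinate_descent(X, y, lam, n_sweeps=100):
    """Cycle over coordinates, solving each 1-D subproblem in closed form."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    # Squared norm of each column; it scales the 1-D quadratic in w[j].
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(d)]
    history = [objective(X, y, w, lam)]
    for _ in range(n_sweeps):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Correlation of column j with the residual that excludes w[j].
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * w[k] for k in range(d) if k != j))
                for i in range(n)
            )
            w[j] = soft_threshold(rho, lam) / col_sq[j]
        history.append(objective(X, y, w, lam))
    return w, history


# Toy data (made up for illustration): an orthogonal design.
X = [[1, 0], [0, 1]]
y = [3, 0.5]
w, history = coordinate_descent(X, y, lam=1.0, n_sweeps=5)
print(w)  # [2.0, 0.0]: the weak second coefficient is shrunk exactly to zero
```

Because each coordinate update exactly minimizes the objective along that coordinate, the recorded objective values never increase from one sweep to the next, which is the kind of property a convergence analysis of such a scheme typically builds on.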

      Published In

      CIKM '14: Proceedings of the 23rd ACM International Conference on Information and Knowledge Management
      November 2014
      2152 pages
      ISBN:9781450325981
      DOI:10.1145/2661829

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. aspect mining
      2. sparse coding
      3. topic model

      Qualifiers

      • Research-article

      Acceptance Rates

      CIKM '14 Paper Acceptance Rate 175 of 838 submissions, 21%;
      Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

      Cited By

      • (2022) Fitness-Based Grey Wolf Optimizer Clustering Method for Spam Review Detection. Mathematical Problems in Engineering, 2022, pp. 1-15. DOI: 10.1155/2022/6499918. Online publication date: 29-Apr-2022
      • (2022) Detection of spam reviews using hybrid grey wolf optimizer clustering method. Multimedia Tools and Applications, 81:27, pp. 38623-38641. DOI: 10.1007/s11042-022-12848-6. Online publication date: 1-Nov-2022
      • (2019) Sparsemax and Relaxed Wasserstein for Topic Sparsity. Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, pp. 141-149. DOI: 10.1145/3289600.3290957. Online publication date: 30-Jan-2019
      • (2018) The Definition, Current Situation and Development Trend of Latent Aspect Rating Analysis in Text Mining. Proceedings of the 2018 International Conference on Computing and Pattern Recognition, pp. 21-26. DOI: 10.1145/3232829.3232833. Online publication date: 23-Jun-2018
      • (2018) Aspect opinion expression and rating prediction via LDA–CRF hybrid. Natural Language Engineering, 24:4, pp. 611-639. DOI: 10.1017/S135132491800013X. Online publication date: 22-Apr-2018
      • (2018) Learning multiple layers of knowledge representation for aspect based sentiment analysis. Data & Knowledge Engineering, 114, pp. 26-39. DOI: 10.1016/j.datak.2017.06.001. Online publication date: Mar-2018
      • (2016) Determing Aspect Ratings and Aspect Weights from Textual Reviews by Using Neural Network with Paragraph Vector Model. Computational Social Networks, pp. 309-320. DOI: 10.1007/978-3-319-42345-6_27. Online publication date: 12-Jul-2016
      • (2015) Central Topic Model for Event-oriented Topics Mining in Microblog Stream. Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, pp. 1611-1620. DOI: 10.1145/2806416.2806561. Online publication date: 17-Oct-2015
      • (2015) A least square based model for rating aspects and identifying important aspects on review text data. 2015 2nd National Foundation for Science and Technology Development Conference on Information and Computer Science (NICS), pp. 265-270. DOI: 10.1109/NICS.2015.7302204. Online publication date: Sep-2015
