research-article

Latent Aspect Mining via Exploring Sparsity and Intrinsic Information

Authors:

Anthony Man-Cho So

CIKM '14: Proceedings of the 23rd ACM International Conference on Information and Knowledge Management

Pages 879 - 888

https://doi.org/10.1145/2661829.2662062

Published: 03 November 2014

Abstract

We investigate the latent aspect mining problem, which aims to automatically discover aspect information from a collection of review texts in a domain in an unsupervised manner. One goal is to discover a set of aspects that are previously unknown for the domain and to predict the user's rating on each aspect for each review. Another goal is to detect key terms for each aspect. Existing work on predicting aspect ratings fails to handle the aspect sparsity problem in review texts, which leads to unreliable predictions. We propose a new generative model to tackle the latent aspect mining problem in an unsupervised manner. By considering the user and item side information of review texts, we introduce two latent variables, namely user intrinsic aspect interest and item intrinsic aspect quality, which facilitate better modeling of aspect generation and improve the accuracy and reliability of the predicted aspect ratings. Furthermore, we provide an analytical investigation of the Maximum A Posteriori (MAP) optimization problem used in our proposed model and develop a new block coordinate gradient descent algorithm that solves the optimization efficiently with closed-form updating formulas. We also analyze its convergence. Experimental results on two real-world product review corpora demonstrate that our proposed model outperforms existing state-of-the-art models.
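
As an illustration of the closed-form coordinate updates mentioned in the abstract, the following is a minimal sketch of cyclic coordinate descent on an l1-regularized least-squares (lasso) objective. It is not the paper's generative model or its block coordinate gradient descent algorithm. The lasso objective, the function names `soft_threshold` and `lasso_coordinate_descent`, and the parameters `lam` and `n_sweeps` are assumptions chosen only to show how an l1 penalty can yield a closed-form update that sets small coefficients exactly to zero.

```python
def soft_threshold(value, threshold):
    """Closed-form minimizer of 0.5*(x - value)**2 + threshold*abs(x)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=200):
    """Minimize 0.5*||y - Xw||^2 + lam*||w||_1 by cyclic coordinate updates.

    X is a list of rows, y a list of targets (illustrative interface only).
    """
    n_rows, n_cols = len(X), len(X[0])
    w = [0.0] * n_cols
    residual = list(y)  # residual = y - Xw, and w starts at zero
    for _ in range(n_sweeps):
        for j in range(n_cols):
            col = [X[i][j] for i in range(n_rows)]
            col_sq = sum(c * c for c in col)
            if col_sq == 0.0:
                continue  # an all-zero column cannot change the fit
            # Correlation of column j with the residual that excludes w[j].
            rho = sum(c * (r + c * w[j]) for c, r in zip(col, residual))
            # Closed-form update for coordinate j, all others held fixed.
            new_wj = soft_threshold(rho, lam) / col_sq
            delta = new_wj - w[j]
            if delta != 0.0:
                residual = [r - c * delta for r, c in zip(residual, col)]
                w[j] = new_wj
    return w


if __name__ == "__main__":
    # With an identity design matrix, each weight is the soft-thresholded target.
    print(lasso_coordinate_descent([[1.0, 0.0], [0.0, 1.0]], [3.0, 0.5], lam=1.0))
    # prints [2.0, 0.0]
```

In this toy run the second coefficient is driven exactly to zero, which is the kind of sparsity an l1 penalty encourages. A block variant would update a group of coordinates at a time instead of a single one.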

Index Terms

Latent Aspect Mining via Exploring Sparsity and Intrinsic Information
1. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
      1. Document filtering
      2. Information extraction

Published In

CIKM '14: Proceedings of the 23rd ACM International Conference on Information and Knowledge Management

November 2014

2152 pages

ISBN:9781450325981

DOI:10.1145/2661829

General Chairs: Jianzhong Li (Harbin Inst. of Technology); X. Sean Wang (Fudan University)

Program Chairs: Minos Garofalakis (Technical University of Crete, Greece); Ian Soboroff (National Institute of Standards, USA); Torsten Suel (New York University, USA); Min Wang (Google Research, USA)

Copyright © 2014 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

CIKM '14

CIKM '14: 2014 ACM Conference on Information and Knowledge Management

November 3 - 7, 2014

Shanghai, China

Acceptance Rates

CIKM '14 Paper Acceptance Rate 175 of 838 submissions, 21%;

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Bibliometrics

Total Citations: 9
Total Downloads: 325
Downloads (Last 12 months): 5
Downloads (Last 6 weeks): 1

Reflects downloads up to 05 Mar 2025

Cited By

Shringi S, Sharma H, Suthar D (2022). Fitness-Based Grey Wolf Optimizer Clustering Method for Spam Review Detection. Mathematical Problems in Engineering, 2022, pages 1-15. Online publication date: 29-Apr-2022.
https://doi.org/10.1155/2022/6499918
Shringi S, Sharma H (2022). Detection of spam reviews using hybrid grey wolf optimizer clustering method. Multimedia Tools and Applications, 81:27, pages 38623-38641. Online publication date: 1-Nov-2022.
https://dl.acm.org/doi/10.1007/s11042-022-12848-6
Lin T, Hu Z, Guo X, Culpepper J, Moffat A, Bennett P, Lerman K (2019). Sparsemax and Relaxed Wasserstein for Topic Sparsity. Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, pages 141-149. Online publication date: 30-Jan-2019.
https://dl.acm.org/doi/10.1145/3289600.3290957
Sun S, Wang K, Zhang T (2018). The Definition, Current Situation and Development Trend of Latent Aspect Rating Analysis in Text Mining. Proceedings of the 2018 International Conference on Computing and Pattern Recognition, pages 21-26. Online publication date: 23-Jun-2018.
https://dl.acm.org/doi/10.1145/3232829.3232833
Laddha A, Mukherjee A (2018). Aspect opinion expression and rating prediction via LDA–CRF hybrid. Natural Language Engineering, 24:4, pages 611-639. Online publication date: 22-Apr-2018.
https://doi.org/10.1017/S135132491800013X
Pham D, Le A (2018). Learning multiple layers of knowledge representation for aspect based sentiment analysis. Data & Knowledge Engineering, 114, pages 26-39. Online publication date: Mar-2018.
https://doi.org/10.1016/j.datak.2017.06.001
Pham D, Le A, Nguyen T (2016). Determing Aspect Ratings and Aspect Weights from Textual Reviews by Using Neural Network with Paragraph Vector Model. Computational Social Networks, pages 309-320. Online publication date: 12-Jul-2016.
https://doi.org/10.1007/978-3-319-42345-6_27
Peng M, Zhu J, Li X, Huang J, Wang H, Zhang Y, Bailey J, Moffat A, Aggarwal C, de Rijke M, Kumar R, Murdock V, Sellis T, Yu J (2015). Central Topic Model for Event-oriented Topics Mining in Microblog Stream. Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, pages 1611-1620. Online publication date: 17-Oct-2015.
https://dl.acm.org/doi/10.1145/2806416.2806561
Pham D, Le A, Le T (2015). A least square based model for rating aspects and identifying important aspects on review text data. 2015 2nd National Foundation for Science and Technology Development Conference on Information and Computer Science (NICS), pages 265-270. Online publication date: Sep-2015.
https://doi.org/10.1109/NICS.2015.7302204
