Research article (DOI: 10.1145/3154979.3154987)

Business: Do you wanna sell more? Discovering Topics, Sentiments and Prediction of Ratings

Published: 24 November 2017

Abstract

In the era of social computing, customer reviews and ratings can be instrumental in predicting the success and sustainability of a business, because customers and even competitors use them to judge its quality. Yelp is one of the most popular websites for writing such reviews. A rating, however, can be subjective and biased by the reviewer's personality, and a user's business preferences can be inferred from their past reviews. In this paper, we (i) uncover latent topics in Yelp data based on positive and negative reviews, using topic modeling to learn which topics occur most frequently in customer reviews; (ii) perform sentiment analysis of users' reviews to learn how these topics relate to a positive or negative rating, which can help businesses improve their offers and services; and (iii) predict unbiased ratings from user-generated review text alone, using a linear regression model. We also perform data analysis to gain deeper insight into customer reviews.
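To make step (iii) concrete, the following is a minimal sketch of predicting a star rating from review text with linear regression. The abstract does not specify the features or the fitting procedure, so the bag-of-words counts, the gradient-descent fit, all function names, and the five toy reviews are assumptions made purely for illustration; none of them come from the paper or from the Yelp dataset.

```python
from collections import Counter

# Hypothetical toy reviews with star ratings (illustrative only, not Yelp data).
REVIEWS = [
    ("great food and friendly staff", 5.0),
    ("friendly staff and great service", 5.0),
    ("terrible food and rude staff", 1.0),
    ("rude staff and terrible service", 1.0),
    ("food was okay", 3.0),
]


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, deliberately simple tokenizer)."""
    return text.lower().split()


def build_vocabulary(texts):
    """Map each distinct token to a column index."""
    tokens = sorted({tok for text in texts for tok in tokenize(text)})
    return {tok: i for i, tok in enumerate(tokens)}


def featurize(text, vocab):
    """Turn a review into a vector of word counts; unknown words are ignored."""
    vec = [0.0] * len(vocab)
    for tok, count in Counter(tokenize(text)).items():
        if tok in vocab:
            vec[vocab[tok]] = float(count)
    return vec


def fit_linear_regression(X, y, lr=0.05, epochs=2000):
    """Fit weights and a bias by batch gradient descent on mean squared error."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, xi)) + b - yi
            grad_b += err
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
        b -= lr * grad_b / n
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
    return w, b


def predict(text, vocab, w, b):
    """Predict a (real-valued) star rating for a review."""
    x = featurize(text, vocab)
    return sum(wj * xj for wj, xj in zip(w, x)) + b


texts = [text for text, _ in REVIEWS]
stars = [rating for _, rating in REVIEWS]
vocab = build_vocabulary(texts)
X = [featurize(text, vocab) for text in texts]
w, b = fit_linear_regression(X, stars)

positive = predict("great friendly staff", vocab, w, b)
negative = predict("terrible rude staff", vocab, w, b)
print(f"positive-sounding review: {positive:.2f}")
print(f"negative-sounding review: {negative:.2f}")
```

On this toy data the model learns positive weights for words such as "great" and negative weights for words such as "rude", so the first review scores higher than the second. A real pipeline would likely add stop-word removal, stemming, and a held-out evaluation set, which this sketch omits.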


Published In

ICCCT-2017: Proceedings of the 7th International Conference on Computer and Communication Technology
November 2017
157 pages
ISBN:9781450353243
DOI:10.1145/3154979
© 2017 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Topic modeling
  2. Yelp reviews
  3. data visualization
  4. machine learning
  5. predictive analysis
  6. sentiment analysis
  7. text mining

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICCCT-2017

Acceptance Rates

ICCCT-2017 Paper Acceptance Rate 33 of 124 submissions, 27%;
Overall Acceptance Rate 33 of 124 submissions, 27%
