A probabilistic rating inference framework for mining user preferences from reviews

Leung, Cane Wing-ki; Chan, Stephen Chi-fai; Chung, Fu-lai; Ngai, Grace

doi:10.1007/s11280-011-0117-5

Published: 17 February 2011

World Wide Web, Volume 14, pages 187–215 (2011)

Abstract

We propose a novel Probabilistic Rating infErence Framework, known as Pref, for mining user preferences from reviews and then mapping such preferences onto numerical rating scales. Pref applies existing linguistic processing techniques to extract opinion words and product features from reviews. It then estimates the sentimental orientations (SO) and strengths of the opinion words using our proposed relative-frequency-based method. This method allows semantically similar words to have different SO, thereby addressing a major limitation of existing methods. Pref takes the intuitive relationships between class labels, which are scalar ratings, into consideration when assigning ratings to reviews. Empirical results validate the effectiveness of Pref against several related algorithms, and suggest that Pref can produce reasonably good results using a small training corpus. We also describe a useful application of Pref as a rating inference framework. Rating inference transforms user preferences described as natural language texts into numerical rating scales. This allows Collaborative Filtering (CF) algorithms, which operate mostly on databases of scalar ratings, to utilize textual reviews as an additional source of user preferences. We integrated Pref with a classical CF algorithm, and empirically demonstrated the advantages of using rating inference to augment ratings for CF.
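
The relative-frequency idea in the abstract can be illustrated with a minimal sketch. It is not the paper's formulation: the abstract does not give the exact estimator, so the code below assumes that a word's strength for a rating is simply the share of that word's occurrences found in training reviews carrying that rating, and that a review is assigned the rating with the highest average strength. The function names `estimate_strengths` and `infer_rating` and the toy data are hypothetical, and the sketch ignores the ordinal relationships between ratings that Pref takes into account.

```python
from collections import Counter, defaultdict


def estimate_strengths(labeled_reviews, ratings):
    """Return {word: {rating: share}} from (opinion_words, rating) pairs.

    Assumption: strength = relative frequency of the word across rating
    classes, so the shares for each word sum to 1.
    """
    counts = defaultdict(Counter)
    for words, rating in labeled_reviews:
        for word in words:
            counts[word][rating] += 1
    strengths = {}
    for word, per_rating in counts.items():
        total = sum(per_rating.values())
        strengths[word] = {r: per_rating[r] / total for r in ratings}
    return strengths


def infer_rating(words, strengths, ratings):
    """Pick the rating with the highest average strength over known words.

    Returns None when no word of the review was seen in training.
    """
    known = [w for w in words if w in strengths]
    if not known:
        return None
    scores = {
        r: sum(strengths[w][r] for w in known) / len(known) for r in ratings
    }
    return max(ratings, key=lambda r: scores[r])


# Toy training data (hypothetical): opinion words already extracted.
ratings = [1, 2, 3, 4, 5]
train = [
    (["great", "fun"], 5),
    (["great"], 4),
    (["boring"], 1),
    (["boring", "fun"], 2),
    (["great"], 5),
]
strengths = estimate_strengths(train, ratings)
predicted = infer_rating(["great", "fun"], strengths, ratings)
```

Because each word carries its own distribution over ratings, two near-synonyms can end up with different orientations if reviewers use them at different rating levels, which is the property the abstract highlights. The inferred ratings could then be added to a ratings database consumed by a CF algorithm.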

References

Adomavicius, G., Kwon, Y.: New recommendation techniques for multicriteria rating systems. IEEE Intell. Syst. 22(3), 48–55 (2007)
Agrawal, R., Imielinski, T., Swami, A.: Mining association rules between sets of items in large databases. In: ACM SIGMOD International Conference on Management of Data, pp. 207–216 (1993)
Breese, J. S., Heckerman, D., Kadie, C.: Empirical analysis of predictive algorithms for collaborative filtering. In: 14th Conference on Uncertainty in Artificial Intelligence, pp. 43–52 (1998)
Bruce, R., Wiebe, J.: Recognizing subjectivity: a case study of manual tagging. Nat. Lang. Eng. 5(2), 187–205 (1999)
Chesley, P., Vincent, B., Xu, L., Srihari, R.: Using verbs and adjectives to automatically classify blog sentiment. In: Proc. of the Spring Symposia on Computational Approaches to Analyzing Weblogs (2006)
Das, S., Chen, M.: Yahoo! for Amazon: extracting market sentiment from stock message boards. In: Asia Pacific Finance Association Annual Conference (2001)
Dave, K., Lawrence, S., Pennock, D.M.: Mining the peanut gallery: opinion extraction and semantic classification of product reviews. In: 12th International World Wide Web Conference, pp. 519–528 (2003)
Esuli, A., Sebastiani, F.: Determining the semantic orientation of terms through gloss classification. In: ACM International Conference on Information and Knowledge Management (CIKM), pp. 617–624 (2005)
Esuli, A., Sebastiani, F.: SentiWordNet: a publicly available lexical resource for opinion mining. In: 5th International Conference on Language Resources and Evaluation (LREC) (2006)
Gamon, M., Aue, A., Corston-Oliver, S., Ringger, E.K.: Pulse: mining customer opinions from free text. In: 6th International Symposium on Intelligent Data Analysis, pp. 121–132 (2005)
Goldberg, A.B., Zhu, X.: Seeing stars when there aren’t many stars: graph-based semi-supervised learning for sentiment categorization. In: Proc. of the HLT-NAACL Workshop on TextGraphs: Graph-based Algorithms for Natural Language Processing, pp. 45–52 (2006)
Hatzivassiloglou, V., McKeown, K.R.: Predicting the semantic orientation of adjectives. In: 8th Conference on European Chapter of the Association for Computational Linguistics, pp. 174–181 (1997)
Herlocker, J., Konstan, J., Riedl, J.: An empirical analysis of design choices in neighborhood-based collaborative filtering algorithms. Inf. Retr. 5, 287–310 (2002)
Hu, M., Liu, B.: Mining and summarizing customer reviews. In: 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 168–177 (2004)
Hu, M., Liu, B.: Mining opinion features in customer reviews. In: 19th National Conference on Artificial Intelligence, pp. 755–760 (2004)
Jindal, N., Liu, B.: Identifying comparative sentences in text documents. In: 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 244–251 (2006)
Joachims, T.: Making large-scale support vector machine learning practical. In: Scholkopf, B., Burges, C., Smola, A. (eds.) Advances in Kernel Methods—Support Vector Learning, pp. 41–56. MIT Press (1999)
Kaji, N., Kitsuregawa, M.: Building lexicon for sentiment analysis from massive collection of HTML documents. In: 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pp. 1075–1083 (2007)
Kamps, J., Marx, M., Mokken, R.J., de Rijke, M.: Using WordNet to measure semantic orientations of adjectives. In: 4th International Conference on Language Resources and Evaluation (LREC), pp. 1115–1118 (2004)
Kim, S.-M., Hovy, E.: Determining the sentiment of opinions. In: Conference on Computational Linguistics, pp. 1367–1373 (2004)
Leung, C.W.K., Chan, S.C.F., Chung, F.L.: Integrating collaborative filtering and sentiment analysis: a rating inference approach. In: ECAI 2006 Workshop on Recommender Systems, pp. 62–66 (2006)
Leung, C.W.K., Chan, S.C.F., Chung, F.L.: Evaluation of a Rating Inference Approach to Utilizing Textual Reviews for Collaborative Recommendation. Cooperative Internet Computing, World Scientific Publisher (2008)
Liu, B., Hu, M., Cheng, J.: Opinion observer: analyzing and comparing opinions on the web. In: 14th International WWW Conference, pp. 342–351 (2005)
Liu, H.: MontyLingua: an end-to-end natural language processor with common sense. http://web.media.mit.edu/~hugo/montylingua/ (2004). Accessed 9 February 2011
Liu, J., Yao, J., Wu, G.: Sentiment classification using information extraction technique. In: Advances in Intelligent Data Analysis VI, pp. 216–227 (2005)
Manouselis, N., Costopoulou, C.: Analysis and classification of multi-criteria recommender systems. World Wide Web: Internet and Web Information Systems (WWWJ) 10(4), 415–441 (2007)
Miller, G., Beckwith, R., Fellbaum, C., Gross, D., Miller, K.: Introduction to WordNet: an online lexical database. Int. J. Lexicogr. (Special Issue) 3(4), 235–312 (1990)
Mishne, G., Glance, N.: Predicting movie sales from blogger sentiment. In: Spring Symposia on Computational Approaches to Analyzing Weblogs (2006)
Okanohara, D., Tsujii, J.: Assigning polarity scores to reviews using machine learning techniques. In: Second International Joint Conference on Natural Language Processing, pp. 314–325 (2005)
Pang, B., Lee, L.: A sentimental education: sentiment analysis using subjectivity summarization based on minimum cuts. In: 42nd Annual Meeting of the Association for Computational Linguistics, pp. 271–278 (2004)
Pang, B., Lee, L.: Seeing stars: exploiting class relationships for sentiment categorization with respect to rating scales. In: 43rd Annual Meeting of the Association for Computational Linguistics, pp. 115–124 (2005)
Pang, B., Lee, L., Vaithyanathan, S.: Thumbs up? Sentiment classification using machine learning techniques. In: Conference on Empirical Methods in Natural Language Processing, pp. 79–86 (2002)
Popescu, A., Etzioni, O.: Extracting product features and opinions from reviews. In: Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, pp. 339–346 (2005)
Resnick, P., Iacovou, N., Suchak, M., Bergstrom, P., Riedl, J.: GroupLens: an open architecture for collaborative filtering of Netnews. In: ACM 1994 Conference on Computer Supported Cooperative Work, pp. 175–186 (1994)
Ricci, F.: Travel recommender systems. IEEE Intell. Syst. 17(6), 55–57 (2002)
Rifkin, R., Klautau, A.: In defense of one-vs-all classification. J. Mach. Learn. Res. 5, 101–141 (2004)
Riloff, E., Wiebe, J.: Learning extraction patterns for subjective expressions. In: Conference on Empirical Methods in Natural Language Processing, pp. 105–112 (2003)
Smola, A.J., Scholkopf, B.: A tutorial on support vector regression. Technical Report NC2-TR-1998-030, NeuroCOLT2, Royal Holloway College, University of London (1998)
Snyder, B., Barzilay, R.: Multiple aspect ranking using the good grief algorithm. In: Human Language Technologies: The Annual Conference of the North American Chapter of the Association for Computational Linguistics, pp. 300–307 (2007)
Taboada, M., Anthony, C., Voll, K.: Methods for creating semantic orientation dictionaries. In: 5th International Conference on Language Resources and Evaluation (LREC), pp. 427–432 (2006)
Thomas, M., Pang, B., Lee, L.: Get out the vote: determining support or opposition from Congressional floor-debate transcripts. In: Conference on Empirical Methods in Natural Language Processing, pp. 327–335 (2006)
Turney, P.D.: Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews. In: 40th Annual Meeting of the Association for Computational Linguistics, pp. 417–424 (2002)
Turney, P.D., Littman, M.L.: Measuring praise and criticism: inference of semantic orientation from association. ACM Trans. Inf. Sys. 21(4), 315–346 (2003)
Vapnik, V.N.: The Nature of Statistical Learning Theory. Springer (1995)
Wiebe, J., Bruce, R., Bell, M., Martin, M., Wilson, T.: A corpus study of evaluative and speculative language. In: 2nd ACL SIGdial Workshop on Discourse and Dialogue (2001)
Wiebe, J., Bruce, R., O’Hara, T.: Development and use of a gold-standard data set for subjectivity classifications. In: 37th Annual Meeting of the Association for Computational Linguistics, pp. 246–253 (1999)
Wilson, T., Wiebe, J., Hwa, R.: Just how mad are you? Finding strong and weak opinion clauses. In: 19th National Conference on Artificial Intelligence, pp. 761–769 (2004)
Yamanishi, K., Li, H.: Mining open answers in questionnaire data. IEEE Intell. Syst. 17(5), 58–63 (2002)
Yi, J., Nasukawa, T., Bunescu, R., Niblack, W.: Sentiment Analyzer: extracting sentiments about a given topic using natural language processing techniques. In: 3rd IEEE International Conference on Data Mining (ICDM), pp. 427–434 (2003)

Author information

Authors and Affiliations

School of Information Systems, Singapore Management University, 80 Stamford Road, Singapore, 178902, Singapore
Cane Wing-ki Leung
Department of Computing, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong
Stephen Chi-fai Chan, Fu-lai Chung & Grace Ngai

Corresponding author

Correspondence to Cane Wing-ki Leung.

Additional information

This work is an extension of the preliminary work reported in the papers “Integrating collaborative filtering and sentiment analysis: A rating inference approach” [21] and “Evaluation of a Rating Inference Approach to Utilizing Textual Reviews for Collaborative Recommendation” [22].

This work was done when Cane Wing-ki Leung was with the Department of Computing, The Hong Kong Polytechnic University.

About this article

Cite this article

Leung, C.Wk., Chan, S.Cf., Chung, Fl. et al. A probabilistic rating inference framework for mining user preferences from reviews. World Wide Web 14, 187–215 (2011). https://doi.org/10.1007/s11280-011-0117-5

Received: 03 March 2008
Revised: 13 April 2009
Accepted: 27 January 2011
Published: 17 February 2011
Issue Date: March 2011
DOI: https://doi.org/10.1007/s11280-011-0117-5
