Aspect-Based Restaurant Information Extraction for the Recommendation System

  • Conference paper
Human Language Technology. Challenges for Computer Science and Linguistics (LTC 2013)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9561)

Abstract

In this paper we consider the information extraction task for a restaurant recommendation system. We develop an information extraction system that gathers restaurant aspects from users’ reviews and passes them to the recommendation module. Because many restaurant aspects are subjective, the task can also be viewed as sentiment analysis, or opinion mining. We therefore present an aspect-based approach to sentiment analysis of restaurant reviews for e-tourism recommender systems. The analyzed frames include service quality, food quality, cuisine, price level and noise level. In this paper we focus on service quality, cuisine type and food quality. As part of the preprocessing phase, we propose a method for analyzing a corpus of Russian reviews. Its importance is demonstrated in the experimental phase, where machine learning techniques are applied to aspect extraction: the information obtained during corpus analysis improves system performance. We conduct experiments with several feature sets and classifiers and show that resources learnt from the corpus improve the models. Naïve Bayes appears to be the best choice for sentiment classification, while Logistic Regression and SVM are best at deciding whether a review is relevant to a particular aspect.
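The two decisions named in the abstract (is a review relevant to an aspect, and if so, what is its sentiment) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the training sentences are made-up English stand-ins for lemmatized Russian reviews, the `NaiveBayes` class and the `service_opinion` helper are hypothetical, and Naive Bayes is used for both stages only to keep the example dependency-free, whereas the abstract reports Logistic Regression and SVM as the better choice for the relevance stage.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes over bag-of-words tokens with add-one smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # class prior
            for w in tokens:
                if w in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[w] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, made-up data for the "service quality" aspect (not from the paper).
relevance_data = [
    ("the waiter was rude and slow".split(), True),
    ("friendly staff and quick service".split(), True),
    ("the soup was cold".split(), False),
    ("nice view from the terrace".split(), False),
]
sentiment_data = [
    ("the waiter was rude and slow".split(), "negative"),
    ("friendly staff and quick service".split(), "positive"),
    ("polite waiter and attentive staff".split(), "positive"),
    ("slow service and rude staff".split(), "negative"),
]

relevance_clf = NaiveBayes().fit(*zip(*relevance_data))
sentiment_clf = NaiveBayes().fit(*zip(*sentiment_data))


def service_opinion(review):
    """Return the service sentiment, or None if the review is judged irrelevant."""
    tokens = review.lower().split()
    if not relevance_clf.predict(tokens):
        return None
    return sentiment_clf.predict(tokens)
```

Calling `service_opinion("rude waiter")` first passes the relevance stage and then reaches the sentiment stage, while a review that only mentions food is filtered out before any sentiment is assigned. Keeping the stages separate lets each one use the classifier that suits it best.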

Notes

  1. http://pymorphy2.readthedocs.org/

  2. We should also note that after the first iteration top trigger words included “цена” /price/, “место” /place/, “атмосфера” /atmosphere/, “блюдо” /dish/, “еда” /food/, “интерьер” /interior/ and “ресторан” /restaurant/. It means that service and food quality, price and noise level and general impression of a restaurant are described with roughly the same adjectives, and therefore the same IE scheme can probably be applied to these restaurant aspects.

  3. For example, “кухня” /cuisine/ is referred to as “восточная” /eastern/ (which describes cuisine type) almost as frequently as “хорошая” /good/ and “вкусная” /tasty/ (which describes food quality) in the reviews corpus.

  4. Phrases like “В этом ресторане обслуживание …” /In this restaurant the service is…/ or “Обслуживание ресторана…” /The service of the restaurant is…/ are quite common in the Russian language when restaurant reviews are considered.

  5. http://scikit-learn.org
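
Note 2 mentions trigger words found "after the first iteration" and observes that different aspects share roughly the same adjectives. The notes do not spell out the procedure, so the following is only a hypothetical sketch of what one such bootstrapping iteration might look like, assuming that lemmatized (adjective, noun) pairs have already been extracted from the corpus. The pair list, the `bootstrap_step` function and its `top_n` parameter are all illustrative assumptions, and the words are English stand-ins.

```python
from collections import Counter

# Hypothetical (adjective, noun) pairs, already lemmatized; made-up data.
pairs = [
    ("good", "service"), ("slow", "service"), ("polite", "service"),
    ("good", "food"), ("tasty", "food"),
    ("good", "atmosphere"), ("cozy", "atmosphere"),
    ("good", "price"), ("slow", "kitchen"),
    ("red", "wall"),
]


def bootstrap_step(pairs, seed_nouns, top_n=3):
    """One iteration: seed nouns -> their adjectives -> new candidate nouns."""
    # Adjectives that modify any seed noun.
    adjectives = {adj for adj, noun in pairs if noun in seed_nouns}
    # Count how often each non-seed noun is modified by those adjectives.
    candidates = Counter(
        noun for adj, noun in pairs
        if adj in adjectives and noun not in seed_nouns
    )
    new_nouns = [noun for noun, _ in candidates.most_common(top_n)]
    return adjectives, new_nouns


adjectives, new_nouns = bootstrap_step(pairs, {"service"}, top_n=10)
```

Starting from the single seed "service", the step collects its adjectives and then surfaces other nouns described by the same adjectives, which mirrors the observation in note 2 that nouns such as /price/, /atmosphere/ and /food/ appear as trigger words because they share adjectives with the seed aspect.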

Author information

Corresponding author

Correspondence to Ekaterina Pronoza.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Pronoza, E., Yagunova, E., Volskaya, S. (2016). Aspect-Based Restaurant Information Extraction for the Recommendation System. In: Vetulani, Z., Uszkoreit, H., Kubis, M. (eds) Human Language Technology. Challenges for Computer Science and Linguistics. LTC 2013. Lecture Notes in Computer Science, vol 9561. Springer, Cham. https://doi.org/10.1007/978-3-319-43808-5_28

  • DOI: https://doi.org/10.1007/978-3-319-43808-5_28

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-43807-8

  • Online ISBN: 978-3-319-43808-5

  • eBook Packages: Computer Science (R0)
