DOI: 10.1145/3487664.3487773
research-article

Doc2Vec-based Approach for Extracting Diverse Evaluation Expressions from Online Review Data

Published: 30 December 2021

Abstract

This paper proposes a method for extracting diverse expressions from online movie review texts for a given keyword query. When people watch a movie that makes them cry, they generally do not say “I cried.” Instead, they use euphemistic language such as “I needed a handkerchief” or “My makeup was running.” To enable information retrieval based on audience reactions, such as “movies that make me cry,” from review texts, a variety of paraphrased expressions must be collected for arbitrary queries. Our proposed method extracts such expressions from review datasets by applying two extensions to Doc2Vec: 1) it changes the granularity of the training sentences to mitigate a lack of context, and 2) it expands the query before the similarity calculation. We conducted a large-scale crowdsourced experiment using 1.29 million actual sentences taken from Yahoo! Movies, Japan. The experimental results showed that changing the training-data granularity and adding query expansion are both effective for accurately collecting more diverse expressions whose meaning is similar to that of the given query.
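The second extension, expanding the query before computing similarity, can be illustrated with a minimal sketch. The abstract does not say how expansion terms are chosen or how vectors are combined, so everything below is an assumption made for illustration: the hand-written three-dimensional toy vectors stand in for embeddings that a trained Doc2Vec model would supply, and the helper names (`expand_query`, `rank_sentences`), the choice of the `k` nearest terms, and the averaging step are all hypothetical. The first extension (training-sentence granularity) is not modeled here.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def expand_query(query_vec, term_vecs, k=2):
    """Hypothetical expansion: average the query with its k nearest terms."""
    ranked = sorted(
        term_vecs, key=lambda t: cosine(query_vec, term_vecs[t]), reverse=True
    )
    chosen = ranked[:k]
    return chosen, mean_vector([query_vec] + [term_vecs[t] for t in chosen])


def rank_sentences(query_vec, sentence_vecs, top_n=3):
    """Return the top_n sentences by cosine similarity to the query vector."""
    scored = [(cosine(query_vec, v), s) for s, v in sentence_vecs.items()]
    return [s for _, s in sorted(scored, reverse=True)[:top_n]]


# Toy vectors (assumed, not learned): a real system would obtain these
# from a Doc2Vec model trained on the review sentences.
TERM_VECS = {
    "tears": [0.9, 0.1, 0.0],
    "weep": [0.8, 0.2, 0.1],
    "explosion": [0.0, 0.1, 0.9],
}
SENTENCE_VECS = {
    "I needed a handkerchief": [0.7, 0.3, 0.0],
    "My makeup was running": [0.6, 0.4, 0.1],
    "The car chase was loud": [0.1, 0.0, 0.9],
}
QUERY_VEC = [1.0, 0.0, 0.0]  # stands in for the query "cry"

expansion_terms, expanded_vec = expand_query(QUERY_VEC, TERM_VECS, k=2)
top_sentences = rank_sentences(expanded_vec, SENTENCE_VECS, top_n=2)
```

With these toy values, the expanded query pulls in the two tear-related terms, and the two euphemistic sentences rank above the unrelated one. Averaging is only one plausible way to combine the query with its expansion terms; the paper itself should be consulted for the actual procedure.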


      Published In

      iiWAS2021: The 23rd International Conference on Information Integration and Web Intelligence
      November 2021
      658 pages

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. doc2vec
      2. euphemism
      3. online review

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Conference

      iiWAS2021
