research-article

An Association-Based Unified Framework for Mining Features and Opinion Words

Published: 31 March 2015
Abstract

Mining features and opinion words is essential for fine-grained opinion analysis of customer reviews. It is observed that semantic dependencies naturally exist between features and opinion words, even among features or opinion words themselves. In this article, we employ a corpus statistics association measure to quantify the pairwise word dependencies and propose a generalized association-based unified framework to identify features, including explicit and implicit features, and opinion words from reviews. We first extract explicit features and opinion words via an association-based bootstrapping method (ABOOT). ABOOT starts with a small list of annotated feature seeds and then iteratively recognizes a large number of domain-specific features and opinion words by discovering the corpus statistics association between each pair of words on a given review domain. Two instances of this ABOOT method are evaluated based on two particular association models, likelihood ratio tests (LRTs) and latent semantic analysis (LSA). Next, we introduce a natural extension to identify implicit features by employing the recognized known semantic correlations between features and opinion words. Experimental results illustrate the benefits of the proposed association-based methods for identifying features and opinion words versus benchmark methods.
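The bootstrapping idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the exact LRT formulation, the co-occurrence window, the candidate filtering, or the threshold. The sketch assumes sentence-level co-occurrence, a Dunning-style log-likelihood ratio over a 2x2 contingency table, pre-identified candidate sets (`feature_cands` and `opinion_cands`, both hypothetical names), and an illustrative cutoff of 3.84 (the chi-square critical value at p = 0.05 with one degree of freedom).

```python
import math
from collections import Counter
from itertools import combinations


def llr(k11, k12, k21, k22):
    """Log-likelihood ratio (G^2) for a 2x2 contingency table.

    k11: sentences containing both words, k12: only the first word,
    k21: only the second word, k22: neither word.
    """
    n = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    observed = ((k11, k12), (k21, k22))
    total = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # empty cells contribute nothing
                expected = rows[i] * cols[j] / n
                total += o * math.log(o / expected)
    return 2.0 * total


def association_scores(sentences):
    """Score every word pair that co-occurs in at least one sentence."""
    n = len(sentences)
    word_df, pair_df = Counter(), Counter()
    for sent in sentences:
        words = sorted(set(sent))
        word_df.update(words)
        pair_df.update(combinations(words, 2))
    scores = {}
    for (a, b), k11 in pair_df.items():
        # The ratio test is two-sided; keep positive associations only.
        if k11 * n <= word_df[a] * word_df[b]:
            continue
        k12 = word_df[a] - k11
        k21 = word_df[b] - k11
        k22 = n - k11 - k12 - k21
        scores[(a, b)] = llr(k11, k12, k21, k22)
    return scores


def bootstrap(sentences, seed_features, feature_cands, opinion_cands,
              threshold=3.84):
    """Grow feature and opinion sets from seeds via pairwise association."""
    scores = association_scores(sentences)
    features, opinions = set(seed_features), set()
    frontier = set(seed_features)
    while frontier:
        found = set()
        for (a, b), score in scores.items():
            if score < threshold:
                continue
            for known, other in ((a, b), (b, a)):
                if known not in frontier:
                    continue
                if other in features or other in opinions:
                    continue
                if other in feature_cands:
                    features.add(other)
                    found.add(other)
                elif other in opinion_cands:
                    opinions.add(other)
                    found.add(other)
        frontier = found  # stop when an iteration finds nothing new
    return features, opinions


# Toy corpus (invented for illustration): each sentence is a token list.
corpus = (
    [["screen", "bright"]] * 3
    + [["display", "bright"]] * 3
    + [["battery", "durable"]] * 3
    + [["price", "cheap"]] * 3
)
features, opinions = bootstrap(
    corpus,
    seed_features={"screen"},
    feature_cands={"screen", "display", "battery", "price"},
    opinion_cands={"bright", "durable", "cheap"},
)
print(sorted(features), sorted(opinions))
```

On this toy corpus the seed `screen` pulls in the opinion word `bright`, which in turn pulls in the feature `display`; `battery`, `durable`, `price`, and `cheap` are never reached because they share no association with a known word. A real corpus would need tokenization, part-of-speech filtering, and a tuned threshold, none of which are specified in the abstract.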

    • Published in

      ACM Transactions on Intelligent Systems and Technology, Volume 6, Issue 2
      Special Section on Visual Understanding with RGB-D Sensors
      May 2015
      381 pages
      ISSN: 2157-6904
      EISSN: 2157-6912
      DOI: 10.1145/2753829
      Editor: Huan Liu

      Copyright © 2015 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 31 March 2015
      • Accepted: 1 August 2014
      • Revised: 1 June 2014
      • Received: 1 May 2013
      Qualifiers

      • research-article
      • Research
      • Refereed
