DOI: 10.1145/3511095.3536367
extended-abstract

Identifying neutral reviews from unlabeled data: An exploratory study on user ratings and word-level polarity scores

Published: 28 June 2022

ABSTRACT

Reviews that contain mixed or contrasting opinions, also known as neutral reviews, are prevalent in user feedback data. By leveraging annotated data, supervised machine learning (ML) classifiers can learn implicit patterns to identify these neutral reviews. However, labeled data are rarely available in most circumstances. When annotated data are unavailable, unsupervised approaches such as lexicon-based methods are employed, which combine word-level polarity scores with a set of rules. As a preliminary study toward developing a sophisticated unsupervised framework for recognizing neutral reviews, we scrutinize the performance of existing lexicon-based methods. When applied to four multi-domain review datasets, all of them perform poorly at identifying neutral reviews. We manually inspect the semantic attributes of a subset of neutral reviews misclassified by these lexicon-based methods. The experimental results and manual analysis reveal that determining neutrality with lexical rule-based methods is often ineffective for several reasons, such as user preferences on certain aspects, coverage of the sentiment lexicon, irregularity in the efficacy of aggregation rules, and the context-sensitive polarity of words. This analysis reveals traits of neutral reviews and limitations of existing approaches, and provides insights for developing methods to identify neutral reviews from unlabeled data.
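
To make the idea of combining word-level polarity scores with a set of rules concrete, here is a minimal sketch of a generic lexicon-based classifier. It is not the method evaluated in the paper or any of the cited tools. The lexicon entries, their scores, the sum-based aggregation rule, the `neutral_band` threshold, and the function name `classify` are all illustrative assumptions.

```python
# Toy lexicon: words and scores are illustrative assumptions,
# not taken from any published sentiment lexicon.
TOY_LEXICON = {
    "great": 2.0, "good": 1.0, "love": 2.0,
    "bad": -1.0, "slow": -1.0, "terrible": -2.0,
}


def classify(review, lexicon=TOY_LEXICON, neutral_band=0.5):
    """Label a review by summing the polarity scores of its known words.

    Hypothetical aggregation rule: a total within +/- neutral_band is
    treated as neutral; otherwise the sign of the total decides.
    """
    # Naive tokenization: split on whitespace, strip common punctuation.
    tokens = [t.strip(".,!?;:").lower() for t in review.split()]
    scores = [lexicon[t] for t in tokens if t in lexicon]
    if not scores:
        # No lexicon coverage at all: fall back to neutral.
        return "neutral"
    total = sum(scores)
    if abs(total) <= neutral_band:
        return "neutral"
    return "positive" if total > 0 else "negative"


if __name__ == "__main__":
    for text in (
        "Great screen, but terrible battery.",  # +2 - 2 = 0
        "Good camera but terrible battery.",    # +1 - 2 = -1
        "I love it, great phone!",              # +2 + 2 = +4
    ):
        print(text, "->", classify(text))
```

Under these assumptions, the first review is labeled neutral only because its two opposing scores happen to cancel, while the second, equally mixed review is labeled negative. This is a small instance of the sensitivity to aggregation rules and lexicon scores that the abstract describes.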

Supplemental Material

Neutrality.mp4 (mp4, 27.1 MB)

Published in

HT '22: Proceedings of the 33rd ACM Conference on Hypertext and Social Media
June 2022
272 pages
ISBN: 9781450392334
DOI: 10.1145/3511095

Copyright © 2022 Owner/Author

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery
New York, NY, United States

Qualifiers

• extended-abstract
• Research
• Refereed limited

Acceptance Rates

Overall acceptance rate: 378 of 1,158 submissions, 33%
