DOI: 10.1145/1498759.1498805
research-article

Adaptive subjective triggers for opinionated document retrieval

Published: 09 February 2009

Abstract

This paper proposes a novel application of a statistical language model to opinionated document retrieval targeting weblogs (blogs). In particular, we explore the use of the trigger model, originally developed for incorporating distant word dependencies, to model the characteristics of personal opinions that cannot be properly modeled by standard n-grams. Our primary assumption is that a subjective opinion has two constituents. One is the subject of the opinion, that is, the object that the opinion is about, and the other is a subjective expression; the former is regarded as a triggering word and the latter as a triggered word. We automatically identify those subjective trigger patterns from a corpus of product customer reviews and use them to build a language model. Experimental results on the TREC Blog Track test collections show that, when used for reranking initial search results, our proposed model significantly improves opinionated document retrieval by over 20% in MAP. In addition, we report on an experiment on dynamic adaptation of the model to a given query, which is found to be effective for most of the difficult queries categorized under politics and organizations.
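The idea of pairing a triggering word (the subject) with a triggered word (the subjective expression) and then reranking initial results can be illustrated with a small sketch. The abstract does not state how trigger pairs are selected or how scores are combined, so everything below is an assumption made for illustration: pairs are scored with sentence-level pointwise mutual information, a document's opinion score is the length-normalized sum of positive pair scores, and reranking is a linear interpolation with a hypothetical `weight`. The function names `trigger_pair_pmi`, `opinion_score`, and `rerank` are likewise hypothetical and are not taken from the paper.

```python
import math
from collections import Counter
from itertools import product


def trigger_pair_pmi(sentences, subjects, expressions, min_count=1):
    """Score (triggering, triggered) pairs by sentence-level PMI.

    Assumption: PMI stands in for whatever selection criterion the paper uses.
    sentences: list of token lists; subjects/expressions: sets of candidate words.
    """
    n = len(sentences)
    word_df = Counter()   # number of sentences containing each word
    pair_df = Counter()   # number of sentences containing both words of a pair
    for tokens in sentences:
        seen = set(tokens)
        word_df.update(seen)
        pair_df.update(product(seen & subjects, seen & expressions))
    return {
        (a, b): math.log(c * n / (word_df[a] * word_df[b]))
        for (a, b), c in pair_df.items()
        if c >= min_count
    }


def opinion_score(tokens, pair_scores):
    """Length-normalized sum of positive PMI over pairs present in the document."""
    seen = set(tokens)
    total = sum(
        s for (a, b), s in pair_scores.items()
        if s > 0 and a in seen and b in seen
    )
    return total / max(len(tokens), 1)


def rerank(results, pair_scores, weight=0.5):
    """Rerank (doc_id, initial_score, tokens) triples.

    Assumption: a simple linear interpolation of the initial retrieval score
    and the opinion score; `weight` is a hypothetical tuning parameter.
    """
    rescored = [
        (doc_id, (1 - weight) * score + weight * opinion_score(tokens, pair_scores))
        for doc_id, score, tokens in results
    ]
    return sorted(rescored, key=lambda item: item[1], reverse=True)


# Toy review corpus standing in for the product customer reviews.
reviews = [
    ["camera", "is", "great"],
    ["camera", "looks", "great"],
    ["battery", "is", "terrible"],
    ["the", "camera", "arrived"],
]
pairs = trigger_pair_pmi(reviews, {"camera", "battery"}, {"great", "terrible"})

# Toy initial search results: d1 is factual, d2 contains a trigger pair.
results = [
    ("d1", 1.00, ["camera", "specs", "listed"]),
    ("d2", 0.98, ["camera", "is", "great"]),
]
print(rerank(results, pairs))  # d2 moves ahead of d1
```

In this toy run the opinionated document overtakes the factual one because its trigger pair adds a small bonus to a nearly equal initial score. In practice the two scores would need to be put on comparable scales before interpolation.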



Published In

WSDM '09: Proceedings of the Second ACM International Conference on Web Search and Data Mining
February 2009
314 pages
ISBN:9781605583907
DOI:10.1145/1498759
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. opinion retrieval
  2. trigger language models
  3. weblog

Qualifiers

  • Research-article

Conference

WSDM'09
Acceptance Rates

Overall Acceptance Rate 498 of 2,863 submissions, 17%

Cited By

  • (2015) An effective approach to tweets opinion retrieval. World Wide Web 18:3 (545-566). DOI: 10.1007/s11280-013-0268-7. Online publication date: 1-May-2015
  • (2015) Structuring Tweets for improving Twitter search. Journal of the Association for Information Science and Technology 66:12 (2522-2539). DOI: 10.1002/asi.23332. Online publication date: 1-Dec-2015
  • (2012) Find me opinion sources in blogosphere. Proceedings of the fifth ACM international conference on Web search and data mining (583-592). DOI: 10.1145/2124295.2124366. Online publication date: 8-Feb-2012
  • (2012) Opinion mining: reviewed from word to document level. Social Network Analysis and Mining 3:1 (107-125). DOI: 10.1007/s13278-012-0057-9. Online publication date: 25-Mar-2012
  • (2011) Measuring Opinion Relevance in Latent Topic Space. 2011 IEEE Third Int'l Conference on Privacy, Security, Risk and Trust and 2011 IEEE Third Int'l Conference on Social Computing (323-330). DOI: 10.1109/PASSAT/SocialCom.2011.45. Online publication date: Oct-2011
  • (2010) Opinion finding in blogs. Adaptivity, Personalization and Fusion of Heterogeneous Information (148-152). DOI: 10.5555/1937055.1937093. Online publication date: 28-Apr-2010
  • (2010) Blog track research at TREC. ACM SIGIR Forum 44:1 (58-75). DOI: 10.1145/1842890.1842899. Online publication date: 18-Aug-2010
