Abstract
In this paper, we present clear and formal definitions of the ranking factors that should be considered in opinion retrieval, and we propose a new opinion retrieval model that combines these factors simultaneously from a generative modeling perspective. The proposed model formally unifies relevance-based ranking with subjectivity detection at the document level by taking three ranking factors into consideration: topical relevance, subjectivity strength, and opinion-topic relatedness. Topical relevance measures how strongly a document relates to a given topic, and subjectivity strength indicates the likelihood that the document contains subjective information. Opinion-topic relatedness reflects whether the subjective information is expressed with respect to the topic of interest. We also show the generality of our model by deriving variants of it that correspond to existing opinion retrieval approaches. Experimental results on a large-scale blog retrieval test collection demonstrate that the individual ranking factors are each necessary in opinion retrieval and that they cooperate to produce a better document ranking when used together. The retrieval performance of the proposed model is comparable to that of previous systems in the literature.
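As a rough illustration of how three such factors might be combined into one ranking score, the sketch below multiplies per-document factor probabilities in log space. This is not the formulation derived in the article: the class `FactorScores`, the functions `opinion_score` and `rank`, the weights, and all numeric values are hypothetical and chosen only to show the idea.

```python
import math
from dataclasses import dataclass


@dataclass
class FactorScores:
    """Hypothetical per-document factor probabilities, each in (0, 1]."""
    doc_id: str
    topical_relevance: float
    subjectivity_strength: float
    opinion_topic_relatedness: float


def opinion_score(f, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of log-probabilities, i.e. a product of factors in log space.

    The weights are an illustrative assumption, not part of the source model.
    """
    w_rel, w_subj, w_link = weights
    return (
        w_rel * math.log(f.topical_relevance)
        + w_subj * math.log(f.subjectivity_strength)
        + w_link * math.log(f.opinion_topic_relatedness)
    )


def rank(docs, weights=(1.0, 1.0, 1.0)):
    """Return document ids sorted by descending combined score."""
    ordered = sorted(docs, key=lambda d: opinion_score(d, weights), reverse=True)
    return [d.doc_id for d in ordered]


# Made-up factor values for three documents.
docs = [
    FactorScores("on_topic_but_factual", 0.9, 0.1, 0.5),
    FactorScores("opinionated_off_target", 0.8, 0.9, 0.1),
    FactorScores("opinion_about_topic", 0.7, 0.8, 0.8),
]

ranking = rank(docs)
relevance_only = rank(docs, weights=(1.0, 0.0, 0.0))
```

With these made-up values, the document that is both opinionated and opinionated about the topic ranks first under the combined score, whereas a relevance-only weighting prefers the factual document. Setting a weight to zero drops the corresponding factor, which loosely mirrors the idea of recovering simpler approaches as special cases.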
Notes
They are based on their own baseline topical retrieval techniques instead of the five baseline results that TREC offered.
Additional information
This work was supported by the Second Brain Korea 21 Project.
About this article
Cite this article
Lee, SW., Song, YI., Lee, JT. et al. A new generative opinion retrieval model integrating multiple ranking factors. J Intell Inf Syst 38, 487–505 (2012). https://doi.org/10.1007/s10844-011-0164-5
DOI: https://doi.org/10.1007/s10844-011-0164-5