Can predicate-argument structures be used for contextual opinion retrieval from blogs?

Orimaye, Sylvester O.; Alhashmi, Saadat M.; Siew, Eu-Gene

doi:10.1007/s11280-012-0170-8

Published: 31 May 2012

World Wide Web, volume 16, pages 763–791 (2013)

Abstract

We present the results of our investigation into the use of predicate-argument structures for contextual opinion retrieval. Using predicate-argument structures for opinion retrieval is a novel approach that exploits the grammatical derivation of sentences to establish contextual and subjective relevance. Unlike keyword-based opinion retrieval approaches, we do not rely on the frequency of certain keywords; instead, our solution is based on the frequency of contextually relevant and subjective sentences. We use a linear relevance model that leverages semantic similarities among the predicate-argument structures of sentences, and this paper presents the evaluation results of that model. The model linearly combines a popular relevance model, our proposed transformed terms similarity model, and the absolute value of a sentence subjectivity scoring scheme. The predicate-argument structures are derived from the grammatical derivations of natural language query topics and of well-formed sentences from blog documents. The derived structures are then semantically compared to compute an opinion relevance score. Our scoring technique uses the highest frequency of semantically related predicate-argument structures, enriched with the total subjectivity score from sentences. Experimental results show that predicate-argument structures can indeed be used for contextual opinion retrieval: they improve the performance of the opinion retrieval task by 15% over the popular TREC baselines.
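
The linear combination described in the abstract can be illustrated with a minimal Python sketch. It only shows the general shape of combining three sentence-level signals: a relevance score, a predicate-argument similarity score, and the absolute value of a subjectivity score. The names `SentenceScores`, `opinion_relevance`, and `rank_sentences`, the weights, and the assumption that all three scores are precomputed are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class SentenceScores:
    """Precomputed signals for one sentence (all values are assumed inputs)."""
    relevance: float     # score from some standard relevance model
    similarity: float    # similarity between query and sentence predicate-argument structures
    subjectivity: float  # signed subjectivity score; only its magnitude is used


def opinion_relevance(s, w_rel=0.5, w_sim=0.25, w_subj=0.25):
    """Linear combination of the three signals.

    The weights are hypothetical placeholders, not values from the paper.
    The subjectivity term uses the absolute value, so strongly positive and
    strongly negative sentences contribute equally.
    """
    return w_rel * s.relevance + w_sim * s.similarity + w_subj * abs(s.subjectivity)


def rank_sentences(scored):
    """Sort (sentence_id, SentenceScores) pairs by combined score, highest first."""
    return sorted(scored, key=lambda item: opinion_relevance(item[1]), reverse=True)


if __name__ == "__main__":
    candidates = [
        ("neutral", SentenceScores(relevance=0.2, similarity=0.2, subjectivity=0.0)),
        ("opinionated", SentenceScores(relevance=0.2, similarity=0.2, subjectivity=-0.8)),
    ]
    for sentence_id, scores in rank_sentences(candidates):
        print(sentence_id, opinion_relevance(scores))
```

In this sketch two sentences with identical relevance and similarity are separated only by the magnitude of their subjectivity, which is the effect the absolute-value term is meant to have. How the paper aggregates sentence scores into a document score is not modelled here.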

Author information

Authors and Affiliations

Faculty of Information Technology, Monash University, Sunway Campus, Bandar Sunway, Malaysia
Sylvester O. Orimaye, Saadat M. Alhashmi & Eu-Gene Siew

Corresponding author

Correspondence to Sylvester O. Orimaye.

About this article

Cite this article

Orimaye, S.O., Alhashmi, S.M. & Siew, E.-G. Can predicate-argument structures be used for contextual opinion retrieval from blogs? World Wide Web 16, 763–791 (2013). https://doi.org/10.1007/s11280-012-0170-8

Received: 04 July 2011
Revised: 10 May 2012
Accepted: 16 May 2012
Published: 31 May 2012
Issue Date: November 2013
DOI: https://doi.org/10.1007/s11280-012-0170-8
