Abstract
Sentiment analysis of online documents such as news articles, blogs and microblogs has received increasing attention in recent years. In this article, we propose an efficient algorithm and three pruning strategies to automatically build a word-level emotional dictionary for social emotion detection. In the dictionary, each word is associated with a distribution over a set of human emotions. In addition, a method based on topic modeling is proposed to construct a topic-level dictionary, in which each topic is correlated with social emotions. Experiments on real-world data sets validate the effectiveness and reliability of the methods. Compared with other lexicons, the dictionary generated using our approach is language-independent, fine-grained, and volume-unlimited. The generated dictionary has a wide range of applications, including predicting the emotional distribution of news articles and identifying social emotions on certain entities and news events.
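To make the idea of a word-level emotional dictionary concrete, the following is a minimal sketch and not the algorithm proposed in the article, whose details and pruning strategies are not given in the abstract. It assumes each training document comes with reader emotion votes, spreads each document's normalized votes over its words, and normalizes per word to obtain a distribution over emotions. The function names `build_word_emotion_dictionary` and `predict_document_emotions`, the input layout, and the averaging rule used for prediction are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict


def build_word_emotion_dictionary(docs):
    """Build a word -> {emotion: probability} mapping.

    docs is a list of (tokens, votes) pairs, where tokens is a list of
    words and votes maps an emotion label to a reader vote count.
    This weighting scheme is an illustrative assumption, not the
    article's method.
    """
    weights = defaultdict(Counter)
    for tokens, votes in docs:
        total_votes = sum(votes.values())
        if total_votes == 0:
            continue  # a document without votes carries no emotion signal
        for word, count in Counter(tokens).items():
            for emotion, vote in votes.items():
                # Each occurrence contributes the document's vote share.
                weights[word][emotion] += count * vote / total_votes

    dictionary = {}
    for word, emotion_weights in weights.items():
        norm = sum(emotion_weights.values())
        dictionary[word] = {e: w / norm for e, w in emotion_weights.items()}
    return dictionary


def predict_document_emotions(tokens, dictionary):
    """Average the emotion distributions of the known words in tokens."""
    scores = Counter()
    known = 0
    for token in tokens:
        dist = dictionary.get(token)
        if dist is None:
            continue  # words missing from the dictionary are ignored
        known += 1
        for emotion, prob in dist.items():
            scores[emotion] += prob
    if known == 0:
        return {}
    return {emotion: score / known for emotion, score in scores.items()}


# Toy data, invented purely for illustration.
toy_docs = [
    (["win", "party"], {"joy": 3, "anger": 1}),
    (["tax", "party"], {"joy": 0, "anger": 4}),
]
toy_dictionary = build_word_emotion_dictionary(toy_docs)
toy_prediction = predict_document_emotions(["win", "unknown"], toy_dictionary)
```

With this toy input, a word that appears in both documents ("party") receives a distribution that mixes the two documents' vote shares, while a word seen in only one document inherits that document's share. A realistic system would also need tokenization suited to the target language, smoothing, and some form of vocabulary pruning.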
Cite this article
Rao, Y., Lei, J., Wenyin, L. et al. Building emotional dictionary for sentiment analysis of online news. World Wide Web 17, 723–742 (2014). https://doi.org/10.1007/s11280-013-0221-9
DOI: https://doi.org/10.1007/s11280-013-0221-9