Abstract
The main task of sentiment classification is to automatically judge the sentiment polarity (positive or negative) of published text such as news or reviews. Prior research has shown that supervised methods can achieve good performance on blogs and reviews. However, the polarity of a news report is harder to judge. Web news reports differ from other Web documents: they contain fewer sentiment features. In addition, the same word can carry different polarity in different domains. We therefore propose a self-growth algorithm that generates a cross-domain sentiment word list, which is then used for sentiment classification of Web news. This paper considers several previously undescribed features for automatically classifying Web news, examines their effectiveness both in isolation and when aggregated using classification algorithms, and validates the self-growth algorithm for building the cross-domain word list.
Supported by NSFC under Grant No. 61073081.
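The abstract names a self-growth algorithm for building a sentiment word list but does not describe its steps. The sketch below is therefore only an illustration of the general idea of growing a polarity lexicon from seed words, not the authors' method. The function name `grow_lexicon`, the sentence-level co-occurrence rule, the `min_count` threshold, and the toy sentences are all assumptions made for this example.

```python
from collections import Counter


def grow_lexicon(sentences, seeds, min_count=2, max_rounds=10):
    """Grow a polarity lexicon from seed words (hypothetical sketch).

    sentences: list of token lists.
    seeds: dict mapping a word to +1 (positive) or -1 (negative).
    A new word is added when it co-occurs at least `min_count` times
    with sentences of one polarity and never with the other.
    """
    lexicon = dict(seeds)
    for _ in range(max_rounds):
        pos, neg = Counter(), Counter()
        for tokens in sentences:
            # Sentence polarity is the sum of the known word polarities.
            polarity = sum(lexicon[t] for t in tokens if t in lexicon)
            if polarity == 0:
                continue  # no known words, or mixed evidence
            target = pos if polarity > 0 else neg
            for t in set(tokens):
                if t not in lexicon:
                    target[t] += 1
        added = {}
        for word in set(pos) | set(neg):
            if pos[word] >= min_count and neg[word] == 0:
                added[word] = 1
            elif neg[word] >= min_count and pos[word] == 0:
                added[word] = -1
        if not added:
            break  # the list has stopped growing
        lexicon.update(added)
    return lexicon


# Toy data, invented for illustration only.
toy_sentences = [
    ["profits", "surge", "strong"],
    ["strong", "growth", "surge"],
    ["losses", "plunge", "weak"],
    ["weak", "plunge", "demand"],
]
example = grow_lexicon(toy_sentences, {"strong": 1, "weak": -1})
```

On the toy data, `surge` and `plunge` are added because each appears twice alongside words of a single polarity, while words seen only once stay out. A cross-domain variant would presumably run this kind of growth over text from several domains, but how the paper combines domains is not stated in the abstract.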
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yan, L., Zhang, Y. (2012). News Sentiment Analysis Based on Cross-Domain Sentiment Word Lists and Content Classifiers. In: Zhou, S., Zhang, S., Karypis, G. (eds) Advanced Data Mining and Applications. ADMA 2012. Lecture Notes in Computer Science(), vol 7713. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35527-1_48
DOI: https://doi.org/10.1007/978-3-642-35527-1_48
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35526-4
Online ISBN: 978-3-642-35527-1
eBook Packages: Computer Science, Computer Science (R0)