Encoding Syntactic Knowledge in Neural Networks for Sentiment Classification

Published: 05 June 2017

Abstract

Phrase/sentence representation is one of the most important problems in natural language processing. Many neural network models, such as the Convolutional Neural Network (CNN), Recursive Neural Network (RNN), and Long Short-Term Memory (LSTM), have been proposed to learn phrase/sentence representations; however, rich syntactic knowledge has not been fully explored when composing a longer text from its shorter constituent words. In most traditional models, only word embeddings are used to compose phrase/sentence representations, while the syntactic information of words remains unexploited. In this article, we show that encoding syntactic knowledge (part-of-speech tags) in neural networks can enhance phrase/sentence representation. Specifically, we propose to learn tag-specific composition functions and tag embeddings in recursive neural networks, and to use POS tags to control the gates of tree-structured LSTM networks. We evaluate these models on two benchmark datasets for sentiment classification and demonstrate that improvements can be obtained when such syntactic knowledge is encoded.
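
To make these two ideas concrete, below is a minimal, illustrative sketch in PyTorch. It is not the authors' implementation: the class and parameter names, the choice of framework, the decision to concatenate POS-tag embeddings with the child vectors, and the particular way tag embeddings enter the Tree-LSTM gates are all assumptions made here for illustration.

import torch
import torch.nn as nn

class TagEmbeddedComposition(nn.Module):
    """Recursive composition that concatenates a POS-tag embedding with each
    child vector before applying the composition function."""
    def __init__(self, dim, tag_vocab, tag_dim):
        super().__init__()
        self.tag_emb = nn.Embedding(tag_vocab, tag_dim)
        # A single shared composition; a tag-specific variant would instead
        # select a separate linear map based on the children's tags.
        self.compose = nn.Linear(2 * (dim + tag_dim), dim)

    def forward(self, left_vec, left_tag, right_vec, right_tag):
        left = torch.cat([left_vec, self.tag_emb(left_tag)], dim=-1)
        right = torch.cat([right_vec, self.tag_emb(right_tag)], dim=-1)
        return torch.tanh(self.compose(torch.cat([left, right], dim=-1)))

class TagGatedTreeLSTMCell(nn.Module):
    """Binary Tree-LSTM node whose gates are additionally conditioned on the
    children's POS-tag embeddings (one reading of 'POS tags control the gates')."""
    def __init__(self, dim, tag_vocab, tag_dim):
        super().__init__()
        self.tag_emb = nn.Embedding(tag_vocab, tag_dim)
        in_dim = 2 * dim + 2 * tag_dim      # children's hidden states + tag embeddings
        self.i = nn.Linear(in_dim, dim)     # input gate
        self.o = nn.Linear(in_dim, dim)     # output gate
        self.u = nn.Linear(in_dim, dim)     # candidate cell value
        self.f_l = nn.Linear(in_dim, dim)   # forget gate, left child
        self.f_r = nn.Linear(in_dim, dim)   # forget gate, right child

    def forward(self, h_l, c_l, tag_l, h_r, c_r, tag_r):
        x = torch.cat([h_l, h_r, self.tag_emb(tag_l), self.tag_emb(tag_r)], dim=-1)
        i, o = torch.sigmoid(self.i(x)), torch.sigmoid(self.o(x))
        u = torch.tanh(self.u(x))
        c = i * u + torch.sigmoid(self.f_l(x)) * c_l + torch.sigmoid(self.f_r(x)) * c_r
        return o * torch.tanh(c), c

# Toy usage with random child vectors and arbitrary tag ids (both hypothetical).
comp = TagEmbeddedComposition(dim=8, tag_vocab=45, tag_dim=4)
parent = comp(torch.randn(1, 8), torch.tensor([7]), torch.randn(1, 8), torch.tensor([12]))
cell = TagGatedTreeLSTMCell(dim=8, tag_vocab=45, tag_dim=4)
h, c = cell(torch.randn(1, 8), torch.randn(1, 8), torch.tensor([7]),
            torch.randn(1, 8), torch.randn(1, 8), torch.tensor([12]))

A fully tag-specific variant, as the abstract describes, would replace the shared compose layer with a separate composition matrix selected by the children's tags, and would learn the tag embeddings jointly with the sentiment objective.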

Published In

ACM Transactions on Information Systems, Volume 35, Issue 3
July 2017, 410 pages
ISSN: 1046-8188
EISSN: 1558-2868
DOI: 10.1145/3026478

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 05 June 2017
Accepted: 01 December 2016
Revised: 01 October 2016
Received: 01 June 2016
Published in TOIS Volume 35, Issue 3

Author Tags

  1. Neural networks
  2. deep learning
  3. long short-term memory
  4. recursive neural network
  5. representation learning
  6. sentiment analysis
  7. sentiment classification

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • National Basic Research Program (973 Program)
  • National Science Foundation of China
  • Beijing Higher Education Young Elite Teacher Project
