Context-augmented convolutional neural networks for Twitter sarcasm detection
Introduction
With the development of social media, Twitter has become one of the most popular microblogging services. Tweets carry large amounts of valuable information, so sentiment analysis and data mining based on Twitter data have become active research topics [1], [2], [3]. Recently, sentiment analysis on Twitter has received extensive attention [4], [5], [6]. The purpose of Twitter sentiment analysis is to automatically determine the polarity of a tweet. However, a sarcastic utterance can flip the polarity of a positive or negative statement into its opposite, which degrades the performance of sentiment analysis. It is therefore important to distinguish sarcastic statements from utterances with genuinely positive or negative polarity.
Twitter sarcasm detection has attracted increasing research attention in the past few years [7], [8], [9], [10], [11]. Generally, the task is treated as a classification problem. Previous work mainly focuses on designing effective features from the tweet content itself to improve detection performance. For example, Barbieri et al. use rich linguistically motivated features from the target tweet [12]. Typical features include lexical features (n-gram information), punctuation marks, quotes, emoticons, pronunciations, tweet sentiment information, and word sentiment information. Moreover, some work designs more sophisticated features by using external resources, including POS (Part of Speech) tags, Brown clusters, and dependency-based tree structures [13].
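As an illustration of the surface features listed above, the following minimal Python sketch maps a tweet to sparse feature counts. It is not the feature set of [12] or [13]; the function name `extract_features`, the emoticon pattern, and the feature names are assumptions chosen for the example.

```python
import re
from collections import Counter

# Hypothetical emoticon pattern; real systems rely on much larger lexicons.
EMOTICON_RE = re.compile(r"[:;=][-']?[)(DPp]")


def extract_features(tweet: str, n: int = 2) -> Counter:
    """Map a tweet to sparse, hand-designed feature counts."""
    features = Counter()
    tokens = tweet.lower().split()
    # Lexical features: all n-grams from unigrams up to size n.
    for size in range(1, n + 1):
        for i in range(len(tokens) - size + 1):
            features["ngram=" + " ".join(tokens[i:i + size])] += 1
    # Punctuation-mark features.
    features["count_exclaim"] = tweet.count("!")
    features["count_question"] = tweet.count("?")
    # Quote and emoticon features.
    features["count_quotes"] = tweet.count('"')
    features["count_emoticons"] = len(EMOTICON_RE.findall(tweet))
    return features


example = extract_features('Oh great, another "meeting" :) !!')
```

Every distinct n-gram becomes its own dimension, so the feature space grows quickly with the vocabulary while each tweet activates only a handful of dimensions.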
The above work has two limitations. First, it relies on discrete models, which require a large number of manual features that can be expensive to obtain. Second, it designs rich features from the target tweet alone and does not use contextual information about the target tweet or its author, which limits performance. Yet tweets often come with usable context, such as the conversation they belong to or the history tweets of the author. For example, consider the following tweet posted by Erik_in_Raleigh, which mentions syydsand and gretchlol:
- Erik_in_Raleigh: @syydsand @gretchlol this seems like a lie, do you no longer associate with yourself?
Is this tweet sarcastic? It is very hard to tell without contextual information. However, suppose the following context is given:
- syydsand: I literally will not associate myself with anyone who lies. worst quality ever. (2015-01-26 03:32:57)
- Erik_in_Raleigh: @syydsand @gretchlol this seems like a lie, do you no longer associate with yourself? (2015-01-26 03:53:44)
Here, the target tweet and its contextual tweet form a conversation. Based on this conversation, we can easily infer that the target tweet is sarcastic. This example shows that Twitter sarcasm detection can benefit from contextual information.
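To make the notion of a conversation-based context concrete, the sketch below collects earlier tweets written by users mentioned in the target tweet. This heuristic is an assumption made for illustration only; the text does not specify how conversations are reconstructed, and the names `Tweet` and `conversation_context`, as well as the third tweet in the pool, are hypothetical.

```python
import re
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Tweet:
    author: str
    text: str
    posted: datetime


def conversation_context(target: Tweet, pool: list) -> list:
    """Return earlier tweets by users mentioned in the target, oldest first."""
    mentioned = set(re.findall(r"@(\w+)", target.text))
    context = [t for t in pool
               if t.author in mentioned and t.posted < target.posted]
    return sorted(context, key=lambda t: t.posted)


target = Tweet("Erik_in_Raleigh",
               "@syydsand @gretchlol this seems like a lie, "
               "do you no longer associate with yourself?",
               datetime(2015, 1, 26, 3, 53, 44))
pool = [
    Tweet("syydsand",
          "I literally will not associate myself with anyone who lies. "
          "worst quality ever.",
          datetime(2015, 1, 26, 3, 32, 57)),
    # Hypothetical unrelated tweet, included only to show the filtering.
    Tweet("someone_else", "good morning", datetime(2015, 1, 26, 3, 40, 0)),
    target,
]
context = conversation_context(target, pool)
```

On the example above, only the tweet by syydsand is selected, which is exactly the context a reader needs to recognize the sarcasm.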
Recently, some work has begun to use contextual features for sarcasm detection [13], [14]. In particular, Rajadesingan et al. model sarcasm detection with a behavioral approach, using a set of statistical indicators extracted from the target tweet and the history tweets [14]. They use only one type of contextual information: the history-based contexts. In contrast, Wang et al. model the target tweet and its contextual tweets as a sequence and use a sequence labeling algorithm to jointly predict their category labels [13]. They use two types of contextual information: the conversation-based contexts and the history-based contexts. Consistent with the above example, this work suggests that contextual information is useful for Twitter sarcasm detection. However, these methods rely on discrete models with sparse manual features, which can be expensive to obtain. Unlike these context-based models, we explore context-augmented neural network models for Twitter sarcasm detection.
More recently, neural network models have been successfully applied to many NLP (Natural Language Processing) tasks, achieving competitive results [15], [16], [17], [18]. Their strong performance on these tasks shows the potential of neural network models for sarcasm detection. Compared to traditional discrete models, neural network models have two main advantages for sarcasm detection. First, neural layers can automatically induce features, avoiding manual feature engineering [15], [16], [18], [19]. Second, neural models use real-valued word vectors, which can be trained from large-scale raw text and alleviate the feature sparsity problem of discrete models to some extent.
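The idea of inducing features from real-valued word vectors can be sketched with a single convolution layer followed by max-over-time pooling. This is a generic illustration in plain Python, not the architecture used in this paper; the tanh activation, the zero padding, and the toy vectors and filter are assumptions.

```python
import math


def conv_max_pool(word_vectors, filters, biases):
    """Slide each filter over consecutive word vectors and keep its max response.

    word_vectors: list of d-dimensional lists, one per token.
    filters: list of filters, each a list of `window` d-dimensional lists.
    biases: one bias per filter.
    """
    pooled = []
    for filt, bias in zip(filters, biases):
        window, dim = len(filt), len(filt[0])
        # Zero-pad short tweets so that at least one window fits.
        padded = list(word_vectors) + [[0.0] * dim] * max(0, window - len(word_vectors))
        best = -math.inf
        for start in range(len(padded) - window + 1):
            total = bias
            for offset in range(window):
                total += sum(w * x for w, x in zip(filt[offset], padded[start + offset]))
            best = max(best, math.tanh(total))
        pooled.append(best)  # one induced feature per filter
    return pooled


# Toy 2-dimensional word vectors for a three-token tweet (assumed values).
tweet_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
filters = [[[1.0, 0.0], [0.0, 1.0]]]  # a single filter spanning two tokens
features = conv_max_pool(tweet_vectors, filters, [0.0])
```

Each filter yields one dense feature regardless of tweet length, so the feature vector has a fixed size and no n-gram has to be enumerated by hand.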
In this paper, we explore context-augmented convolutional neural network models for Twitter sarcasm detection, guided by three questions. First, neural representations give strong performance on many NLP tasks [2], [20], [21], but it is not clear whether neural models can achieve better performance for Twitter sarcasm detection. Second, we want to know whether context-augmented neural models can capture more sarcastic clues from contextual tweets than discrete context-based models. Third, we want to examine the effects of different types of contextual information, since the conversation-based contexts and the history-based contexts are likely to differ in how well they capture sarcastic evidence.
Results show that neural models outperform the state-of-the-art discrete model, demonstrating the advantage of automatically capturing sarcastic clues. Moreover, the proposed context-augmented neural models further improve detection performance, showing the usefulness of contextual information for Twitter sarcasm detection.
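One simple way to picture a context-augmented model is to encode the target tweet and its contextual tweets separately and concatenate the two representations before a logistic output layer. The sketch below is only an illustration under stated assumptions: mean pooling stands in for the convolutional encoder, and the names `mean_pool` and `predict_sarcasm`, the weights, and the toy vectors are hypothetical rather than taken from the proposed models.

```python
import math


def mean_pool(vectors, dim):
    """Average word vectors; return a zero vector when there is no input."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def predict_sarcasm(target_vecs, context_vecs, weights, bias, dim):
    """Score a tweet from the concatenated target and context representations."""
    # Stand-in encoders (assumption): mean pooling instead of convolution.
    joint = mean_pool(target_vecs, dim) + mean_pool(context_vecs, dim)
    score = bias + sum(w * x for w, x in zip(weights, joint))
    return 1.0 / (1.0 + math.exp(-score))  # logistic output in (0, 1)


weights = [1.0, 1.0, 1.0, 1.0]  # assumed, untrained parameters
target_only = predict_sarcasm([[1.0, 0.0], [1.0, 0.0]], [], weights, 0.0, 2)
with_context = predict_sarcasm([[1.0, 0.0], [1.0, 0.0]], [[0.0, 2.0]], weights, 0.0, 2)
```

With no contextual tweets the context half of the joint vector is all zeros, so the same model degrades gracefully to a target-only classifier.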
Related work
In this section, we introduce related work from two perspectives: sarcasm detection and neural network models.
Approach
This paper explores neural network models for Twitter sarcasm detection, and in particular the usefulness of contextual information from two aspects. Based on two types of contextual information, we propose two context-augmented neural network models to capture sarcastic cues from context. First, we expect that sarcastic evidence can be captured from key information in the history-based contexts. Intuitively, the number of the history-based
Dataset
In this paper, we use the dataset constructed by Wang et al. Statistics of the dataset are shown in Table 1. The basic dataset consists of 1500 tweets, which comprise all target tweets. As for contextual tweets, Table 1 shows that the history-based contexts contain 6774 tweets, whereas the conversation-based contexts contain only 453 tweets. These figures show that the number of conversation-based contexts is far smaller than the number of the
Conclusion and Future Work
We proposed context-augmented neural network models for Twitter sarcasm detection. Compared with previous work, our neural model incorporates features of the tweet content itself and contextual features into a single model in the form of word vectors. Experimental results showed that the proposed context-augmented neural model outperforms both the state-of-the-art discrete model and the context-based model, demonstrating the effectiveness of context-augmented neural
Acknowledgments
This work is supported by the State Key Program of National Natural Science Foundation of China (Grant No. 61133012), the National Natural Science Foundation of China (Grant Nos. 61702121 and 61373108) and the National Philosophy Social Science Major Bidding Project of China (Grant No. 11&ZD189).
Yafeng Ren received his Ph.D. degree from the Computer School of Wuhan University, China, in 2015. He was a postdoctoral research fellow with Singapore University of Technology and Design from 2015 to 2016. He is currently an associate professor with Guangdong University of Foreign Studies. His research interests include natural language processing, machine learning and data mining. He has published over 10 papers in related conferences and journals, including AAAI, EMNLP and COLING.
References (38)
- et al. Combine HowNet lexicon to train phrase recursive autoencoder for sentence-level sentiment analysis. Neurocomputing (2017)
- et al. A topic-enhanced word embedding for Twitter sentiment classification. Information Sciences (2016)
- et al. Neural networks for deceptive opinion spam detection: an empirical study. Information Sciences (2017)
- Sentiment analysis and opinion mining. Synthesis Lectures on Human Language Technologies (2012)
- et al. Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval (2008)
- et al. Using unsupervised information to improve semi-supervised tweet sentiment classification. Information Sciences (2016)
- et al. Improving Twitter sentiment classification using topic-enriched multi-prototype word embeddings. Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (2016)
- et al. Semi-supervised recognition of sarcastic sentences in Twitter and Amazon. Proceedings of the Fourteenth Conference on Computational Natural Language Learning (2010)
- et al. Identifying sarcasm in Twitter: a closer look. Proceedings of the Annual Meeting of the Association for Computational Linguistics (2011)
- et al. A multidimensional approach for detecting irony in Twitter. Language Resources and Evaluation (2013)
- Sarcasm as contrast between a positive sentiment and negative situation. Proceedings of the Conference on Empirical Methods in Natural Language Processing
- Sarcasm detection on Czech and English Twitter. Proceedings of the 25th International Conference on Computational Linguistics
- Modelling irony in Twitter. Proceedings of the Conference of the European Chapter of the Association for Computational Linguistics
- Twitter sarcasm detection exploiting a context-based model. Proceedings of the International Conference on Web Information Systems Engineering
- Sarcasm detection on Twitter: a behavioral modeling approach. Proceedings of the Eighth ACM International Conference on Web Search and Data Mining
- Recursive deep models for semantic compositionality over a sentiment treebank. Proceedings of the Conference on Empirical Methods in Natural Language Processing
- Deep convolutional neural networks for sentiment analysis of short texts. Proceedings of the 25th International Conference on Computational Linguistics
- Context-sensitive Twitter sentiment classification using neural network. Proceedings of the AAAI Conference on Artificial Intelligence
- A convolutional neural network for modelling sentences. arXiv preprint arXiv:1404.2188
Donghong Ji is currently a professor at Wuhan University and Guangdong University of Foreign Studies. He received his Ph.D., M.Sc. and B.Sc. degrees from Wuhan University in 1995, 1992 and 1989, respectively. His main research interests include natural language processing and information retrieval.
Han Ren is currently an associate professor at Guangdong University of Foreign Studies. He received his Ph.D. degree from Wuhan University, China, in 2011. He was a postdoctoral research fellow with Wuhan University from 2012 to 2012. His research interests include natural language processing and machine learning.