Elsevier

Neurocomputing

Volume 308, 25 September 2018, Pages 1-7

Context-augmented convolutional neural networks for twitter sarcasm detection

https://doi.org/10.1016/j.neucom.2018.03.047

Abstract

Sarcasm detection on Twitter has received increasing research attention in recent years. However, existing work has two limitations. First, it mainly uses discrete models, which require a large number of manual features that can be expensive to obtain. Second, most existing work focuses on feature engineering based on the tweet itself and does not use contextual information about the target tweet, even though such information (e.g., a conversation or the tweet history of the target tweet's author) may be available. To address these two issues, we explore neural network models for Twitter sarcasm detection. Based on a convolutional neural network, we propose two context-augmented neural models for this task. Results on the dataset show that neural models achieve better performance than state-of-the-art discrete models. Moreover, the proposed context-augmented neural models can effectively decode sarcastic clues from contextual information and yield a relative improvement in detection performance.

Introduction

With the development of social media, Twitter has become one of the most popular microblogging services. Twitter contains large amounts of valuable information, so sentiment analysis and data mining on Twitter data have become a hot research topic [1], [2], [3]. Recently, sentiment analysis on Twitter has received extensive attention [4], [5], [6]. The purpose of Twitter sentiment analysis is to automatically determine the polarity of a tweet. However, a sarcastic utterance can flip the polarity of a positive or negative utterance into its opposite, which to some extent degrades the performance of sentiment analysis. It is therefore important to distinguish sarcastic statements from utterances with genuinely positive or negative polarity.

Twitter sarcasm detection has attracted increasing research attention in the past few years [7], [8], [9], [10], [11]. Generally, Twitter sarcasm detection is treated as a classification problem. Previous work mainly focuses on designing effective features based on the tweet content itself to improve detection performance. For example, Barbieri et al. use rich linguistically motivated features from the target tweet for Twitter sarcasm detection [12]. Typical features include lexical features (n-gram information), punctuation marks, quotes, emoticons, pronunciations, tweet sentiment information, and word sentiment information. Moreover, some work designs more sophisticated features using external resources, including POS (part-of-speech) tags, Brown clusters, and dependency-based tree structures [13].
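As a rough illustration of the kind of discrete, hand-crafted features listed above, the following Python sketch maps a tweet to sparse count features. The function name, the feature names, and the tokenization rule are assumptions made for this example; they are not taken from the cited systems.

```python
import re
from collections import Counter


def discrete_features(tweet: str, max_n: int = 2) -> Counter:
    """Map a tweet to sparse count features (hypothetical feature names)."""
    # Simple lowercase tokenizer; real systems typically use a tweet-aware one.
    tokens = re.findall(r"[a-z0-9_@#']+", tweet.lower())
    feats = Counter()

    # Lexical features: all n-grams from unigrams up to max_n-grams.
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            feats["ngram=" + " ".join(tokens[i:i + n])] += 1

    # Punctuation-mark and quote features, counted on the raw text.
    feats["punct=!"] = tweet.count("!")
    feats["punct=?"] = tweet.count("?")
    feats["quotes"] = tweet.count('"')
    return feats


feats = discrete_features(
    "this seems like a lie, do you no longer associate with yourself?"
)
```

Every distinct n-gram becomes its own feature dimension, so most dimensions are zero for any single tweet. This is the sparsity that discrete models have to cope with.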

The above work has two limitations. First, it relies on discrete models, which require a large number of manual features that can be expensive to obtain. Second, it focuses on designing rich features based on the target tweet itself and does not use contextual information about the target tweet or its author, which limits performance. Yet tweets often come with contextual information, such as a conversation or the author's tweet history. For example, consider the following tweet posted by Erik_in_Raleigh, which mentions syydsand and gretchlol:

  • Erik_in_Raleigh: @syydsand @gretchlol this seems like a lie, do you no longer associate with yourself?

Is the tweet sarcastic? It is very hard to tell without contextual information. However, suppose the following context is given:

  • syydsand: I literally will not associate myself with anyone who lies. worst quality ever. (2015-01-26 03:32:57)

  • Erik_in_Raleigh: @syydsand @gretchlol this seems like a lie, do you no longer associate with yourself? (2015-01-26 03:53:44)

Here, the target tweet and its contextual tweet form a conversation. Based on this conversation, we can easily infer that the target tweet is sarcastic. This example shows that Twitter sarcasm detection can benefit from contextual information.

Recently, some work has begun to use contextual features for sarcasm detection [13], [14]. In particular, Rajadesingan et al. model sarcasm detection with a behavioral approach, using a set of statistical indicators extracted from the target tweet and the author's history tweets [14]. Their work uses only one type of contextual information: the history-based contexts. In contrast, Wang et al. model the target tweet and its contextual tweets as a sequence and use a sequence labeling algorithm to jointly predict their category labels [13]. Their work uses two types of contextual information: the conversation-based contexts and the history-based contexts. Consistent with the above example, these studies suggest that contextual information is useful for Twitter sarcasm detection. However, these methods mainly rely on discrete models with sparse manual features, which can be expensive to obtain. Unlike these context-based models, we explore context-augmented neural network models for Twitter sarcasm detection.

More recently, neural network models have been successfully applied to many NLP (natural language processing) tasks, achieving competitive results [15], [16], [17], [18]. Their strong performance on these tasks shows the potential of neural network models for sarcasm detection. Compared with traditional discrete models, neural network models have two main advantages for sarcasm detection. First, neural layers can automatically induce features, avoiding manual feature engineering [15], [16], [18], [19]. Second, neural models use real-valued word vectors, which can be trained from large-scale raw text, alleviating the feature sparsity problem of discrete models to some extent.
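To make the idea of automatically induced features more concrete, here is a minimal NumPy sketch of a convolution over a tweet's word vectors followed by max-over-time pooling. It is a generic illustration under stated assumptions, not the architecture used in this paper: the function name `conv_max_pool`, the tanh nonlinearity, the zero-padding rule, and all shapes are choices made for the example.

```python
import numpy as np


def conv_max_pool(word_vectors, filters, bias):
    """Convolve filters over word windows, then max-pool over time.

    word_vectors: (T, d) array, one d-dimensional vector per word.
    filters:      (m, h, d) array, m filters spanning h consecutive words.
    bias:         (m,) array.
    Returns an (m,) feature vector, one value per filter.
    """
    T, d = word_vectors.shape
    m, h, _ = filters.shape

    # Assumption: zero-pad tweets shorter than the filter width.
    if T < h:
        word_vectors = np.vstack([word_vectors, np.zeros((h - T, d))])
        T = h

    # Stack all windows of h consecutive words: shape (T - h + 1, h, d).
    windows = np.stack([word_vectors[i:i + h] for i in range(T - h + 1)])

    # Each filter scores every window: shape (T - h + 1, m).
    feature_map = np.tanh(np.einsum("whd,mhd->wm", windows, filters) + bias)

    # Max-over-time pooling keeps the strongest response of each filter.
    return feature_map.max(axis=0)


rng = np.random.default_rng(0)
tweet_vectors = rng.normal(size=(6, 4))   # 6 words, 4-dimensional vectors
filters = rng.normal(size=(3, 2, 4))      # 3 filters over word bigrams
features = conv_max_pool(tweet_vectors, filters, np.zeros(3))
```

The output has a fixed size regardless of tweet length, and the filter weights would be learned from data rather than designed by hand.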

In this paper, we explore context-augmented convolutional neural network models for Twitter sarcasm detection, guided by three questions. First, neural representations give strong performance on many NLP tasks [2], [20], [21], but it is not clear whether neural models can achieve better performance for Twitter sarcasm detection. Second, we want to know whether context-augmented neural models can capture more sarcastic clues from contextual tweets than discrete context-based models. Third, we want to explore the effects of different types of contextual information on Twitter sarcasm detection. Presumably, the conversation-based contexts and the history-based contexts differ in how well they capture sarcastic evidence.

Results show that neural models outperform the state-of-the-art discrete model, demonstrating the advantage of automatically capturing sarcastic clues. Moreover, the proposed context-augmented neural models further improve detection performance, showing the usefulness of contextual information for Twitter sarcasm detection.
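One simple way to picture context augmentation is to concatenate a vector for the target tweet with a vector for its context and classify the joint vector. The Python sketch below assumes both inputs have already been encoded into fixed-size vectors and uses a single logistic output layer. The names `sarcasm_probability`, `target_vec`, `context_vec`, `weights`, and `bias` are hypothetical, and this is not claimed to be the fusion scheme of the proposed models.

```python
import numpy as np


def sarcasm_probability(target_vec, context_vec, weights, bias):
    """Score a tweet from its own vector plus a context vector.

    target_vec:  (p,) vector for the target tweet (assumed precomputed).
    context_vec: (q,) vector for the contextual tweets (assumed precomputed).
    weights:     (p + q,) output-layer weights; bias: scalar.
    """
    # Concatenate so the classifier sees both sources of evidence.
    joint = np.concatenate([target_vec, context_vec])
    logit = float(joint @ weights + bias)
    # Logistic function maps the score to a probability in (0, 1).
    return 1.0 / (1.0 + np.exp(-logit))


# A tweet with no available context can use an all-zero context vector,
# in which case only the target-tweet weights influence the score.
p = sarcasm_probability(
    np.ones(2), np.zeros(2), np.array([1.0, 1.0, 5.0, 5.0]), 0.0
)
```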

Section snippets

Related work

In this section, we introduce related work from two perspectives: sarcasm detection and neural network models.

Approach

This paper explores neural network models for Twitter sarcasm detection and examines the usefulness of contextual information from two aspects. Based on two types of contextual information, we propose two context-augmented neural network models to fully capture sarcastic cues from contextual information. First, we think that sarcastic evidence can be easily captured by using some key information of the history-based contexts. Intuitively, the number of the history-based

Dataset

In this paper, we use the dataset constructed by Wang et al. [13]. Statistics of the dataset are shown in Table 1. According to Table 1, the basic dataset consists of 1500 tweets and contains all target tweets. As for contextual tweets, Table 1 shows that the history-based contexts contain 6774 tweets, whereas the conversation-based contexts contain only 453 tweets. This tells us that the number of the conversation-based contexts is far less than the number of the

Conclusion and Future Work

We proposed context-augmented neural network models for Twitter sarcasm detection. Compared with previous work, our neural model incorporates features of the tweet content itself and contextual features into a single model in the form of word vectors. Experimental results showed that the proposed context-augmented neural model performs better than the state-of-the-art discrete model and context-based model, demonstrating the effectiveness of context-augmented neural

Acknowledgments

This work is supported by the State Key Program of National Natural Science Foundation of China (Grant No. 61133012), the National Natural Science Foundation of China (Grant Nos. 61702121, 61373108) and the National Philosophy Social Science Major Bidding Project of China (Grant No. 11&ZD189).

Yafeng Ren received his Ph.D. degree from the computer school of Wuhan University, China, in 2015. He was a postdoctoral research fellow with Singapore University of Technology and Design from 2015 to 2016. He is currently an associate professor with Guangdong University of Foreign Studies. His research interests include natural language processing, machine learning and data mining. He has published over 10 papers in related conferences and journals, including AAAI, EMNLP and COLING.

References (38)

  • X. Fu et al.

    Combine hownet lexicon to train phrase recursive autoencoder for sentence-level sentiment analysis

    Neurocomputing

    (2017)
  • Y. Ren et al.

    A topic-enhanced word embedding for twitter sentiment classification

    Information Sciences

    (2016)
  • Y. Ren et al.

    Neural networks for deceptive opinion spam detection: an empirical study

    Information Sciences

    (2017)
  • B. Liu

    Sentiment analysis and opinion mining

    Synthesis Lectures on Human Language Technologies

    (2012)
  • B. Pang et al.

    Opinion mining and sentiment analysis

    Foundations and trends in information retrieval

    (2008)
  • N.F.F.D. Silva et al.

    Using unsupervised information to improve semi-supervised tweet sentiment classification

    Information Sciences

    (2016)
  • Y. Ren et al.

    Improving twitter sentiment classification using topic-enriched multi-prototype word embeddings

    Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence

    (2016)
  • D. Davidov et al.

    Semi-supervised recognition of sarcastic sentences in twitter and amazon

    Proceedings of the Fourteenth Conference on Computational Natural Language Learning

    (2010)
  • R. González-Ibáñez et al.

    Identifying sarcasm in twitter: a closer look

    Proceedings of the Annual Meeting of the Association for Computational Linguistics

    (2011)
  • A. Reyes et al.

    A multidimensional approach for detecting irony in twitter

    Language Resources and Evaluation

    (2013)
  • E. Riloff et al.

    Sarcasm as contrast between a positive sentiment and negative situation.

    Proceedings of the Conference on Empirical Methods in Natural Language Processing

    (2013)
  • T. Ptáček et al.

    Sarcasm detection on czech and english twitter.

    Proceedings of the 25th International Conference on Computational Linguistics

    (2014)
  • F. Barbieri et al.

    Modelling irony in twitter

    Proceedings of the Conference of the European Chapter of the Association for Computational Linguistics

    (2014)
  • Z. Wang et al.

    Twitter sarcasm detection exploiting a context-based model

    Proceedings of the International Conference on Web Information Systems Engineering

    (2015)
  • A. Rajadesingan et al.

    Sarcasm detection on twitter: A behavioral modeling approach

    Proceedings of the Eighth ACM International Conference on Web Search and Data Mining

    (2015)
  • R. Socher et al.

    Recursive deep models for semantic compositionality over a sentiment treebank

    Proceedings of the Conference on Empirical Methods in Natural Language Processing

    (2013)
  • C.N. dos Santos et al.

    Deep convolutional neural networks for sentiment analysis of short texts

    Proceedings of the 25th International Conference on Computational Linguistics

    (2014)
  • Y. Ren et al.

    Context-sensitive twitter sentiment classification using neural network

    Proceedings of the AAAI Conference on Artificial Intelligence

    (2016)
  • N. Kalchbrenner et al.

    A convolutional neural network for modelling sentences

    arXiv preprint arXiv:1404.2188

    (2014)
Donghong Ji is currently a professor at Wuhan University and Guangdong University of Foreign Studies. He received his Ph.D., M.Sc. and B.Sc. degrees from Wuhan University in 1995, 1992 and 1989, respectively. His main research interests include natural language processing and information retrieval.

Han Ren is currently an associate professor at Guangdong University of Foreign Studies. He received his Ph.D. degree from Wuhan University, China, in 2011. He was a postdoctoral research fellow with Wuhan University from 2012 to 2012. His research interests include natural language processing and machine learning.
