Elsevier

Knowledge-Based Systems

Volume 214, 28 February 2021, 106727

Seq2Emoji: A hybrid sequence generation model for short text emoji prediction

https://doi.org/10.1016/j.knosys.2020.106727

Highlights

  • Seq2Emoji, a hybrid sequence generation model, is proposed for predicting emojis matching short texts.

  • The model can learn the semantics between the text and emoji labels, as well as the correlation between predicted emojis.

  • With the Diverse Beam Search algorithm, the model predicts a more diverse set of emojis.

  • On our collected Weibo dataset, the model outperforms competitive models on the emoji prediction task.

Abstract

As a new form of visual language, emojis are widely used in social media for their vivid imagery and rich meaning. Predicting the emojis most likely to fit a particular short text has become an important and challenging task in both academia and industry. In this paper, we propose a hybrid sequence generation model, Seq2Emoji, to predict multiple emojis based on a short text. Seq2Emoji is an encoder–decoder model in which we consider the correlations between emojis and treat the emoji prediction task as a sequence generation problem. It extracts features through a hierarchical structure and a self-attention mechanism, and decodes them with a composite recurrent neural network before predicting emojis. During prediction, the Diverse Beam Search algorithm is also introduced to increase the diversity of the predicted emojis. Experiments are carried out on our collected Weibo dataset (Chinese), and the results show that the proposed Seq2Emoji model is superior to the competitive models in both accuracy and diversity of emoji prediction.

Introduction

Due to the rapid development of the Internet, social media platforms such as Sina Weibo and Twitter have provided convenient channels for netizens to exchange opinions and express emotions. Users have become accustomed to expressing rich emotions and subjective tendencies through social platforms [1]. As a new form of visual language, emojis play an important role in the emotional expression and visual enhancement of short text messages [2], [3]. Studying the relationship between emojis and text helps significantly in mining user emotions, and has attracted attention from academia. With the development of natural language processing (NLP) technology, in addition to the semantics [4], usage [5] and emotional expression [6], [7] of emoticons, predicting the emojis likely to accompany a piece of text has gradually become one of the most interesting tasks in social media research.

The purpose of the emoji prediction task is to predict the emojis that are most likely to be used, based on a given piece of plain text that excludes emojis. In the example shown in Fig. 1, after removing the emojis we get the plain text (in translation, ‘#Weather is Providence# I was finally accepted. Although it was painful at the time, I am so happy now’), from which our goal is to predict the two emojis of the original post that emotionally match the text.

Most existing emoji prediction models focus on predicting a single emoji. However, as emojis have grown more popular, netizens increasingly combine several emojis in one message, so single-label classification methods cannot be applied directly to mixed emojis. Barbieri et al. [8] proposed the first shared task on multilingual emoji prediction, which focused on emoji prediction for English and Spanish. They also considered multi-modal input for emoji prediction [9], that is, combining text with image information to predict emoticons. In addition, they used emojis and their categories in keyboard conversation as prediction targets to train a multi-task model [10], and further increased the number of emoji labels to 300, which makes the emoji categories more comprehensive. However, these studies all address single-emoji prediction. In fact, a text often contains not just one meaning or one emotion, but multiple expressions of emotion with similar or even opposite meanings. As shown in Fig. 1, using only one of the two emojis does not fully reflect the text message: the first part of the text expresses sadness, but the second part turns to happiness. It is therefore difficult to express all the meanings with only one emoji. In this work, we focus on the multi-emoji prediction task, which is more suitable for practical applications: predicting one or more emojis that are likely to appear in a given piece of plain text that does not itself contain emojis.

Essentially, the multi-emoji prediction task can be treated as a multi-label classification task and an emoji is regarded as a class label. The system needs to obtain the grammatical and semantic information of the text and predict the emojis with similar meanings and emotions based on the obtained information, which is what deep learning technology is good at [11]. Deep learning can learn the semantic expression and response strategy of text from a large number of sentences. Recently, this technology has gradually replaced the traditional methods of manually extracting rules for classification.
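To make the multi-label framing concrete, the following minimal sketch encodes a set of emojis as a 0/1 vector over a label vocabulary and decodes per-label scores by independent thresholding. The vocabulary `EMOJI_VOCAB`, the helper names and the 0.5 threshold are hypothetical choices made for illustration; they are not taken from the paper.

```python
from typing import Iterable, List

# Hypothetical emoji vocabulary; the paper's actual label set is not listed here.
EMOJI_VOCAB: List[str] = ["smile", "cry", "heart", "laugh", "angry"]


def to_multi_hot(emojis: Iterable[str], vocab: List[str] = EMOJI_VOCAB) -> List[int]:
    """Encode a set of emoji labels as a 0/1 vector over the vocabulary."""
    present = set(emojis)
    unknown = present - set(vocab)
    if unknown:
        raise ValueError(f"unknown emoji labels: {sorted(unknown)}")
    return [1 if e in present else 0 for e in vocab]


def from_scores(scores: List[float], threshold: float = 0.5,
                vocab: List[str] = EMOJI_VOCAB) -> List[str]:
    """Decode per-label scores into emojis by thresholding each label independently."""
    return [e for e, s in zip(vocab, scores) if s >= threshold]
```

Because each label is thresholded on its own, this kind of decoding cannot express that two emojis tend to appear together or exclude each other.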

With the development of neural networks, a series of deep learning models based on convolutional neural networks (CNN) and recurrent neural networks (RNN) have gradually become the mainstream machine learning models. Wu et al. [12] regarded the emoji prediction task as a multi-label classification task, proposed a new hierarchical model with an attention mechanism, and predicted the 30 most commonly used emojis on the Twitter platform. However, in the final classification step they simply performed multi-class classification and ignored the correlation that exists between emojis. Lin et al. [13] studied combinations of emojis. They also believed that a text should be accompanied by multiple emojis and adopted a retrieval strategy to predict them, but they did not consider the correlation between labels either. Unlike this previous work, in this paper we take the correlation between emojis into account and combine it with the relationship between text and emojis to improve the accuracy of emoji prediction.

Commonly used multi-label classification methods currently perform poorly at capturing label correlation. Our idea is to treat the label set as a sequence, so that the multi-emoji prediction task can be transformed into a sequence generation task. Based on this consideration, this paper proposes a prediction model, named Seq2Emoji, which regards multi-emoji prediction as sequence generation to better learn the correlation between emojis. Seq2Emoji first processes the input sentence through a hierarchical Bi-LSTM-CNN structure, in which the CNN extracts local information and the Bi-LSTM extracts global information. Second, an attention mechanism is applied to the extracted features to obtain the information most relevant to the emojis. Finally, a combination of RNN models in different directions is used for decoding to compute the emojis that match the sentence.
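Treating a label set as a sequence requires fixing an order for the emojis. The sketch below shows one possible way to build decoder targets; ordering emojis by descending corpus frequency and wrapping them in `<bos>`/`<eos>` markers are assumptions made for illustration, since this excerpt does not state how Seq2Emoji orders its labels.

```python
from collections import Counter
from typing import Dict, List, Sequence, Set

BOS, EOS = "<bos>", "<eos>"  # assumed start/end markers, not from the paper


def build_order(label_sets: Sequence[Set[str]]) -> Dict[str, int]:
    """Rank emojis by descending corpus frequency (ties broken alphabetically)."""
    counts = Counter(e for labels in label_sets for e in labels)
    ranked = sorted(counts, key=lambda e: (-counts[e], e))
    return {e: i for i, e in enumerate(ranked)}


def to_target_sequence(labels: Set[str], order: Dict[str, int]) -> List[str]:
    """Turn an unordered label set into a decoder target: <bos> e1 e2 ... <eos>."""
    return [BOS] + sorted(labels, key=order.__getitem__) + [EOS]


# Toy corpus of emoji label sets (hypothetical data).
corpus = [{"smile", "cry"}, {"smile"}, {"smile", "heart"}, {"cry"}]
order = build_order(corpus)
target = to_target_sequence({"cry", "smile"}, order)
```

With targets in this form, a decoder can predict each emoji conditioned on the emojis already generated, which is how a sequence model gets access to label correlation.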

The contributions of our work are summarized as follows:

  • We regard the multi-emoji prediction task as a multi-label classification task and construct Seq2Emoji, a sequence generation model, for predicting emojis from short texts. Compared with existing models, it is more suitable for the multi-emoji prediction task in practical applications and pays more attention to the correlation between emojis.

  • To address the lack of diversity among predicted emojis, we introduce the Diverse Beam Search algorithm at the prediction stage, which increases diversity while preserving the accuracy of emoji generation.

  • We conducted extensive experiments on our collected Weibo dataset, and the results suggest that our model is more applicable to emoji prediction than the competitive models.
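The idea behind Diverse Beam Search can be sketched as follows: the beams are split into groups, groups are expanded one after another at each time step, and a token already chosen by an earlier group at that step is penalised for later groups. This is a simplified illustration with a count-based (Hamming-style) penalty, not the paper's implementation; `toy_model`, its probabilities and all parameter names are hypothetical.

```python
import math
from collections import Counter
from typing import Callable, Dict, List, Tuple

Beam = Tuple[float, Tuple[str, ...]]  # (cumulative log-prob, emoji sequence)


def diverse_beam_search(
    next_log_probs: Callable[[Tuple[str, ...]], Dict[str, float]],
    num_groups: int = 2,
    beams_per_group: int = 2,
    max_len: int = 3,
    diversity_strength: float = 1.0,
    eos: str = "<eos>",
) -> List[List[Beam]]:
    """Return, for each group, its beams sorted by penalised score."""
    groups: List[List[Beam]] = [[(0.0, ())] for _ in range(num_groups)]
    for t in range(max_len):
        used: Counter = Counter()  # tokens emitted by earlier groups at step t
        for g in range(num_groups):
            candidates = []  # (penalised score, true log-prob, sequence)
            for score, seq in groups[g]:
                if seq and seq[-1] == eos:
                    # Finished hypotheses are carried over unchanged.
                    candidates.append((score, score, seq))
                    continue
                for tok, lp in next_log_probs(seq).items():
                    penalty = diversity_strength * used[tok]
                    candidates.append((score + lp - penalty, score + lp, seq + (tok,)))
            # Select with the penalised score, but store the true log-prob.
            candidates.sort(key=lambda c: c[0], reverse=True)
            groups[g] = [(true, seq) for _, true, seq in candidates[:beams_per_group]]
            for _, seq in groups[g]:
                if len(seq) == t + 1:  # this beam emitted a token at step t
                    used[seq[-1]] += 1
    return groups


def toy_model(prefix: Tuple[str, ...]) -> Dict[str, float]:
    """Hypothetical scorer: fixed preferences, never repeats an emoji (unnormalised)."""
    probs = {"smile": 0.5, "cry": 0.3, "heart": 0.15, "<eos>": 0.05}
    return {tok: math.log(p) for tok, p in probs.items() if tok not in prefix}


diverse = diverse_beam_search(toy_model, num_groups=2, beams_per_group=1,
                              max_len=1, diversity_strength=1.0)
plain = diverse_beam_search(toy_model, num_groups=2, beams_per_group=1,
                            max_len=1, diversity_strength=0.0)
```

In this toy run, the second group is pushed away from the emoji the first group selected when `diversity_strength` is positive, while with a strength of zero both groups collapse onto the same top-scoring emoji.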

The rest of the paper is structured as follows. Section 2 describes the related work. Section 3 defines the problem and then presents the idea and framework of the proposed Seq2Emoji model. Section 4 describes the experimental results and analysis. Finally, Section 5 draws conclusions and introduces our future work.

Section snippets

Related work

In recent years, emojis, a new type of ideographic character, have been widely used by Social Media. Compared with simple text, the emoji is a more vivid expression and an artistic imitation of human facial expression and body language. It not only overcomes the abstraction of text language but also expresses emotions more intuitively. Therefore, the combination of emojis and plain text plays a great role in improving the emotional expression of short text messages. Academia has carried out a

Our model

In this section, we start by formally describing the task of multi-emoji prediction, followed by a detailed introduction of our multi-emoji prediction model, Seq2Emoji. To facilitate the later discussion, the notations we use and their definitions are listed in Table 1.

Experiments

To test the validity of the proposed model, we compare Seq2Emoji with the baseline models. In this section, we first introduce the experimental settings, including the dataset, baseline models and evaluation metrics. Then, we demonstrate the performance of the models on various metrics by comparing their experimental results.

Conclusion

In this work, we propose the Seq2Emoji model, which is an improved model for emoji prediction. It can predict multiple emojis for one piece of text, which is closer to practical applications. Seq2Emoji is an encoder–decoder model, which can generate emojis according to sequence correlation, thus considering the consistency between text and emojis.

To give the proposed model a better ability to learn sentence representation, in the encoder, the model adopts the combined model of LSTM and CNN

CRediT authorship contribution statement

Dunlu Peng: Conceptualization, Formal analysis, Funding acquisition, Writing - review. Huimin Zhao: Writing - original draft, Writing - editing, Data curation, Validation.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

The work is supported by the National Natural Science Foundation of China under Grant No. 61772342. We would like to express our special thanks to the reviewers for their comments and the members in our lab for their valuable discussion on this work.

References (47)

  • Novak P.K. et al. Sentiment of emojis (2015)
  • Barbieri F. et al. SemEval 2018 task 2: Multilingual emoji prediction
  • Barbieri F. et al. Multimodal emoji prediction (2018)
  • F. Barbieri, L.s. Marujo, P. Karuturi, W. Brendel, Multi-task emoji learning,...
  • Alswaidan N. et al. A survey of state-of-the-art approaches for emotion recognition in text. Knowl. Inf. Syst. (2020)
  • Wu C. et al. Tweet emoji prediction using hierarchical model with attention
  • Lin W. et al. Predict emoji combination with retrieval strategy (2019)
  • L. Vidal, G. Ares, S.R. Jaeger, Use of emoticon and emoji in tweets for food-related emotional expression, 49 119–128,...
  • Wijeratne S. et al. A semantics-based measure of emoji similarity (2017)
  • Cambria E. et al. Benchmarking multimodal sentiment analysis (2017)
  • Barbieri F. et al. Are emojis predictable? (2017)
  • Ronzano F. et al. Overview of the EVALITA 2018 Italian emoji prediction (ITAMoji) task
  • Tomihira T. et al. What does your tweet emotion mean?: Neural emoji prediction for sentiment analysis
