Emoji Prediction for Portuguese

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12037)

Abstract

Alongside alternative text-based forms, emojis have become highly common in social media. Given their importance in daily communication, we tackled the problem of emoji prediction in Portuguese social media text. We created a dataset with occurrences of frequent emojis, used as labels, and then compared the performance of traditional machine learning algorithms with that of neural networks when predicting them. Whether considering the five or the ten most popular emojis, an LSTM neural network clearly outperformed Naive Bayes in this prediction task, with F1-scores of 60% and 52%, respectively, against 33% and 23%.
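The dataset construction described above, where frequent emojis serve as labels, could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the emoji set in `TARGET_EMOJIS`, the function names `label_post` and `build_dataset`, and the rule of keeping only posts with exactly one distinct target emoji are all hypothetical choices made for this example.

```python
# Hypothetical label set: the abstract does not list the emojis actually used.
TARGET_EMOJIS = ["😂", "😍", "😭", "😊", "🙏"]


def label_post(text, targets=TARGET_EMOJIS):
    """Turn one post into a (text, label) pair, or None if it is unusable.

    Assumption for this sketch: a post is kept only if it contains exactly
    one distinct target emoji, which then becomes its label.
    """
    present = [emoji for emoji in targets if emoji in text]
    if len(present) != 1:
        return None  # no target emoji, or an ambiguous label
    label = present[0]
    # Remove the label from the input so a classifier cannot simply read it.
    clean = " ".join(text.replace(label, " ").split())
    if not clean:
        return None  # nothing left to predict from
    return clean, label


def build_dataset(posts, targets=TARGET_EMOJIS):
    """Apply label_post to every post and drop the unusable ones."""
    labeled = (label_post(post, targets) for post in posts)
    return [pair for pair in labeled if pair is not None]


if __name__ == "__main__":
    sample = ["que dia lindo 😍", "😂😂 não acredito", "sem emoji aqui"]
    for text, label in build_dataset(sample):
        print(label, text)
```

Stripping the label emoji from the text matters in any setup of this kind: if it stayed in the input, both the Naive Bayes and the LSTM classifier would be evaluated on a trivially leaked target.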

Notes

  1. Real-time usage of emojis on Twitter is available at http://emojitracker.com.

  2. Available at http://www.tweepy.org.

  3. See https://scikit-learn.org/stable/tutorial/text_analytics/working_with_tex_data.html for using scikit-learn with textual data.

  4. We used NLTK’s Portuguese stopword list, https://www.nltk.org.

  5. https://keras.io.

Acknowledgements

This work was developed in the scope of the SOCIALITE Project (PTDC/EEISCR/2072/2014), co-financed by COMPETE 2020, Portugal 2020 – Operational Program for Competitiveness and Internationalization (POCI), European Union’s ERDF (European Regional Development Fund), and the Portuguese Foundation for Science and Technology (FCT).

Author information

Corresponding author

Correspondence to Luis Duarte.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Duarte, L., Macedo, L., Gonçalo Oliveira, H. (2020). Emoji Prediction for Portuguese. In: Quaresma, P., Vieira, R., Aluísio, S., Moniz, H., Batista, F., Gonçalves, T. (eds) Computational Processing of the Portuguese Language. PROPOR 2020. Lecture Notes in Computer Science, vol 12037. Springer, Cham. https://doi.org/10.1007/978-3-030-41505-1_17

  • DOI: https://doi.org/10.1007/978-3-030-41505-1_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-41504-4

  • Online ISBN: 978-3-030-41505-1

  • eBook Packages: Computer Science, Computer Science (R0)
