Emoji Helps! A Multi-modal Siamese Architecture for Tweet User Verification

Suman, Chanchal; Saha, Sriparna; Bhattacharyya, Pushpak; Chaudhari, Rohit Shyamkant

doi:10.1007/s12559-020-09715-7

Published: 02 March 2020

Cognitive Computation, volume 13, pages 261–276 (2021)

Abstract

This paper proposes a new multi-modal authorship verification approach for social media texts. Authorship verification is the task of determining whether an unknown text was written by a given suspect. Use of social media such as Facebook and Twitter is increasing day by day because of digitization. People have grown accustomed to regularly posting or tweeting about their everyday lives, memorable incidents, random thoughts, opinions, and much more, and emojis are widely used in these tweets and posts. A user's writing style can differ from that of others in word choice, sentence structure, punctuation, and emoji use. We apply a multi-modal Siamese-based framework to automatically extract features from the given texts and emojis. The extracted features are then passed to a neural-network-based architecture for binary classification. We created a multi-modal Twitter-based dataset to evaluate the proposed framework. We obtained an average accuracy of 61.56%, with precision, recall, and F-measure values of 78.08%, 61.50%, and 58.32%, respectively.
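The Siamese idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the actual layers, embeddings, or training procedure, so everything below is an assumption made for illustration: the single dense layer per modality, the toy feature dimensions, the absolute-difference merge, and the names `make_encoder`, `encode`, and `siamese_score` are all hypothetical. The sketch only shows the defining property of a Siamese setup, namely that both inputs pass through the same weights before a binary decision is made.

```python
import math
import random


def make_encoder(in_dim, out_dim, rng):
    """Return untrained weights for one dense layer (out_dim rows of in_dim weights)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)] for _ in range(out_dim)]


def encode(weights, features):
    """Apply a dense layer with tanh activation to a feature vector."""
    return [math.tanh(sum(w * x for w, x in zip(row, features))) for row in weights]


def siamese_score(text_enc, emoji_enc, clf, sample_a, sample_b):
    """Return a same-author probability for two (text_features, emoji_features) samples.

    Both samples go through the SAME text and emoji encoders (shared weights).
    The per-modality embeddings are concatenated, the two sides are merged by
    absolute difference (an assumption), and a logistic unit gives the score.
    """
    def embed(sample):
        text_vec, emoji_vec = sample
        return encode(text_enc, text_vec) + encode(emoji_enc, emoji_vec)

    emb_a, emb_b = embed(sample_a), embed(sample_b)
    diff = [abs(a - b) for a, b in zip(emb_a, emb_b)]
    logit = sum(w * d for w, d in zip(clf, diff))
    return 1.0 / (1.0 + math.exp(-logit))


# Toy usage with made-up feature vectors (4 text features, 2 emoji features).
rng = random.Random(0)
text_enc = make_encoder(4, 3, rng)
emoji_enc = make_encoder(2, 3, rng)
clf = [rng.uniform(-1.0, 1.0) for _ in range(6)]

tweet_a = ([0.2, 0.1, 0.7, 0.0], [1.0, 0.0])
tweet_b = ([0.3, 0.0, 0.6, 0.1], [0.0, 1.0])
score = siamese_score(text_enc, emoji_enc, clf, tweet_a, tweet_b)
print(f"same-author score: {score:.3f}")
```

Because the weights here are random and untrained, the printed score carries no meaning. In a real system the encoders and the classifier would be trained jointly on pairs of tweets labeled as same-author or different-author.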

Notes

https://www.seventeen.com/celebrity/a25486/celebs-favorite-emojis-decoded/
https://www.lightworkers.com/celebs-emojis/
http://www.mtv.com.au/style/pictures/celebrities-most-used-emojis
https://www.reddit.com/r/datasets/comments/6fniik/overonemilliontweetscollectedfromus/
https://archive.ics.uci.edu/ml/datasets/Reuter_50_50
https://pan.webis.de/clef19/pan19-web/
The dataset can be obtained for research purposes by emailing the authors.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Indian Institute of Technology Patna, Patna, India
Chanchal Suman, Sriparna Saha, Pushpak Bhattacharyya & Rohit Shyamkant Chaudhari

Corresponding author

Correspondence to Chanchal Suman.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Additional information

Ethical Approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Suman, C., Saha, S., Bhattacharyya, P. et al. Emoji Helps! A Multi-modal Siamese Architecture for Tweet User Verification. Cogn Comput 13, 261–276 (2021). https://doi.org/10.1007/s12559-020-09715-7

Received: 03 October 2019
Accepted: 21 January 2020
Published: 02 March 2020
Issue Date: March 2021
DOI: https://doi.org/10.1007/s12559-020-09715-7
