Novel authorship verification model for social media accounts compromised by a human

Multimedia Tools and Applications

A Correction to this article was published on 16 February 2021

This article has been updated

Abstract

Social media usage is spreading, but it is accompanied by a new form of social engineering attack in which attackers compromise users' accounts to spread malicious messages for different purposes. Countering these attacks requires authorship verification, a classification problem that decides whether or not a text belongs to a given user. Moreover, the verification must be accurate and fast. Herein, an authorship verification model is proposed. The model uses XGBoost as a preprocessor to discover functional features of the text message, which are then ranked using multi-criteria decision-making (MCDM) methods to build a classification model. Twitter messages are used to test the model; however, data from any social media platform might be used. The suggested model was evaluated on a dataset crawled from Twitter comprising 16,124 tweets written under the 280-character limit. The proposed method achieved an F-score over 0.94.
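The ranking step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which MCDM method is used, so TOPSIS is assumed here purely for illustration; the feature names, the three importance criteria, their scores, and the weights are all hypothetical and are not taken from the article.

```python
import math


def topsis_rank(scores, weights):
    """Rank alternatives with TOPSIS, treating every criterion as 'larger is better'.

    scores  -- dict mapping an alternative name to its list of criterion scores
    weights -- list of criterion weights, same length as each score list
    """
    names = list(scores)
    n_criteria = len(weights)

    # Vector-normalise each criterion column (guarding against all-zero columns),
    # then apply the criterion weight.
    norms = [
        math.sqrt(sum(scores[n][j] ** 2 for n in names)) or 1.0
        for j in range(n_criteria)
    ]
    weighted = {
        n: [weights[j] * scores[n][j] / norms[j] for j in range(n_criteria)]
        for n in names
    }

    # Ideal and anti-ideal points across all alternatives.
    best = [max(weighted[n][j] for n in names) for j in range(n_criteria)]
    worst = [min(weighted[n][j] for n in names) for j in range(n_criteria)]

    # Relative closeness to the ideal point: 1.0 is best, 0.0 is worst.
    closeness = {}
    for n in names:
        d_best = math.dist(weighted[n], best)
        d_worst = math.dist(weighted[n], worst)
        total = d_best + d_worst
        closeness[n] = d_worst / total if total else 0.0

    ranking = sorted(names, key=closeness.get, reverse=True)
    return ranking, closeness


# Hypothetical candidate features, each scored on three assumed importance
# criteria (for example, scores that a tree-boosting preprocessor might report).
features = {
    "char_ngrams": [0.40, 0.35, 0.35],
    "punctuation": [0.30, 0.30, 0.30],
    "hashtag_rate": [0.20, 0.30, 0.25],
    "emoji_use": [0.10, 0.05, 0.10],
}
weights = [0.5, 0.3, 0.2]  # assumed criterion weights, not from the article

ranking, closeness = topsis_rank(features, weights)
for name in ranking:
    print(f"{name}: {closeness[name]:.3f}")
```

In a pipeline of the kind the abstract outlines, the top-ranked features would then be passed to the final classifier; how many features to keep is a design choice the abstract does not specify.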

Author information

Corresponding author

Correspondence to Suleyman Alterkavı.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The original online version of this article was revised: The author name “Suleyman Alterkavı” was incorrectly presented.

About this article

Cite this article

Alterkavı, S., Erbay, H. Novel authorship verification model for social media accounts compromised by a human. Multimed Tools Appl 80, 13575–13591 (2021). https://doi.org/10.1007/s11042-020-10361-2

  • DOI: https://doi.org/10.1007/s11042-020-10361-2
