An intrinsic authorship verification technique for compromised account detection in social networks

  • Methodologies and Application
Soft Computing

Abstract

The proliferation of social networks has brought a remarkable increase in their popularity, empowering users to create, share and exchange content for interaction and communication. However, this growth has also opened new avenues for malicious and unauthorized use. This paper presents an intrinsic profiling-based technique for authorship verification and its application to the detection of compromised accounts. To this end, the efficiency of different textual features is examined. Four categories of features, namely content free, content specific, stylometric and folksonomy, are extracted and evaluated. Experiments are performed with 3057 Twitter users, taking the 4000 latest tweets of each user. Various feature selection techniques are used to rank and select optimal features for each user, and the rankings are then combined using a popular rank aggregation technique called Borda count. Authorship verification is studied in this paper as a unary (one-class) classification problem. The performance of several one-class classifiers, namely Local Outlier Factor, Isolation Forest and One-Class SVM, is analyzed on the basis of different evaluation metrics. Experimental results show that the One-Class SVM (OCC-SVM) with an RBF kernel outperformed the other one-class classifiers, attaining an average F-score of 87.29% and a Matthews Correlation Coefficient of 74.42% under varied parameter settings.
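
The rank aggregation step mentioned in the abstract can be illustrated with a minimal sketch of a Borda count over feature rankings. The abstract does not state the exact scoring variant or the feature selection methods used, so the scoring rule below (n - 1 points for the top feature down to 0 for the last), the alphabetical tie-break, the function name `borda_aggregate` and the feature names are all assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def borda_aggregate(rankings: Sequence[Sequence[str]]) -> List[str]:
    """Combine several best-first feature rankings with a Borda count.

    Each ranking awards n - 1 points to its top feature, n - 2 to the next,
    and 0 to the last, where n is the number of features. Features are
    returned by descending total score; ties are broken alphabetically
    (an arbitrary choice made here so the result is deterministic).
    """
    if not rankings:
        return []
    features = set(rankings[0])
    n = len(features)
    scores: Dict[str, int] = defaultdict(int)
    for ranking in rankings:
        # Every method must rank the same features, each exactly once.
        if len(ranking) != n or set(ranking) != features:
            raise ValueError("every ranking must order the same features once")
        for position, feature in enumerate(ranking):
            scores[feature] += n - 1 - position
    return sorted(features, key=lambda f: (-scores[f], f))


# Hypothetical rankings from three feature selection methods (best first).
rankings = [
    ["hashtag_ratio", "avg_word_len", "emoji_count", "url_count"],
    ["avg_word_len", "hashtag_ratio", "url_count", "emoji_count"],
    ["hashtag_ratio", "emoji_count", "avg_word_len", "url_count"],
]
aggregated = borda_aggregate(rankings)
top_k = aggregated[:2]  # keep the two highest-scoring features
print(aggregated)
```

In this toy input the totals are 8, 6, 3 and 1 points, so `hashtag_ratio` and `avg_word_len` would be retained. In a per-user setting such as the one described above, a sketch like this would be run once per user, with the selected features then passed to a one-class classifier.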

Notes

  1. http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.RobustScaler.html.

Acknowledgements

This publication is an outcome of the R&D work under the Visvesvaraya Ph.D. Scheme of Ministry of Electronics & Information Technology, Government of India, being implemented by Digital India Corporation under the Grant No. PhD/MLA/4(61)/2015-16.

Author information

Corresponding author

Correspondence to Ravneet Kaur.

Ethics declarations

Conflict of interest

All the authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with direct human participants or animals performed by any of the authors. However, public tweets of around 3000 users were fetched using the Twitter API, in full adherence to the Twitter Developer policy and agreement.

Additional information

Communicated by V. Loia.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Kaur, R., Singh, S. & Kumar, H. An intrinsic authorship verification technique for compromised account detection in social networks. Soft Comput 25, 4345–4366 (2021). https://doi.org/10.1007/s00500-020-05445-y

  • DOI: https://doi.org/10.1007/s00500-020-05445-y
