Abstract
The proliferation of social networks has brought a remarkable increase in their popularity, empowering users to create, share and exchange content in order to interact and communicate with one another. However, these networks have also opened new avenues for malicious and unauthorized use. This paper presents an intrinsic profiling-based technique for authorship verification and applies it to the detection of compromised accounts. To this end, the efficacy of different textual features is examined. Four categories of features, namely content-free, content-specific, stylometric and folksonomy, are extracted and evaluated. Experiments are performed on 3057 Twitter users, taking the 4000 latest tweets of each user. Various feature selection techniques are used to rank and select the optimal features for each user, and the resulting rankings are then combined using a popular rank aggregation technique, the Borda count (BORDA). Authorship verification is treated in this paper as a unary (one-class) classification problem. The performance of several one-class classifiers, namely Local Outlier Factor, Isolation Forest and One-Class SVM, is analyzed using different evaluation metrics. Experimental results show that the One-Class SVM (OCC-SVM) with an RBF kernel outperformed the other one-class classifiers, attaining an average F-score of 87.29% and a Matthews Correlation Coefficient of 74.42% under varied parameter settings.
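The rank aggregation step mentioned in the abstract can be illustrated with a minimal sketch of a Borda count. This is not the authors' implementation: the function name `borda_aggregate`, the feature names and the three example rankings are hypothetical, and the scoring convention (the top-ranked of n features earns n - 1 points, the last earns 0) is one common variant assumed here.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def borda_aggregate(rankings: Sequence[Sequence[str]]) -> List[str]:
    """Merge several best-first feature rankings with a Borda count.

    Assumed convention: in a ranking of n features, the feature at
    position p (0-based) earns n - 1 - p points.
    """
    scores: Dict[str, int] = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, feature in enumerate(ranking):
            scores[feature] += n - 1 - position
    # Highest total first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda f: (-scores[f], f))


# Hypothetical rankings produced by three feature selection methods
# for a single user (feature names are illustrative only).
rankings = [
    ["hashtags", "word_length", "emoticons", "punctuation"],
    ["word_length", "hashtags", "punctuation", "emoticons"],
    ["hashtags", "emoticons", "word_length", "punctuation"],
]
consensus = borda_aggregate(rankings)
print(consensus)
```

In a per-user setting such as the one described above, the top entries of the consensus ranking would be the features passed on to the one-class classifier for that user.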
Acknowledgements
This publication is an outcome of the R&D work under the Visvesvaraya Ph.D. Scheme of the Ministry of Electronics & Information Technology, Government of India, being implemented by Digital India Corporation under Grant No. PhD/MLA/4(61)/2015-16.
Ethics declarations
Conflict of interest
All the authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with direct human participants or animals performed by any of the authors. However, public tweets of around 3000 users were fetched using the Twitter API. The tweets were fetched in full adherence to the Twitter Developer policy and agreement.
Additional information
Communicated by V. Loia.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Kaur, R., Singh, S. & Kumar, H. An intrinsic authorship verification technique for compromised account detection in social networks. Soft Comput 25, 4345–4366 (2021). https://doi.org/10.1007/s00500-020-05445-y
DOI: https://doi.org/10.1007/s00500-020-05445-y