An enhanced graph-based semi-supervised learning algorithm to detect fake users on Twitter

BalaAnand, M.; Karthikeyan, N.; Karthik, S.; Varatharajan, R.; Manogaran, Gunasekaran; Sivaparthipan, C. B.

doi:10.1007/s11227-019-02948-w

Published: 30 July 2019

Volume 75, pages 6085–6105, (2019)

The Journal of Supercomputing

M. BalaAnand ORCID: orcid.org/0000-0002-9509-6943¹,
N. Karthikeyan²,
S. Karthik³,
R. Varatharajan⁴,
Gunasekaran Manogaran⁵ &
…
C. B. Sivaparthipan³

1324 Accesses
63 Citations

Abstract

Over the past decade, social networking services (SNS) have proliferated on the web. The nature of such sites makes identity deception easy: they provide a fast means of creating and managing identities, and then of connecting with and deceiving others. Fake users are accounts created specifically for purposes such as stalking or abusing another user, slander, or marketing. Current systems for detecting deception depend on behavioral, non-behavioral and user-generated content (UGC) information gathered from users. Although these methods achieve high detection accuracy, they cannot be applied to databases holding massive volumes of data. To address this issue, this paper proposes an enhanced graph-based semi-supervised learning algorithm (EGSLA) to detect fake users in a large volume of Twitter data. The proposed method comprises four modules: data collection, feature extraction, classification and decision making. Data collected from Twitter using Scrapy is used for the evaluation. The performance of the proposed algorithm is compared against existing game theory, k-nearest neighbor (KNN), support vector machine (SVM) and decision tree techniques. The results show that the proposed EGSLA algorithm achieves 90.3% accuracy in spotting fake users.
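
To make the idea of graph-based semi-supervised classification concrete, the following is a minimal sketch of generic label propagation over a similarity graph. It is not the EGSLA algorithm, whose details are not given in this abstract. The function name `propagate_labels`, the two-dimensional feature vectors, the Gaussian edge weights and the `sigma` parameter are all assumptions chosen for illustration.

```python
from math import exp


def propagate_labels(features, labels, sigma=0.3, iterations=50):
    """Generic label propagation (illustrative, NOT the paper's EGSLA).

    features: list of numeric feature vectors, one per account (hypothetical).
    labels:   1 = fake, 0 = genuine, None = unlabeled.
    Returns one score per account; higher means "more likely fake".
    """
    n = len(features)

    def weight(a, b):
        # Gaussian similarity: close feature vectors get weights near 1.
        dist_sq = sum((x - y) ** 2 for x, y in zip(a, b))
        return exp(-dist_sq / (2 * sigma ** 2))

    # Fully connected graph with no self-loops.
    weights = [[0.0 if i == j else weight(features[i], features[j])
                for j in range(n)] for i in range(n)]

    # Unlabeled accounts start at 0.5 (no opinion either way).
    scores = [0.5 if y is None else float(y) for y in labels]

    for _ in range(iterations):
        updated = []
        for i in range(n):
            if labels[i] is not None:
                # Known labels are clamped on every round.
                updated.append(float(labels[i]))
                continue
            total = sum(weights[i])
            if total == 0.0:
                # Isolated node: keep its previous score.
                updated.append(scores[i])
                continue
            # Weighted average of the neighbors' current scores.
            updated.append(
                sum(w * s for w, s in zip(weights[i], scores)) / total
            )
        scores = updated
    return scores


# Hypothetical, already-normalized features for six accounts.
features = [[0.10, 0.90], [0.20, 0.80], [0.15, 0.85],
            [0.90, 0.10], [0.80, 0.20], [0.85, 0.15]]
labels = [1, None, None, 0, None, None]  # only two accounts are labeled

for account, score in enumerate(propagate_labels(features, labels)):
    print(account, "fake" if score > 0.5 else "genuine", round(score, 3))
```

Only two of the six accounts carry labels here; the remaining four receive scores from their neighbors in the graph, which is the property that makes semi-supervised methods attractive when labeled data is scarce. A fully connected graph costs quadratic time and memory in the number of accounts, so a large-scale system would need a sparser graph construction.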

Author information

Authors and Affiliations

Department of Computer Science & Engineering, V.R.S. College of Engineering & Technology, Viluppuram, India
M. BalaAnand
Department of MCA, SNS College of Engineering, Coimbatore, India
N. Karthikeyan
Department of Computer Science & Engineering, SNS College of Technology, Coimbatore, India
S. Karthik & C. B. Sivaparthipan
Department of Computer Science & Engineering, Anna University, Chennai, India
R. Varatharajan
Department of Computer Science, University of California, Davis, USA
Gunasekaran Manogaran

Corresponding author

Correspondence to M. BalaAnand.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

BalaAnand, M., Karthikeyan, N., Karthik, S. et al. An enhanced graph-based semi-supervised learning algorithm to detect fake users on Twitter. J Supercomput 75, 6085–6105 (2019). https://doi.org/10.1007/s11227-019-02948-w

Published: 30 July 2019
Issue Date: September 2019
DOI: https://doi.org/10.1007/s11227-019-02948-w
