Machine and Deep Learning Algorithms for Twitter Spam Detection

Alsaffar, Dalia; Alfahhad, Amjad; Alqhtani, Bashaier; Alamri, Lama; Alansari, Shahad; Alqahtani, Nada; Alboaneen, Dabiah A.

doi:10.1007/978-3-030-31129-2_44

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1058)

Included in the following conference series:

International Conference on Advanced Intelligent Systems and Informatics

Abstract

Twitter allows users to send short text-based messages of up to 280 characters, called “tweets”. Twitter's popularity attracts spammers, who spread malicious software through URLs attached to tweets, and Twitter spam has become a critical problem. Spam refers to a variety of prohibited behaviours that violate the Twitter rules. In this paper, different machine learning and deep learning algorithms are used to detect whether a tweet is spam or not. The performance of six machine learning algorithms, namely Random Forest (RF), Naive Bayes (NB), Bayesian Network (BN), Support Vector Machine (SVM), K-Nearest Neighbour (KNN), and Multi-Layer Perceptron (MLP), and of one deep learning algorithm, a Recurrent Neural Network (RNN), is evaluated. Different test options are used, namely cross-validation and percentage split. Results show that RF gives the best result, with the lowest error rate and the highest classification accuracy under the different test options, compared with all the other algorithms.
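
The two test options named in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy two-feature data set, the small KNN classifier, the 66% training fraction, the 10 folds, and all function names. It is not the authors' experimental setup, data, or tooling.

```python
import random


def knn_predict(train, query, k=3):
    """Majority vote among the k training rows nearest to `query`."""
    nearest = sorted(
        train,
        key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], query)),
    )[:k]
    votes = sum(label for _, label in nearest)  # labels are 0 or 1
    return 1 if votes * 2 > k else 0


def accuracy(train, test, k=3):
    """Fraction of test rows whose predicted label matches the true label."""
    correct = sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(test)


def percentage_split(data, train_fraction=0.66, seed=0):
    """Shuffle once, then cut into a training part and a held-out test part."""
    rows = data[:]
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def cross_validate(data, folds=10, seed=0, k=3):
    """Mean accuracy over `folds` rounds; each row is tested exactly once."""
    rows = data[:]
    random.Random(seed).shuffle(rows)
    scores = []
    for i in range(folds):
        test = rows[i::folds]  # every folds-th row, starting at index i
        train = [r for j, r in enumerate(rows) if j % folds != i]
        scores.append(accuracy(train, test, k))
    return sum(scores) / folds


def make_toy_data(n=200, seed=1):
    """Hypothetical stand-in for labelled tweets: 1 = spam, 0 = non-spam."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2
        centre = 2.0 if label else 0.0
        features = (rng.gauss(centre, 0.5), rng.gauss(centre, 0.5))
        data.append((features, label))
    return data


if __name__ == "__main__":
    data = make_toy_data()
    train, test = percentage_split(data)
    split_acc = accuracy(train, test)
    cv_acc = cross_validate(data)
    print(f"percentage split: accuracy={split_acc:.3f}, error={1 - split_acc:.3f}")
    print(f"cross-validation: accuracy={cv_acc:.3f}, error={1 - cv_acc:.3f}")
```

A percentage split gives a single estimate from one held-out portion, while cross-validation averages over several train/test partitions, so the two options can rank classifiers differently on small data sets. The error rate reported here is simply one minus the classification accuracy.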

Notes

1. http://nsclab.org/nsclab/resource.

Author information

Authors and Affiliations

Computer Department, College of Science and Humanities, Imam Abdulrahman Bin Faisal University, P.O. Box 31961, Jubail, Saudi Arabia
Dalia Alsaffar, Amjad Alfahhad, Bashaier Alqhtani, Lama Alamri, Shahad Alansari, Nada Alqahtani & Dabiah A. Alboaneen

Corresponding author

Correspondence to Dabiah A. Alboaneen.

Editor information

Editors and Affiliations

Cairo University, Giza, Egypt
Aboul Ella Hassanien
The British University in Dubai, Dubai, United Arab Emirates
Khaled Shaalan
Ain Shams University, Cairo, Egypt
Mohamed Fahmy Tolba

About this paper

Cite this paper

Alsaffar, D. et al. (2020). Machine and Deep Learning Algorithms for Twitter Spam Detection. In: Hassanien, A., Shaalan, K., Tolba, M. (eds) Proceedings of the International Conference on Advanced Intelligent Systems and Informatics 2019. AISI 2019. Advances in Intelligent Systems and Computing, vol 1058. Springer, Cham. https://doi.org/10.1007/978-3-030-31129-2_44

DOI: https://doi.org/10.1007/978-3-030-31129-2_44
Published: 02 October 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-31128-5
Online ISBN: 978-3-030-31129-2
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)
