Abstract
Filtering illegitimate Twitter accounts out of the input to online social media mining tasks reduces noise and thus improves the quality of the outcomes of those tasks. Developing a supervised machine learning classifier for this purpose requires a large annotated dataset. While building the annotation guidelines, we found the rules suitable for developing an unsupervised rule-based classification program. However, despite its high accuracy, the rule-based program was not time efficient. We therefore used the unsupervised rule-based program to create a massive annotated dataset and trained a supervised machine learning classifier on it. The supervised classifier was fast and matched the performance of the unsupervised classifier with an F-score of 92%. We also investigated the impact of removing the illegitimate accounts on an influential-user identification program developed by the authors. Precision improved slightly, but the improvement was not statistically significant, which indicates that the influential-user program was not erroneously identifying spam accounts as influential.
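The pipeline described in the abstract (rules label the data, a supervised model is trained on those labels, and agreement is measured with an F-score) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the `Account` fields, the thresholds in `rule_label`, the toy data, and the one-feature threshold learner `train_stump` are hypothetical and are not the rules, features, or classifier used in the paper.

```python
from dataclasses import dataclass


@dataclass
class Account:
    # Hypothetical features; the abstract does not list the real ones.
    followers: int
    following: int
    tweets_per_day: float
    url_ratio: float  # fraction of tweets that contain a URL


def rule_label(a: Account) -> int:
    """Illustrative rules only: 1 = illegitimate, 0 = legitimate."""
    if a.tweets_per_day > 100:
        return 1
    if a.url_ratio > 0.9 and a.following > 10 * max(a.followers, 1):
        return 1
    return 0


def f_score(y_true, y_pred) -> float:
    """F1: harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def train_stump(accounts, labels):
    """Stand-in supervised learner: choose the tweets_per_day threshold
    whose predictions best reproduce the rule-generated labels."""
    best_t, best_f = 0.0, -1.0
    for t in sorted({a.tweets_per_day for a in accounts}):
        preds = [1 if a.tweets_per_day >= t else 0 for a in accounts]
        f = f_score(labels, preds)
        if f > best_f:
            best_t, best_f = t, f
    return best_t, best_f


# Toy accounts, invented for this example.
accounts = [
    Account(500, 300, 5.0, 0.1),
    Account(20, 4000, 150.0, 0.95),
    Account(10, 900, 40.0, 0.97),
    Account(1200, 800, 12.0, 0.3),
    Account(3, 50, 300.0, 0.5),
]

labels = [rule_label(a) for a in accounts]             # step 1: rules annotate
threshold, agreement = train_stump(accounts, labels)   # step 2: fit on rule labels
print(labels, threshold, agreement)
```

The design point is that the slow rule-based program only has to run once to produce labels; the trained model then classifies new accounts cheaply. The F-score in such a setup measures agreement with the rule-generated labels, so it cannot exceed the quality of the rules themselves.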
Acknowledgments
The authors would like to thank ITIDA and AUC for sponsoring the project entitled “Sentiment Analysis Tool for Arabic”.
Copyright information
© 2020 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Shalaby, M., Rafea, A. (2020). Filtering Users Accounts for Enhancing the Results of Social Media Mining Tasks. In: Rocha, Á., Adeli, H., Reis, L., Costanzo, S., Orovic, I., Moreira, F. (eds) Trends and Innovations in Information Systems and Technologies. WorldCIST 2020. Advances in Intelligent Systems and Computing, vol 1160. Springer, Cham. https://doi.org/10.1007/978-3-030-45691-7_36
DOI: https://doi.org/10.1007/978-3-030-45691-7_36
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-45690-0
Online ISBN: 978-3-030-45691-7
eBook Packages: Intelligent Technologies and Robotics; Intelligent Technologies and Robotics (R0)