Abstract
This paper describes our proposed method for the author profiling task at PAN 2019. The aim of this task is to identify the type of a Twitter user (i.e., bot or human) and, when the user is human, to determine the user's gender (i.e., male or female). Our approach combines a set of language-independent features with machine learning algorithms. Through an in-depth experimental study conducted on English and Spanish datasets, we show that a simple set of stylistic features can surpass existing methods that depend mainly on the content of the tweets. On the English dataset, we obtain accuracies of 93.06% and 90.04% for the bot and gender classification tasks, respectively. On the Spanish dataset, we obtain accuracies of 90.53% and 89.11% for the bot and gender classification tasks, respectively.
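To make the idea of language-independent stylistic features concrete, the following is a minimal sketch in Python. The abstract does not list the features actually used, so every feature below (average tweet length, URLs, mentions and hashtags per tweet, uppercase ratio, retweet ratio) and the function name `stylistic_features` are illustrative assumptions, not the authors' feature set. The resulting values could then be passed to any standard classifier.

```python
import re
from statistics import mean

# Surface patterns that do not depend on the tweet's language.
# These patterns are illustrative assumptions, not taken from the paper.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")


def stylistic_features(tweets):
    """Compute simple style statistics over one user's tweets.

    Hypothetical feature set: it only inspects surface form
    (lengths, counts, ratios), never the words themselves.
    """
    if not tweets:
        raise ValueError("need at least one tweet")

    n = len(tweets)
    total_chars = sum(len(t) for t in tweets)

    def per_tweet(pattern):
        # Average number of pattern matches per tweet.
        return sum(len(pattern.findall(t)) for t in tweets) / n

    return {
        "avg_length": mean(len(t) for t in tweets),
        "urls_per_tweet": per_tweet(URL_RE),
        "mentions_per_tweet": per_tweet(MENTION_RE),
        "hashtags_per_tweet": per_tweet(HASHTAG_RE),
        # Share of uppercase characters across all tweets.
        "uppercase_ratio": sum(ch.isupper() for t in tweets for ch in t)
        / max(total_chars, 1),
        # Share of tweets that start with the conventional retweet prefix.
        "retweet_ratio": sum(t.startswith("RT @") for t in tweets) / n,
    }


if __name__ == "__main__":
    sample = ["RT @bob: check https://example.com #news", "Hello World"]
    for name, value in stylistic_features(sample).items():
        print(f"{name}: {value:.3f}")
```

Because none of these statistics look at vocabulary, the same function applies unchanged to English and Spanish tweets, which is the property the abstract emphasizes.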
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Ouni, S., Fkih, F., Omri, M.N. (2022). Bots and Gender Detection on Twitter Using Stylistic Features. In: Bădică, C., Treur, J., Benslimane, D., Hnatkowska, B., Krótkiewicz, M. (eds) Advances in Computational Collective Intelligence. ICCCI 2022. Communications in Computer and Information Science, vol 1653. Springer, Cham. https://doi.org/10.1007/978-3-031-16210-7_53
DOI: https://doi.org/10.1007/978-3-031-16210-7_53
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16209-1
Online ISBN: 978-3-031-16210-7
eBook Packages: Computer Science, Computer Science (R0)