Abstract
In this paper, after reviewing and comparing existing social network datasets, we introduce the SNEFL dataset: the first social network dataset that includes the level of users’ likes (fuzzy like) in addition to the likes between users. The data was collected from a social network with users’ privacy in mind. It includes several additional features, including age, gender, marital status, height, weight, educational level and religiosity of the users. We describe its structure, analyse its features and evaluate its advantages in comparison with other social network datasets. In addition, using the unique feature of the SNEFL dataset (fuzzy like), we develop, for the first time, a rule-based algorithm to detect involuntary celibates (Incels) in social networks. Despite Incels’ activity in online social networks, no computer science study has so far attempted to identify them. This study is a first step towards addressing this challenge that society faces today. Experimental results show that the accuracy of the proposed algorithm in identifying Incels is 23.21% among all social network users and 68.75% among users who have fuzzy like data. Beyond Incel detection, the SNEFL dataset can be used by researchers in different fields to produce more accurate results. Areas in which it can be used include network analysis, frequent pattern mining, classification and clustering.
Acknowledgments
The authors would like to express their deepest gratitude to Dr. Anahita Hajarian for her contribution to this article.
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Hajarian, M., Bastanfard, A., Mohammadzadeh, J. et al. SNEFL: Social network explicit fuzzy like dataset and its application for Incel detection. Multimed Tools Appl 78, 33457–33486 (2019). https://doi.org/10.1007/s11042-019-08057-3
DOI: https://doi.org/10.1007/s11042-019-08057-3