Abstract
The emergence of online social networks gave users an easy way to publish and disseminate content, reaching broader audiences than earlier platforms (such as blogs or personal websites) allowed. However, malicious users have taken advantage of these features to spread unreliable content through the network, such as false information, extremely biased opinions, or hate speech. Consequently, it is crucial to detect these users at an early stage to limit the propagation of unreliable content in social network ecosystems. In this work, we introduce a methodology to extract a large corpus of unreliable posts using Twitter and two databases of unreliable websites (OpenSources and Media Bias/Fact Check). In addition, we present an analysis of the content and of the users that publish and share several types of unreliable content. Finally, we develop supervised models to classify a Twitter account according to its reliability. Experiments conducted on two different data sets show performance above 94% using Decision Trees as the learning algorithm. Despite some limitations, these experiments provide encouraging results for future research on detecting unreliable accounts on social networks.
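The extraction step described in the abstract, selecting posts that link to websites listed as unreliable, could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the domain names, the post structure, and the helper names `domain_of` and `links_unreliable_source` are hypothetical, and the real lists would come from OpenSources and Media Bias/Fact Check.

```python
from urllib.parse import urlparse

# Hypothetical placeholder domains; NOT taken from the actual website lists.
UNRELIABLE_DOMAINS = {"example-fake-news.com", "junk-science.example"}


def domain_of(url: str) -> str:
    """Return the lower-cased host of a URL without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def links_unreliable_source(post_urls, unreliable_domains=UNRELIABLE_DOMAINS):
    """True if any URL points at a listed domain or at a subdomain of one."""
    for url in post_urls:
        host = domain_of(url)
        if any(host == d or host.endswith("." + d) for d in unreliable_domains):
            return True
    return False


# Hypothetical posts, each reduced to an author and the URLs it contains.
posts = [
    {"user": "a", "urls": ["https://www.example-fake-news.com/story/1"]},
    {"user": "b", "urls": ["https://news.example.org/article"]},
]

# Keep only the posts that link to a listed website.
corpus = [p for p in posts if links_unreliable_source(p["urls"])]
```

Matching on the host name rather than on a substring of the URL avoids flagging unrelated sites whose address merely contains a listed domain as part of a longer name.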
Nuno Guimaraes thanks the Fundação para a Ciência e Tecnologia (FCT), Portugal for the Ph.D. Grant (SFRH/BD/129708/2017).
Notes
- 1.
- 2.
- 3. “junksci” is an acronym for “junk science”.
- 4.
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Guimaraes, N., Figueira, A., Torgo, L. (2020). Analysis and Detection of Unreliable Users in Twitter: Two Case Studies. In: Fred, A., Salgado, A., Aveiro, D., Dietz, J., Bernardino, J., Filipe, J. (eds) Knowledge Discovery, Knowledge Engineering and Knowledge Management. IC3K 2018. Communications in Computer and Information Science, vol 1222. Springer, Cham. https://doi.org/10.1007/978-3-030-49559-6_3
DOI: https://doi.org/10.1007/978-3-030-49559-6_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-49558-9
Online ISBN: 978-3-030-49559-6
eBook Packages: Computer Science, Computer Science (R0)