Analysis and Detection of Unreliable Users in Twitter: Two Case Studies

Guimaraes, Nuno; Figueira, Alvaro; Torgo, Luis

doi:10.1007/978-3-030-49559-6_3

Conference paper
First Online: 26 June 2020

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1222)

Abstract

The emergence of online social networks gave users an easy way to publish and disseminate content, reaching broader audiences than previous platforms (such as blogs or personal websites) allowed. However, malicious users have taken advantage of these features to spread unreliable content through the network, such as false information, extremely biased opinions, or hate speech. Consequently, it is crucial to detect these users at an early stage to avoid the propagation of unreliable content in social network ecosystems. In this work, we introduce a methodology to extract a large corpus of unreliable posts using Twitter and two databases of unreliable websites (OpenSources and Media Bias Fact Check). In addition, we present an analysis of the content and users that publish and share several types of unreliable content. Finally, we develop supervised models to classify a Twitter account according to its reliability. The experiments conducted using two different data sets show performance above 94% using Decision Trees as the learning algorithm. These experiments, although subject to some limitations, provide encouraging results for future research on detecting unreliable accounts on social networks.
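As a rough illustration of the corpus-extraction idea, the following minimal sketch flags posts whose links point to a listed domain and computes the share of flagged posts per account. It is not the authors' pipeline, which the abstract does not detail: the domain names, the function names, and the input shape (one list of URLs per post) are all assumptions made for this example.

```python
from urllib.parse import urlparse

# Hypothetical stand-in for a list built from OpenSources or
# Media Bias Fact Check; these domains are placeholders, not real entries.
UNRELIABLE_DOMAINS = {"unreliable-news.example", "junk-science.example"}


def domain_of(url: str) -> str:
    """Return the lower-cased host of a URL, without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def links_to_unreliable(urls, domains=UNRELIABLE_DOMAINS) -> bool:
    """True if any URL's host is a listed domain or a subdomain of one."""
    for url in urls:
        host = domain_of(url)
        if any(host == d or host.endswith("." + d) for d in domains):
            return True
    return False


def unreliable_share(posts) -> float:
    """Fraction of an account's posts that link to a listed domain.

    `posts` is a list of URL lists, one per post (an assumed input shape).
    """
    if not posts:
        return 0.0
    flagged = sum(links_to_unreliable(urls) for urls in posts)
    return flagged / len(posts)
```

A per-account share like this could serve as one input feature to a supervised classifier, although the features actually used in the paper are not stated in the abstract.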

Nuno Guimaraes thanks the Fundação para a Ciência e Tecnologia (FCT), Portugal for the Ph.D. Grant (SFRH/BD/129708/2017).

Notes

1. https://www.buzzfeednews.com/article/craigsilverman/top-fake-news-of-2016
2. https://www.usnews.com/news/national-news/articles/2016-11-14/avoid-these-fake-news-sites-at-all-costs
3. “junksci” is an acronym for “junk science”.
4. https://mediabiasfactcheck.com/bb4sp/

Author information

Authors and Affiliations

CRACS/INESCTEC, University of Porto, Rua do Campo Alegre 1021/1055, Porto, Portugal
Nuno Guimaraes & Alvaro Figueira
Faculty of Computer Science, Dalhousie University, Halifax, Canada
Luis Torgo

Corresponding author

Correspondence to Nuno Guimaraes.

Editor information

Editors and Affiliations

Instituto de Telecomunicações, University of Lisbon, Lisbon, Portugal
Ana Fred
Federal University of Pernambuco, Recife, Brazil
Ana Salgado
University of Madeira, Funchal, Portugal
David Aveiro
Delft University of Technology, Delft, The Netherlands
Jan Dietz
University of Coimbra, Coimbra, Portugal
Jorge Bernardino
Polytechnic Institute of Setúbal/INSTIC, Setúbal, Portugal
Joaquim Filipe

About this paper

Cite this paper

Guimaraes, N., Figueira, A., Torgo, L. (2020). Analysis and Detection of Unreliable Users in Twitter: Two Case Studies. In: Fred, A., Salgado, A., Aveiro, D., Dietz, J., Bernardino, J., Filipe, J. (eds) Knowledge Discovery, Knowledge Engineering and Knowledge Management. IC3K 2018. Communications in Computer and Information Science, vol 1222. Springer, Cham. https://doi.org/10.1007/978-3-030-49559-6_3

DOI: https://doi.org/10.1007/978-3-030-49559-6_3
Published: 26 June 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-49558-9
Online ISBN: 978-3-030-49559-6
eBook Packages: Computer Science, Computer Science (R0)
