Analysis and Detection of Unreliable Users in Twitter: Two Case Studies

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1222)

Abstract

The emergence of online social networks gave users an easy way to publish and disseminate content, reaching broader audiences than previous platforms (such as blogs or personal websites) allowed. However, malicious users began to exploit these features to spread unreliable content through the network, such as false information, extremely biased opinions, or hate speech. Consequently, it is crucial to detect these users at an early stage to limit the propagation of unreliable content in social network ecosystems. In this work, we introduce a methodology to extract a large corpus of unreliable posts using Twitter and two databases of unreliable websites (OpenSources and Media Bias Fact Check). In addition, we present an analysis of the content and of the users that publish and share several types of unreliable content. Finally, we develop supervised models to classify a Twitter account according to its reliability. Experiments conducted on two different data sets show performance above 94% using Decision Trees as the learning algorithm. Despite some limitations, these results are encouraging for future research on detecting unreliable accounts on social networks.
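As a rough illustration of the extraction idea summarized in the abstract, the sketch below flags posts whose links point to a domain on a list of unreliable websites and computes the share of such posts per account. It is a minimal sketch under stated assumptions, not the authors' pipeline: the domain list, the function names, and the representation of a post as a list of URLs are all hypothetical, and real lists would have to be built from OpenSources or Media Bias Fact Check.

```python
from urllib.parse import urlparse

# Hypothetical placeholder domains; a real list would be derived from
# OpenSources or Media Bias Fact Check.
UNRELIABLE_DOMAINS = {"example-fake-news.com", "junk-science.example"}


def domain_of(url):
    """Return the lowercased host of a URL, without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def is_unreliable_post(urls, flagged=UNRELIABLE_DOMAINS):
    """True if any URL in the post points to a flagged domain."""
    return any(domain_of(u) in flagged for u in urls)


def unreliable_share(posts, flagged=UNRELIABLE_DOMAINS):
    """Fraction of an account's posts that link to a flagged domain.

    `posts` is a list of posts, each given as a list of URL strings
    (an assumed representation for this sketch).
    """
    if not posts:
        return 0.0
    hits = sum(is_unreliable_post(urls, flagged) for urls in posts)
    return hits / len(posts)
```

A per-account value such as `unreliable_share` could serve as one input feature, among others, for a supervised classifier like the Decision Trees mentioned in the abstract; the features actually used in the paper are not stated in this excerpt.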

Nuno Guimaraes thanks the Fundação para a Ciência e Tecnologia (FCT), Portugal for the Ph.D. Grant (SFRH/BD/129708/2017).

Author information

Corresponding author

Correspondence to Nuno Guimaraes.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Guimaraes, N., Figueira, A., Torgo, L. (2020). Analysis and Detection of Unreliable Users in Twitter: Two Case Studies. In: Fred, A., Salgado, A., Aveiro, D., Dietz, J., Bernardino, J., Filipe, J. (eds) Knowledge Discovery, Knowledge Engineering and Knowledge Management. IC3K 2018. Communications in Computer and Information Science, vol 1222. Springer, Cham. https://doi.org/10.1007/978-3-030-49559-6_3

  • DOI: https://doi.org/10.1007/978-3-030-49559-6_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-49558-9

  • Online ISBN: 978-3-030-49559-6

  • eBook Packages: Computer Science, Computer Science (R0)
