Clustering Analysis of Website Usage on Twitter During the COVID-19 Pandemic

  • Conference paper
Information Management and Big Data (SIMBig 2020)

Abstract

In this study, we analyzed patterns of external website usage on Twitter during the COVID-19 pandemic. We used a multi-view clustering technique, which can incorporate multiple views of the data, to cluster website URLs based on their usage patterns and on the tweet text that accompanies them. The multi-view clustering of URLs used during the COVID-19 pandemic, from 29 January to 22 June 2020, revealed three main clusters of URL usage. These three clusters differed significantly in their use of information from politically biased, fake news, and conspiracy theory websites. Our results suggest that there are political biases in how information, including misinformation, about the COVID-19 pandemic is used on Twitter.
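
As a rough illustration of what clustering URLs over two views can look like, the following minimal sketch fuses a "usage" view and a "text" view by averaging per-view Jaccard similarities and then grouping URLs whose fused similarity reaches a threshold. This is not the algorithm used in the paper: the function names, the toy data, the equal view weights, and the threshold of 0.3 are all assumptions chosen for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def multi_view_clusters(usage_view, text_view, threshold=0.3):
    """Group items whose similarity, averaged over both views, reaches threshold.

    Hypothetical helper: each view maps an item (a URL domain) to a set of
    features. Items are linked when the fused similarity is high enough, and
    clusters are the connected components of the resulting graph.
    """
    items = sorted(usage_view)
    parent = {item: item for item in items}  # union-find forest

    def find(x):
        # Follow parent pointers to the root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(items, 2):
        # Equal weighting of the two views is an assumption of this sketch.
        fused = 0.5 * (
            jaccard(usage_view[a], usage_view[b])
            + jaccard(text_view[a], text_view[b])
        )
        if fused >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for item in items:
        clusters.setdefault(find(item), set()).add(item)
    return sorted(clusters.values(), key=sorted)


# Toy data (invented): which users shared each site, and co-occurring words.
usage = {
    "site-a.example": {"u1", "u2", "u3"},
    "site-b.example": {"u2", "u3", "u4"},
    "site-c.example": {"u7", "u8"},
    "site-d.example": {"u8", "u9"},
}
text = {
    "site-a.example": {"vaccine", "trial", "results"},
    "site-b.example": {"vaccine", "trial", "update"},
    "site-c.example": {"lockdown", "protest"},
    "site-d.example": {"lockdown", "protest", "rally"},
}

result = multi_view_clusters(usage, text)
print(result)
```

Averaging the views before clustering is only one of several ways to combine them; weighting the views unequally or clustering each view separately and reconciling the results afterwards are equally plausible designs.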

Notes

  1. A Python implementation of this algorithm is available on the lead author’s GitHub page: https://github.com/ijcruic/Multi-view-Clustering-of-Social-Based-Data.

  2. https://developer.twitter.com/en/docs/tweets/filter-realtime/guides/basic-stream-parameters.

  3. These bias and fact-checking websites are: https://mediabiasfactcheck.com/, http://www.fakenewscodex.com/, and https://www.snopes.com/. For transparency, these labels along with the associated URLs are available in a public repository: https://figshare.com/articles/conference_contribution/Clustering_Analysis_of_Website_Usage_on_Twitter_during_the_COVID-19_Pandemic/13079657.

Acknowledgement

This work is supported in part by the Office of Naval Research under the Multidisciplinary University Research Initiatives (MURI) Program, award number N000141712675 (Near Real Time Assessment of Emergent Complex Systems of Confederates); the Minerva program, grant number N000141512797 (Dynamic Statistical Network Informatics); a National Science Foundation Graduate Research Fellowship (DGE 1745016); and the Center for Computational Analysis of Social and Organizational Systems (CASOS). The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the ONR or the U.S. government.

Author information

Corresponding author

Correspondence to Iain J. Cruickshank.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Cruickshank, I.J., Carley, K.M. (2021). Clustering Analysis of Website Usage on Twitter During the COVID-19 Pandemic. In: Lossio-Ventura, J.A., Valverde-Rebaza, J.C., Díaz, E., Alatrista-Salas, H. (eds) Information Management and Big Data. SIMBig 2020. Communications in Computer and Information Science, vol 1410. Springer, Cham. https://doi.org/10.1007/978-3-030-76228-5_28

  • DOI: https://doi.org/10.1007/978-3-030-76228-5_28

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-76227-8

  • Online ISBN: 978-3-030-76228-5

  • eBook Packages: Computer Science, Computer Science (R0)