Abstract
In this study, we analyzed patterns of external website usage on Twitter during the COVID-19 pandemic. We used a multi-view clustering technique, which can incorporate multiple views of the data, to cluster website URLs based on their usage patterns and on the tweet text that accompanies them. Multi-view clustering of the URLs shared between 29 January and 22 June 2020 revealed three main clusters of URL usage. These three clusters differed significantly in their use of information from politically biased, fake news, and conspiracy theory websites. Our results suggest that there are political biases in how information about the COVID-19 pandemic, including misinformation, is used on Twitter.
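The idea of clustering URLs from more than one view can be illustrated with a small, self-contained sketch. This is not the authors' algorithm (their implementation is linked in the notes); it swaps in a much simpler scheme, assumed here purely for illustration: compute cosine similarities between URLs separately in a usage view and a text view, average the two similarity matrices, and treat connected components of the thresholded similarity graph as clusters. The toy vectors, the function names, and the 0.5 threshold are all hypothetical.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def fused_similarity(views):
    """Average the pairwise cosine similarities over all views."""
    n = len(views[0])
    fused = [[0.0] * n for _ in range(n)]
    for view in views:
        for i in range(n):
            for j in range(n):
                fused[i][j] += cosine(view[i], view[j]) / len(views)
    return fused


def cluster(fused, threshold):
    """Label connected components of the graph whose edges have similarity >= threshold."""
    n = len(fused)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and fused[i][j] >= threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Hypothetical toy data: four URLs described in two views.
usage_view = [[5, 0, 1], [4, 1, 0], [0, 6, 1], [0, 5, 2]]  # e.g. share counts per user group
text_view = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 1], [0, 0, 1, 0]]  # e.g. term counts

labels = cluster(fused_similarity([usage_view, text_view]), threshold=0.5)
print(labels)
```

Averaging similarities is only one way to combine views. It gives each view equal weight, which may not suit data where one view is much noisier than the other.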
Notes
- 1. A Python implementation of this algorithm is available on the lead author’s GitHub page: https://github.com/ijcruic/Multi-view-Clustering-of-Social-Based-Data.
- 2.
- 3. These bias and fact checking websites are: https://mediabiasfactcheck.com/, http://www.fakenewscodex.com/, and https://www.snopes.com/. For transparency, these labels along with the associated URLs are available in a public repository: https://figshare.com/articles/conference_contribution/Clustering_Analysis_of_Website_Usage_on_Twitter_during_the_COVID-19_Pandemic/13079657.
Acknowledgement
This work is supported in part by the Office of Naval Research (ONR) under the Multidisciplinary University Research Initiatives (MURI) Program award number N000141712675, Near Real Time Assessment of Emergent Complex Systems of Confederates; the Minerva program under grant number N000141512797, Dynamic Statistical Network Informatics; a National Science Foundation Graduate Research Fellowship (DGE 1745016); and by the Center for Computational Analysis of Social and Organizational Systems (CASOS). The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the ONR or the U.S. government.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Cruickshank, I.J., Carley, K.M. (2021). Clustering Analysis of Website Usage on Twitter During the COVID-19 Pandemic. In: Lossio-Ventura, J.A., Valverde-Rebaza, J.C., Díaz, E., Alatrista-Salas, H. (eds) Information Management and Big Data. SIMBig 2020. Communications in Computer and Information Science, vol 1410. Springer, Cham. https://doi.org/10.1007/978-3-030-76228-5_28
DOI: https://doi.org/10.1007/978-3-030-76228-5_28
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-76227-8
Online ISBN: 978-3-030-76228-5
eBook Packages: Computer Science, Computer Science (R0)