Abstract
Social media play a significant role in shaping and spreading societal views, including anti-vaccine sentiment that can undermine public health efforts. Measuring the extent of these views and predicting their future trends is challenging but essential. Social media posts are often semi-structured and laden with irony, which makes them difficult to process with traditional methods. To address this, this study developed a system to monitor the popularity of anti-vaccine misinformation and predict its future direction. A key feature of this research is the creation of a custom dataset. Instead of using a generic sentiment analyzer, Turkish tweets containing the word “vaccine” were collected and categorized to create a specialized dataset. The collected data were analyzed using several deep learning networks, including different BERT architectures, LSTM, and BART. These models were trained on the categorized dataset and then used to classify the remaining tweets. This classification provided a metric indicating the prevalence of anti-vaccine sentiment on social media. The output of the top-performing model was subsequently used to train and test a range of time series forecasting models: the naive forecaster, AutoARIMA, AutoETS, Croston’s method, the polynomial trend forecaster, the unobserved components model, and Facebook’s Prophet. The goal was to identify the most effective algorithm for predicting future trends in anti-vaccine sentiment. This research stands out for its dual focus on tracking and predicting public sentiment, providing a potential early warning system for public health authorities. In the classification task, BERT base achieved the best results, with F1 scores of 0.851, 0.731, 0.779, and 0.720 for the four respective classes. In the subsequent forecasting task, Prophet was the top-performing model, with a mean absolute error of 6.01. The combination of several deep learning networks for sentiment analysis, several forecasting models for trend prediction, and a custom-made dataset distinguishes this research in social media discourse analysis and prediction.
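The two-stage idea described above (classify tweets, then forecast the resulting prevalence metric) can be illustrated with a minimal sketch. It is not the authors' implementation: the label name "anti", the dates, and the tweet counts are hypothetical, and the plain-Python naive forecaster only stands in for the library forecasters named in the abstract. The sketch shows how a daily anti-vaccine share could be derived from classifier output and how a mean absolute error could be computed for a forecast.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def daily_share(labeled: Sequence[Tuple[str, str]], target: str = "anti") -> Dict[str, float]:
    """Percentage of tweets per day whose predicted label equals `target`."""
    totals: Counter = Counter()
    hits: Counter = Counter()
    for day, label in labeled:
        totals[day] += 1
        if label == target:
            hits[day] += 1
    return {day: 100.0 * hits[day] / totals[day] for day in sorted(totals)}


def naive_forecast(history: Sequence[float], horizon: int) -> List[float]:
    """Naive (last-value) forecast: repeat the final observation."""
    if not history:
        raise ValueError("history must not be empty")
    return [history[-1]] * horizon


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute difference between observed and forecast values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical classifier output: (day, predicted label) pairs.
labeled = (
    [("2021-07-01", "anti")] * 1 + [("2021-07-01", "other")] * 1
    + [("2021-07-02", "anti")] * 2 + [("2021-07-02", "other")] * 2
    + [("2021-07-03", "anti")] * 1 + [("2021-07-03", "other")] * 3
    + [("2021-07-04", "anti")] * 3 + [("2021-07-04", "other")] * 1
)

series = daily_share(labeled)
values = list(series.values())

# Hold out the last two days as a test window.
train, test = values[:2], values[2:]
forecast = naive_forecast(train, horizon=len(test))
mae = mean_absolute_error(test, forecast)
print(series, forecast, mae)
```

In practice the naive forecaster serves as a baseline: a more elaborate model is only worth adopting if its error on the held-out window is lower than this one.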




















Data availability
The data presented in this study are available upon request from the corresponding author.
Funding
This research received no external funding.
Author information
Contributions
All authors contributed equally to this work. All authors have read and agreed to this manuscript.
Ethics declarations
Conflicts of Interest
The authors declare no conflict of interest.
About this article
Cite this article
Biri, I., Kucuktas, U.T., Uysal, F. et al. Forecasting the future popularity of the anti-vax narrative on Twitter with machine learning. J Supercomput 80, 2917–2947 (2024). https://doi.org/10.1007/s11227-023-05567-8
DOI: https://doi.org/10.1007/s11227-023-05567-8