Virality Prediction for News Tweets Using RoBERTa

  • Conference paper
Advances in Soft Computing (MICAI 2021)

Abstract

The virality of a tweet is essential for conveying its message to a broader audience and, eventually, for generating influence. This is especially important for news outlets as they struggle to transition from traditional media to online formats. Because their usual readers will not migrate directly to digital formats, news outlets need to gather new audiences from the spaces where real-time information and discussion happen: social media, and Twitter in particular. Since the language of news websites differs greatly from that of Twitter, news outlets need to write their tweets carefully to maximize their impact on the platform. We propose a method that predicts, from the text of a tweet alone, whether it will be influential or non-influential. The method uses RoBERTa, a variant of Google's BERT, together with a corpus of 5000 high-quality, automatically labeled highly influential and non-influential tweets for training and classification. Our method reaches an F1 of 0.873, an improvement of 4 and 9 over approaches using LSTMs and n-grams, respectively.
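For readers unfamiliar with the reported metric, the following is a minimal sketch of how an F1 score is computed for a two-class task such as influential versus non-influential. It assumes the standard definition (harmonic mean of precision and recall on the positive class); the abstract does not state whether the reported 0.873 is per-class, macro-averaged, or weighted. The function name `binary_f1` and the toy labels are hypothetical and are not taken from the paper.

```python
def binary_f1(y_true, y_pred, positive=1):
    """F1 for a single class: harmonic mean of precision and recall.

    Assumption: labels are 1 (influential) and 0 (non-influential).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data, made up for illustration only.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
score = binary_f1(y_true, y_pred)
print(f"F1 = {score:.3f}")
```

In this toy case there are two true positives, one false positive, and one false negative, so precision and recall are both 2/3 and so is F1.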

This work was supported by the CONACYT, Mexico, under Grant A1-S-47854 and by the Secretaría de Investigación y Posgrado of the Instituto Politécnico Nacional under Grants 20200859, 20211784, 20211884, and 20211178.

Corresponding author

Correspondence to Christian E. Maldonado-Sifuentes.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Maldonado-Sifuentes, C.E., Angel, J., Sidorov, G., Kolesnikova, O., Gelbukh, A. (2021). Virality Prediction for News Tweets Using RoBERTa. In: Batyrshin, I., Gelbukh, A., Sidorov, G. (eds) Advances in Soft Computing. MICAI 2021. Lecture Notes in Computer Science, vol 13068. Springer, Cham. https://doi.org/10.1007/978-3-030-89820-5_7

  • DOI: https://doi.org/10.1007/978-3-030-89820-5_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-89819-9

  • Online ISBN: 978-3-030-89820-5

  • eBook Packages: Computer Science, Computer Science (R0)
