Analyzing user-generated content using natural language processing: a case study of public satisfaction with healthcare systems

  • Research Article
Journal of Computational Social Science

Abstract

While user-generated online content (UGC) is increasingly available, public opinion studies are yet to fully exploit the abundance and richness of online data. This study contributes to the practical knowledge of user-generated online content and machine learning techniques that can be used for the analysis of UGC. For this purpose, we explore the potential of user-generated content and present an application of natural language pre-processing, text mining and sentiment analysis to the question of public satisfaction with healthcare systems. Concretely, we analyze 634 online comments reflecting attitudes towards healthcare services in different countries. Our analysis identifies the frequency of topics related to healthcare services in textual content of the comments and attempts to classify and rank national healthcare systems based on the respondents’ sentiment scores. In this paper, we describe our approach, summarize our main findings, and compare them with the results from cross-national surveys. Finally, we outline the typical limitations inherent in the analysis of user-generated online content and suggest avenues for future research.
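
As a rough illustration of the kind of pipeline the abstract describes (pre-processing, term frequencies, and ranking countries by average sentiment), the following Python sketch uses only the standard library. The comments, the mini-lexicon, the stopword list, and all function names are hypothetical and invented for this example; they are not the study's data, lexicon, or code, and a simple word-sum score stands in for a validated sentiment tool.

```python
import re
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical toy data: (country, comment) pairs invented for illustration only.
COMMENTS = [
    ("France", "Excellent care and short waiting times."),
    ("France", "Good doctors, but the paperwork is slow."),
    ("Canada", "Long waiting lists, although the care itself is good."),
    ("Canada", "Terrible delays in the emergency room."),
]

# Hypothetical mini-lexicon; a real analysis would rely on a validated lexicon.
LEXICON = {
    "excellent": 2.0, "good": 1.0, "short": 0.5,
    "long": -0.5, "slow": -1.0, "delays": -1.0, "terrible": -2.0,
}

# Hypothetical stopword list used during pre-processing.
STOPWORDS = {"the", "and", "but", "is", "in", "a", "although", "itself"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_frequencies(comments):
    """Count non-stopword tokens across all comments."""
    counts = Counter()
    for _, text in comments:
        counts.update(t for t in tokenize(text) if t not in STOPWORDS)
    return counts


def sentiment_score(text):
    """Sum the lexicon weights of the tokens; unknown words contribute 0."""
    return sum(LEXICON.get(t, 0.0) for t in tokenize(text))


def rank_countries(comments):
    """Return (country, mean score) pairs sorted from most to least positive."""
    by_country = defaultdict(list)
    for country, text in comments:
        by_country[country].append(sentiment_score(text))
    means = {country: mean(scores) for country, scores in by_country.items()}
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    print(term_frequencies(COMMENTS).most_common(3))
    print(rank_countries(COMMENTS))
```

On this toy input the ranking places France above Canada. Any such ranking depends on the lexicon and on which comments happen to be posted, which is why results of this kind need to be validated against external sources such as survey data.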

Availability of data and materials

Not applicable.

Code availability

Not applicable.

Notes

  1. For a comprehensive comparison between social media and surveys, see Schober et al. [37].

  2. Aaron Carroll is a health services researcher and professor of pediatrics at Indiana University School of Medicine.

  3. Austin Frakt is a director of the Partnered Evidence-Based Policy Resource Center at the V. A. Boston Healthcare System, an associate professor at Boston University’s School of Public Health, and an adjunct associate professor at the Harvard T. H. Chan School of Public Health.

  4. A common notion in computing, ‘garbage in, garbage out’, highlights the importance of the quality of input data.

  5. For an excellent discussion of the importance of validation, see Grimmer and Stewart [15].

References

  1. Badawy, A., & Ferrara, E. (2018). The rise of jihadist propaganda on social networks. Journal of Computational Social Science, 1(2), 453–470.

  2. Blendon, R. J., Benson, J., Donelan, K., Leitman, R., Taylor, H., Koeck, C., & Gitterman, D. (1995). Who has the best health care system? A second look. Health Affairs, 14(4), 220–230.

  3. Bleich, S. N., Özaltin, E., & Murray, C. J. (2009). How does satisfaction with the health-care system relate to patient experience? Bulletin of the World Health Organization, 87, 271–278.

  4. Bonikowski, B. (2017). Big data: challenges and opportunities for comparative historical sociology. Trajectories Newsletter of the ASA Comparative and Historical Section, 28(2), 29–32.

  5. Bonoli, G., & Palier, B. (1998). Changing the politics of social programmes: Innovative change in British and French welfare reforms. Journal of European Social Policy, 8(4), 317–330.

  6. Cammett, M., Lynch, J., & Bilev, G. (2015). The influence of private health care financing on citizen trust in government. Perspectives on Politics, 13(4), 938–957.

  7. Caren (2012). https://nealcaren.github.io/.

  8. Cohen, G. (1996). Age and health status in a patient satisfaction survey. Social Science & Medicine, 42(7), 1085–1093.

  9. Couper, M. P. (2011). The future of modes of data collection. Public Opinion Quarterly, 75(5), 889–908.

  10. Enghoff, O., & Aldridge, J. (2019). The value of unsolicited online data in drug policy research. International Journal of Drug Policy, 73, 210–218.

  11. Feinerer, I. (2008). An introduction to text mining in R. The Newsletter of the R Project, 8(2), 19.

  12. Gelissen, J. (2000). Popular support for institutionalised solidarity: A comparison between European welfare states. International Journal of Social Welfare, 9(4), 285–300.

  13. Gevers, J., Gelissen, J., Arts, W., & Muffels, R. (2000). Public health care in the balance: Exploring popular support for health care systems in the European Union. International Journal of Social Welfare, 9(4), 301–321.

  14. Golato, A. (2017). Naturally occurring data. The Routledge Handbook of Pragmatics (pp. 21–26). Routledge.

  15. Grimmer, J., & Stewart, B. M. (2013). Text as data: The promise and pitfalls of automatic content analysis methods for political texts. Political Analysis, 21(3), 267–297.

  16. Groves, R. M. (2011). Three eras of survey research. Public Opinion Quarterly, 75(5), 861–871.

  17. Hall, J. A., & Dornan, M. C. (1990). Patient sociodemographic characteristics as predictors of satisfaction with medical care: A meta-analysis. Social Science & Medicine, 30(7), 811–818.

  18. Harford, T. (2014). Big data: A big mistake? Significance, 11(5), 14–19.

  19. Havey, N. F. (2020). Partisan public health: How does political ideology influence support for COVID-19 related misinformation? Journal of Computational Social Science, 3(2), 319–342.

  20. He, W., Tian, X., Tao, R., Zhang, W., Yan, G., & Akula, V. (2017). Application of social media analytics: a case of analyzing online hotel reviews. Online Information Review, 41, 921–935.

  21. Hutto, C. J., & Gilbert, E. (2014). VADER: A parsimonious rule-based model for sentiment analysis of social media text. In Eighth International AAAI Conference on Weblogs and Social Media.

  22. Japec, L., Kreuter, F., Berg, M., Biemer, P., Decker, P., Lampe, C., Lane, J., O’Neil, C., & Usher, A. (2015). Big data in survey research: AAPOR task force report. Public Opinion Quarterly, 79(4), 839–880.

  23. Jensen, C., & Naumann, E. (2016). Increasing pressures and support for public healthcare in Europe. Health Policy, 120(6), 698–705.

  24. Kleinberg, B., van der Vegt, I., & Gill, P. (2021). The temporal evolution of a far-right forum. Journal of Computational Social Science, 4(1), 1–23.

  25. Kohl, J., & Wendt, C. (2004). Satisfaction with health care systems. A comparison of EU countries. In W. Glatzer, S. V. Below, & M. Stoffregen (Eds.), Challenges for Quality of Life in the Contemporary World (pp. 311–331). Kluwer Academic Publishers.

  26. Kurian, J. C. (2015). Facebook use by the open access repository users. Online Information Review, 39, 903–922.

  27. Manosevitch, E., & Walker, D. (2009, April). Reader comments to online opinion journalism: A space of public deliberation. In International Symposium on Online Journalism (Vol. 10, pp. 1–30).

  28. Missinne, S., Meuleman, B., & Bracke, P. (2013). The popular legitimacy of European healthcare systems: A multilevel analysis of 24 countries. Journal of European Social Policy, 23(3), 231–247.

  29. Mossialos, E. (1997). Citizens’ views on health care systems in the 15 member states of the European Union. Health Economics, 6(2), 109–116.

  30. Naeem, B., Khan, A., Beg, M. O., & Mujtaba, H. (2020). A deep learning framework for clickbait detection on social area network using natural language cues. Journal of Computational Social Science, 3, 1–13.

  31. Pang, B., & Lee, L. (2008). Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval, 2(1–2), 1–135.

  32. Piña-García, C. A., Mario-Siqueiros-García, J., Robles-Belmont, E., Carreón, G., Gershenson, C., & Amador-Díaz-López, J. (2018). From neuroscience to computer science: a topical approach on Twitter. Journal of Computational Social Science, 1(1), 187–208.

  33. Rahmqvist, M., & Bara, A. C. (2010). Patient characteristics and quality dimensions related to patient satisfaction. International Journal for Quality in Health Care, 22(2), 86–92.

  34. Robinson, K. M. (2001). Unsolicited narratives from the Internet: A rich source of qualitative data. Qualitative Health Research, 11(5), 706–714.

  35. Ryan, G., & Bernard, H. (2003). Techniques to identify themes. Field Methods, 15(1), 85–109.

  36. Santana, A. D. (2011). Online readers’ comments represent new opinion pipeline. Newspaper Research Journal, 32(3), 66–81.

  37. Schober, M. F., Pasek, J., Guggenheim, L., Lampe, C., & Conrad, F. G. (2016). Social media analyses for social measurement. Public Opinion Quarterly, 80(1), 180–211.

  38. Shahsavari, S., Holur, P., Wang, T., Tangherlini, T. R., & Roychowdhury, V. (2020). Conspiracy in the time of corona: Automatic detection of emerging COVID-19 conspiracy theories in social media and the news. Journal of Computational Social Science, 3(2), 279–317.

  39. Souma, W., Vodenska, I., & Aoyama, H. (2019). Enhanced news sentiment analysis using deep learning methods. Journal of Computational Social Science, 2(1), 33–46.

  40. van der Vegt, I., Mozes, M., Gill, P., & Kleinberg, B. (2021). Online influence, offline violence: language use on YouTube surrounding the ‘Unite the Right’ rally. Journal of Computational Social Science, 4(1), 333–354.

  41. Uyheng, J., & Carley, K. M. (2020). Bots and online hate during the COVID-19 pandemic: Case studies in the United States and the Philippines. Journal of Computational Social Science, 3(2), 445–468.

  42. Wang, A. H.-E., Lee, M.-C., Wu, M.-H., & Shen, P. (2020). Influencing overseas Chinese by tweets: text-images as the key tactic of Chinese propaganda. Journal of Computational Social Science, 3(2), 469–486.

  43. Wendt, C., Kohl, J., Mischke, M., & Pfeifer, M. (2010). How do Europeans perceive their healthcare system? Patterns of satisfaction and preference for state involvement in the field of healthcare. European Sociological Review, 26(2), 177–192.

Acknowledgements

The author would like to thank Seppe vanden Broucke for his valuable suggestions in the earlier stages of the paper.

Funding

Not applicable.

Author information

Corresponding author

Correspondence to Anna Ruelens.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Ruelens, A. Analyzing user-generated content using natural language processing: a case study of public satisfaction with healthcare systems. J Comput Soc Sc 5, 731–749 (2022). https://doi.org/10.1007/s42001-021-00148-2

  • DOI: https://doi.org/10.1007/s42001-021-00148-2

Keywords