Opinion Mining for Measuring the Social Perception of Infectious Diseases. An Infodemiology Approach

  • Conference paper
Technologies and Innovation (CITI 2018)

Abstract

Before the digital era, society's perception of the health system was gauged through face-to-face questionnaires and interviews. With this knowledge, governments and public organizations designed action plans to improve quality of life. Today, thanks to the spread of computer networks, it is possible to reach more people at lower cost and to analyze the collected data automatically. Infodemiology is the research discipline devoted to the study of health information on the Internet. In this work, we explore the reliability of Opinion Mining for measuring people's subjective perception of infectious diseases during periods of high risk of contagion. Specifically, linguistic features, among other relevant data, were extracted from tweets written in Spanish in Ecuador at the end of 2017. The resulting model captures the linguistic features most relevant for distinguishing positive from negative texts about infectious diseases. In addition, the corpus used in this analysis has been published so that other researchers can use it in future experiments in this area. Support Vector Machines performed best, with a precision of 86.5%.
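As a rough illustration of the two ideas the abstract relies on, the sketch below turns a tweet into simple linguistic features and computes precision for one class. It is not the authors' pipeline: the tiny `LEXICON`, the function names, and the sample labels are all hypothetical, and the abstract does not say which features were used or how the reported precision was averaged across classes.

```python
from collections import Counter

# Hypothetical toy lexicon (assumption): the paper's real linguistic
# categories are not listed in the abstract.
LEXICON = {
    "positive": {"cura", "mejora", "prevenir", "sano"},
    "negative": {"brote", "muerte", "miedo", "contagio"},
}


def linguistic_features(tweet):
    """Return, per lexicon category, the share of tokens in that category."""
    # Naive whitespace tokenization with punctuation stripped from the edges.
    tokens = [t.strip(".,;:!?¡¿\"'()").lower() for t in tweet.split()]
    tokens = [t for t in tokens if t]
    counts = Counter()
    for tok in tokens:
        for category, words in LEXICON.items():
            if tok in words:
                counts[category] += 1
    total = len(tokens) or 1  # avoid division by zero on empty input
    return {category: counts[category] / total for category in LEXICON}


def precision(y_true, y_pred, positive="positive"):
    """Precision for one class: TP / (TP + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    return tp / (tp + fp) if (tp + fp) else 0.0


if __name__ == "__main__":
    print(linguistic_features("Nuevo brote de dengue, mucho miedo"))
    gold = ["positive", "negative", "positive", "negative"]
    pred = ["positive", "positive", "positive", "negative"]
    print(precision(gold, pred))
```

In a full experiment, feature vectors like these would be passed to a classifier such as a Support Vector Machine, and precision would be measured on held-out tweets rather than on a hand-written list.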

Acknowledgements

This work has been funded by the Universidad de Guayaquil (Ecuador) through the project entitled “Tecnologías inteligentes para la autogestión de la salud”.

Author information

Corresponding author

Correspondence to José Antonio García-Díaz.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

García-Díaz, J.A. et al. (2018). Opinion Mining for Measuring the Social Perception of Infectious Diseases. An Infodemiology Approach. In: Valencia-García, R., Alcaraz-Mármol, G., Del Cioppo-Morstadt, J., Vera-Lucio, N., Bucaram-Leverone, M. (eds) Technologies and Innovation. CITI 2018. Communications in Computer and Information Science, vol 883. Springer, Cham. https://doi.org/10.1007/978-3-030-00940-3_17

  • DOI: https://doi.org/10.1007/978-3-030-00940-3_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-00939-7

  • Online ISBN: 978-3-030-00940-3

  • eBook Packages: Computer Science (R0)