Abstract
Before the digital era, society's perception of the health system was gauged through face-to-face questionnaires and interviews. With this knowledge, governments and public organizations have designed effective action plans to improve quality of life. Nowadays, thanks to the spread of computer networks, it is possible to reach more people at lower cost and to analyze the collected data automatically. Infodemiology is the research discipline that studies health information on the Internet. In this work, we explore the reliability of Opinion Mining as a way to measure people's subjective perception of infectious diseases during periods of high risk of contagion. Specifically, linguistic features, among other relevant data, were extracted from tweets written in Spanish in Ecuador at the end of 2017. The resulting model captures the linguistic features most relevant for distinguishing positive from negative texts about infectious diseases. In addition, the corpus used in this analysis has been published so that other researchers can use it in future experiments in this area. The results show that Support Vector Machines performed best, with a precision of 86.5%.
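As a reading aid for the reported figure, the sketch below shows how precision is commonly computed for one class of a binary sentiment classifier: the share of texts predicted as that class that truly belong to it. It assumes the standard definition TP / (TP + FP); the abstract does not say whether the 86.5% is a per-class or an averaged value. The function name and the six labels are hypothetical and are not taken from the published corpus.

```python
from typing import Sequence


def precision(y_true: Sequence[str], y_pred: Sequence[str], target: str) -> float:
    """Fraction of items predicted as `target` whose true label is `target`."""
    # True positives: predicted as target and actually target.
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == target and t == target)
    # False positives: predicted as target but actually another class.
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == target and t != target)
    if tp + fp == 0:
        return 0.0  # convention chosen here for "nothing predicted as target"
    return tp / (tp + fp)


# Hypothetical labels for six tweets (illustrative only, not from the corpus).
y_true = ["pos", "neg", "pos", "neg", "pos", "neg"]
y_pred = ["pos", "pos", "pos", "neg", "neg", "neg"]

pos_precision = precision(y_true, y_pred, "pos")
neg_precision = precision(y_true, y_pred, "neg")
```

In this toy case three tweets are predicted positive and two of them are truly positive, so the precision for the positive class is 2/3. The same holds for the negative class.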
Acknowledgements
This work has been funded by the Universidad de Guayaquil (Ecuador) through the project entitled “Tecnologías inteligentes para la autogestión de la salud”.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
García-Díaz, J.A. et al. (2018). Opinion Mining for Measuring the Social Perception of Infectious Diseases. An Infodemiology Approach. In: Valencia-García, R., Alcaraz-Mármol, G., Del Cioppo-Morstadt, J., Vera-Lucio, N., Bucaram-Leverone, M. (eds) Technologies and Innovation. CITI 2018. Communications in Computer and Information Science, vol 883. Springer, Cham. https://doi.org/10.1007/978-3-030-00940-3_17
DOI: https://doi.org/10.1007/978-3-030-00940-3_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00939-7
Online ISBN: 978-3-030-00940-3
eBook Packages: Computer Science (R0)