
Mining Twitter for Measuring Social Perception Towards Diabetes and Obesity in Central America

  • Conference paper

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1124))

Abstract

For a long time, diabetes and obesity were considered a menace only in developed countries. Nevertheless, the proliferation of unhealthy habits, such as reliance on fast-food chains and sedentary lifestyles, has caused diabetes and obesity to spread worldwide, bringing many costly complications. Since citizens use the Internet to search, learn, and share their daily personal experiences, social networks have become popular data sources that facilitate a deeper understanding of public health concerns. However, exploiting this data requires labelled resources and examples and, to the best of our knowledge, these resources do not exist in Spanish. Consequently, (1) we compile a balanced multi-class corpus of tweets regarding diabetes and obesity written in Spanish in Central America; and (2) we use this corpus to train and test a machine-learning classifier capable of determining whether texts related to diabetes or obesity are positive, negative, or neutral. The experimental results show that the best result, an accuracy of 84.30%, was obtained with the Bag of Words model and the LIBLinear library. As a final contribution, the compiled corpus is released.
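To make the pipeline named in the abstract concrete, the following is a minimal sketch of three-class sentiment classification over Bag of Words features. It is an illustration under stated assumptions, not the authors' implementation: a simple multi-class perceptron stands in for the LIBLinear classifier, the six training sentences are invented for this example and are not taken from the released corpus, and all function names are hypothetical.

```python
from collections import Counter


def bag_of_words(text):
    """Map a text to unigram counts (lowercased, whitespace-tokenized)."""
    return Counter(text.lower().split())


def score(weight_vector, feats):
    """Dot product between a sparse weight vector and sparse features."""
    return sum(weight_vector[tok] * cnt for tok, cnt in feats.items())


def predict(weights, text):
    """Return the label whose weight vector scores the text highest."""
    feats = bag_of_words(text)
    return max(sorted(weights), key=lambda label: score(weights[label], feats))


def train_perceptron(samples, labels, epochs=10):
    """Multi-class perceptron; an assumed stand-in for LIBLinear."""
    weights = {label: Counter() for label in sorted(set(labels))}
    for _ in range(epochs):
        for text, gold in zip(samples, labels):
            guess = predict(weights, text)
            if guess != gold:
                # Move weights toward the gold label, away from the wrong guess.
                for tok, cnt in bag_of_words(text).items():
                    weights[gold][tok] += cnt
                    weights[guess][tok] -= cnt
    return weights


def accuracy(gold, predicted):
    """Fraction of predictions that match the gold labels."""
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


# Invented toy data (NOT from the released corpus), two texts per class.
texts = [
    "me siento muy bien controlando mi diabetes",
    "odio la diabetes es horrible",
    "la diabetes es una enfermedad cronica",
    "feliz por bajar de peso y vencer la obesidad",
    "la obesidad me hace sentir horrible",
    "la obesidad es el exceso de grasa corporal",
]
gold = ["positive", "negative", "neutral", "positive", "negative", "neutral"]

model = train_perceptron(texts, gold)
train_predictions = [predict(model, t) for t in texts]
print("training accuracy:", accuracy(gold, train_predictions))
print("unseen text:", predict(model, "odio la obesidad"))
```

Accuracy on the training sentences says nothing about generalization; a figure such as the 84.30% reported above would have to be measured on held-out data. In practice the perceptron would typically be replaced by a LIBLinear-backed linear classifier.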


Notes

  1. https://www.who.int/news-room/fact-sheets/detail/diabetes

  2. https://developer.twitter.com/en/docs/tweets/search/overview/standard

  3. https://semantics.inf.um.es/joseagd/diabetes-and-obesity-positive-neutral-negative.rar

  4. https://developer.twitter.com/en/developer-terms/more-on-restricted-use-cases


Acknowledgements

This work has been supported by the Spanish National Research Agency (AEI) and the European Regional Development Fund (FEDER/ERDF) through project KBS4FIA (TIN2016-76323-R).

Author information

Corresponding author

Correspondence to José Antonio García-Díaz.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Medina-Moreira, J., García-Díaz, J.A., Apolinardo-Arzube, O., Luna-Aveiga, H., Valencia-García, R. (2019). Mining Twitter for Measuring Social Perception Towards Diabetes and Obesity in Central America. In: Valencia-García, R., Alcaraz-Mármol, G., Del Cioppo-Morstadt, J., Vera-Lucio, N., Bucaram-Leverone, M. (eds) Technologies and Innovation. CITI 2019. Communications in Computer and Information Science, vol 1124. Springer, Cham. https://doi.org/10.1007/978-3-030-34989-9_7


  • DOI: https://doi.org/10.1007/978-3-030-34989-9_7


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-34988-2

  • Online ISBN: 978-3-030-34989-9

  • eBook Packages: Computer Science, Computer Science (R0)
