Abstract
People’s health has always attracted the attention of public and private institutions, of patients themselves and, consequently, of researchers. Social networks provide an immense amount of data for analyzing health-related issues; however, researchers do not always have enough data to build sophisticated models. In this paper, we create this limitation artificially to test the performance and stability of several popular algorithms on small samples of texts. Apart from sample size, the study has two distinctive features: (a) instead of the usual 5-star classification, we use combined classes that reflect a more practical view of medicines and treatments; (b) we consider both original and noisy data. The experiments were carried out on data extracted from the popular forum AskaPatient. Parameters were tuned with the GridSearchCV technique. The results show that, on small and noisy data samples, GMDH Shell is superior to the other methods. The work has a practical orientation.
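As a rough illustration of the setup the abstract describes, the following Python sketch collapses 5-star ratings into combined classes and tunes a text classifier with scikit-learn's GridSearchCV on a very small sample. Everything specific in it is an assumption made for illustration: the toy reviews are invented and are not AskaPatient data, the grouping of stars into negative/neutral/positive is hypothetical (the abstract does not state which classes were combined), and the TF-IDF plus logistic regression pipeline and its parameter grid are stand-ins rather than the models compared in the paper. Noise injection and GMDH Shell are not covered.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline

# Invented toy reviews with 1-5 star ratings (NOT AskaPatient data).
reviews = [
    ("terrible side effects, I stopped taking it", 1),
    ("made my headache worse and caused nausea", 1),
    ("did not help and I felt dizzy all day", 2),
    ("no improvement, only stomach pain", 2),
    ("some relief but also some drowsiness", 3),
    ("works a little, side effects are tolerable", 3),
    ("average result, not sure it is worth it", 3),
    ("mixed feelings, pain reduced but sleep got worse", 3),
    ("good relief with minor dry mouth", 4),
    ("helped my symptoms, would take it again", 4),
    ("excellent medicine, pain is gone", 5),
    ("works great with no side effects at all", 5),
]


def combine_rating(stars):
    """Map a 1-5 star rating to a coarser class (assumed grouping)."""
    if stars <= 2:
        return "negative"
    if stars == 3:
        return "neutral"
    return "positive"


texts = [text for text, _ in reviews]
labels = [combine_rating(stars) for _, stars in reviews]

# Stand-in model: TF-IDF features followed by logistic regression.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Hypothetical grid; step-name prefixes route each value to its pipeline step.
param_grid = {
    "tfidf__ngram_range": [(1, 1), (1, 2)],
    "clf__C": [0.1, 1.0, 10.0],
}

# Two stratified folds, because each class has only four examples.
cv = StratifiedKFold(n_splits=2, shuffle=True, random_state=0)
search = GridSearchCV(pipeline, param_grid, cv=cv, scoring="accuracy")
search.fit(texts, labels)

print("best parameters:", search.best_params_)
print("mean cross-validated accuracy:", search.best_score_)
```

With so few examples per fold, the cross-validated score varies strongly with the split, which is the kind of instability on small samples that the study sets out to measure.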
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Akhtyamova, L., Alexandrov, M., Cardiff, J., Koshulko, O. (2019). Opinion Mining on Small and Noisy Samples of Health-Related Texts. In: Shakhovska, N., Medykovskyy, M. (eds) Advances in Intelligent Systems and Computing III. CSIT 2018. Advances in Intelligent Systems and Computing, vol 871. Springer, Cham. https://doi.org/10.1007/978-3-030-01069-0_27
DOI: https://doi.org/10.1007/978-3-030-01069-0_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01068-3
Online ISBN: 978-3-030-01069-0
eBook Packages: Intelligent Technologies and Robotics (R0)