Abstract
With the rapid increase in both the number and the quality of online medical forums, patients are increasingly turning to the Internet for health information and support. Online health forums play an important role in addressing consumers' health information needs. However, given the large number of queries and the limited number of experts, a significant fraction of questions remains unanswered. Automatic question classifiers can mitigate this issue by directing questions to specific experts according to their topic preferences, yielding quicker and better responses.
In this paper, we aim to classify health forum questions, where the question classes mainly capture user intentions. We strongly believe that a good estimate of user intentions will help direct questions to the best responders. We propose a novel approach that combines medical-domain features with deep learning models for the question classification task. To further improve the performance of the data-hungry deep learning models, we resort to weak supervision strategies. We propose a new variant of the existing self-training method, called “Self-Training with Lookups”, for weak supervision. Our results demonstrate that combining features generated from biomedical entities with other language representation features in deep learning networks can lead to substantial improvement in modeling user-generated health content. Weak supervision further enhances accuracy. The proposed model outperforms the state-of-the-art method on a benchmark dataset of 11,000 questions by a margin of 3.13%.
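The abstract does not spell out how “Self-Training with Lookups” works, so the sketch below illustrates only the generic self-training loop that the variant builds on: fit a model on the labeled data, pseudo-label the unlabeled examples the model is confident about, add them to the training set, and repeat. Everything in it is an assumption made for illustration: the nearest-centroid classifier stands in for the paper's deep learning models, the 2-D points stand in for question features, and the class names, the `threshold` parameter, and all function names are hypothetical.

```python
import math
from collections import defaultdict


def fit_centroids(points, labels):
    """Return the mean 2-D vector of each class (a stand-in for model training)."""
    sums = defaultdict(lambda: [0.0, 0.0])
    counts = defaultdict(int)
    for (x, y), label in zip(points, labels):
        sums[label][0] += x
        sums[label][1] += y
        counts[label] += 1
    return {c: (sums[c][0] / counts[c], sums[c][1] / counts[c]) for c in sums}


def predict_with_confidence(centroids, point):
    """Return (label, confidence) from a softmax over negative distances."""
    weights = {c: math.exp(-math.dist(point, mu)) for c, mu in centroids.items()}
    total = sum(weights.values())
    label = max(weights, key=weights.get)
    return label, weights[label] / total


def self_train(labeled, unlabeled, threshold=0.8, max_rounds=10):
    """Generic self-training: grow the training set with confident pseudo-labels."""
    points = [p for p, _ in labeled]
    labels = [l for _, l in labeled]
    pool = list(unlabeled)
    for _ in range(max_rounds):
        centroids = fit_centroids(points, labels)
        remaining, added = [], 0
        for p in pool:
            label, confidence = predict_with_confidence(centroids, p)
            if confidence >= threshold:
                # Confident prediction: treat it as a (weak) training label.
                points.append(p)
                labels.append(label)
                added += 1
            else:
                remaining.append(p)
        pool = remaining
        if added == 0:  # No new pseudo-labels, so further rounds change nothing.
            break
    return fit_centroids(points, labels), pool


# Toy data: two hypothetical intent classes and three unlabeled points.
labeled = [((0.0, 0.0), "symptom"), ((10.0, 10.0), "treatment")]
unlabeled = [(1.0, 1.0), (9.0, 9.0), (5.0, 5.0)]
centroids, leftover = self_train(labeled, unlabeled)
print(centroids)
print(leftover)
```

In this toy run the two points near a labeled example are absorbed into the training set, while the point equidistant from both classes never reaches the confidence threshold and stays unlabeled. The threshold trades off how much extra data is gained against how much label noise is admitted.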
M. Gupta is a Principal Applied Scientist at Microsoft.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Jalan, R., Gupta, M., Varma, V. (2018). Medical Forum Question Classification Using Deep Learning. In: Pasi, G., Piwowarski, B., Azzopardi, L., Hanbury, A. (eds) Advances in Information Retrieval. ECIR 2018. Lecture Notes in Computer Science, vol 10772. Springer, Cham. https://doi.org/10.1007/978-3-319-76941-7_4
DOI: https://doi.org/10.1007/978-3-319-76941-7_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-76940-0
Online ISBN: 978-3-319-76941-7
eBook Packages: Computer Science, Computer Science (R0)