Abstract
Word embeddings have been used in many NLP tasks and have shown some capability to capture semantic features. They have also been used in several recent IR studies. However, word embeddings trained in an unsupervised manner may fail to capture some of the semantic relations in a specific domain (e.g., healthcare). In this paper, we leverage existing knowledge (word relations) in the medical domain to constrain word embeddings, using the principle that related words should have similar embeddings. The resulting constrained word embeddings are used to rerank documents and show superior effectiveness to unsupervised word embeddings.
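As a rough illustration of the principle that related words should have similar embeddings, the following Python sketch pulls the vectors of related word pairs toward each other. It is not the training objective used in this paper, which the abstract does not specify. The toy vectors, the word pairs, the `constrain` helper, and its `strength` and `steps` parameters are all assumptions made for illustration.

```python
# Toy illustration only: the vectors, the word pairs, and the update rule are
# assumptions for this sketch, not the method described in the paper.

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sum(a * a for a in u) ** 0.5
    norm_v = sum(b * b for b in v) ** 0.5
    return dot / (norm_u * norm_v)


def constrain(embeddings, related_pairs, strength=0.1, steps=20):
    """Return a copy of `embeddings` in which related words are closer.

    Each step moves both vectors of a related pair toward each other by
    `strength` times their difference, which is a gradient step on the
    penalty 0.5 * ||u - v||^2 for that pair.
    """
    # Copy the vectors so the caller's embeddings are left untouched.
    result = {word: list(vec) for word, vec in embeddings.items()}
    for _ in range(steps):
        for a, b in related_pairs:
            u, v = result[a], result[b]
            diff = [x - y for x, y in zip(u, v)]
            result[a] = [x - strength * d for x, d in zip(u, diff)]
            result[b] = [y + strength * d for y, d in zip(v, diff)]
    return result


# Hypothetical unsupervised embeddings in which two related medical terms
# happen to be orthogonal.
original = {
    "heart": [1.0, 0.0],
    "cardiac": [0.0, 1.0],
    "table": [-1.0, 0.0],
}
# Hypothetical prior knowledge: one pair of related words.
related = [("heart", "cardiac")]

constrained = constrain(original, related)

before = cosine(original["heart"], original["cardiac"])
after = cosine(constrained["heart"], constrained["cardiac"])
print(f"similarity before: {before:.3f}, after: {after:.3f}")
```

In this sketch, words that appear in no related pair (here `table`) keep their original vectors. A practical constraint would typically also include a term that keeps each vector close to its unsupervised value, so that the prior knowledge adjusts the embeddings without overriding them.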
Acknowledgement
This work is partly supported by an NSERC Discovery research grant.
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Liu, X., Nie, JY., Sordoni, A. (2016). Constraining Word Embeddings by Prior Knowledge – Application to Medical Information Retrieval. In: Ma, S., et al. Information Retrieval Technology. AIRS 2016. Lecture Notes in Computer Science, vol 9994. Springer, Cham. https://doi.org/10.1007/978-3-319-48051-0_12
DOI: https://doi.org/10.1007/978-3-319-48051-0_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-48050-3
Online ISBN: 978-3-319-48051-0
eBook Packages: Computer Science, Computer Science (R0)