Abstract
The Web provides a wealth of content about diseases, symptoms and treatments. Most notably, users turn to health forums to seek advice from doctors and from peers with similar cases. However, the benefit of forums lies mostly in community QA and browsing; search engines support expressive querying for patient-centric needs only poorly. This paper addresses this issue by enriching user queries with judiciously chosen entities and classes from a large knowledge graph. Candidate entities are extracted from the full text of user posts. To counter the topical drift that would arise from picking all entities, we devise ECO, a novel method that computes a focused entity core for query expansion. Experiments with content from health forums and clinical trials demonstrate that ECO achieves substantial gains over state-of-the-art baselines.
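The general idea of restricting expansion to a focused subset of candidate entities can be illustrated with a minimal sketch. This is not the ECO algorithm, which the abstract does not specify; it swaps in a simple greedy peeling heuristic that repeatedly drops the candidate least related to the others. The function names, the relatedness scores and the example entities are all hypothetical.

```python
from typing import Dict, FrozenSet, Set


def entity_core(candidates: Set[str],
                relatedness: Dict[FrozenSet[str], float],
                size: int) -> Set[str]:
    """Greedy peeling (illustrative only, not ECO): drop the entity with the
    lowest total relatedness to the remaining ones until `size` are left."""
    core = set(candidates)
    while len(core) > size:
        def weight(entity: str) -> float:
            # Sum of relatedness scores to all other entities still in the core.
            return sum(relatedness.get(frozenset((entity, other)), 0.0)
                       for other in core if other != entity)
        # Iterate in sorted order so ties are broken alphabetically.
        weakest = min(sorted(core), key=weight)
        core.remove(weakest)
    return core


def expand_query(query: str, core: Set[str]) -> str:
    """Append the core entities to the original query text."""
    return " ".join([query] + sorted(core))


# Hypothetical candidates extracted from a user post; "mortgage" is off-topic.
candidates = {"migraine", "sumatriptan", "nausea", "mortgage"}
# Hypothetical pairwise relatedness scores (unlisted pairs count as 0).
relatedness = {
    frozenset(("migraine", "sumatriptan")): 0.9,
    frozenset(("migraine", "nausea")): 0.7,
    frozenset(("sumatriptan", "nausea")): 0.4,
    frozenset(("nausea", "mortgage")): 0.05,
}

core = entity_core(candidates, relatedness, size=3)
expanded = expand_query("headache treatment", core)
print(expanded)
```

In this toy input the weakly connected entity is peeled off first, so the expanded query keeps only the mutually related entities. A real system would derive relatedness from a knowledge graph rather than a hand-written table.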
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Terolli, E., Ernst, P., Weikum, G. (2020). Focused Query Expansion with Entity Cores for Patient-Centric Health Search. In: Pan, J.Z., et al. The Semantic Web – ISWC 2020. ISWC 2020. Lecture Notes in Computer Science, vol 12506. Springer, Cham. https://doi.org/10.1007/978-3-030-62419-4_31
DOI: https://doi.org/10.1007/978-3-030-62419-4_31
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-62418-7
Online ISBN: 978-3-030-62419-4
eBook Packages: Computer Science, Computer Science (R0)