Abstract
Information about diseases and their symptoms is a frequent need of Web users. Diseases are often categorized into sub-types that manifest through different symptoms. Extracting such information from textual corpora is inherently difficult, yet it can be extracted easily from semi-structured resources such as tables. We propose an approach for identifying tables that contain information about sub-type classifications and their attributes. Because tables often have diverse and redundant schemas, we align equivalent columns across disparate schemas so that information about diseases is accessible through a unified, common schema. Experimental evaluation shows that we can accurately identify tables containing disease sub-type classifications and additionally align equivalent columns.
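To make the idea of aligning equivalent columns concrete, the following is a minimal sketch, not the method of the paper. It assumes that two columns are equivalent when their normalized cell values overlap strongly (Jaccard similarity). The function names `jaccard` and `align_columns`, the threshold of 0.5, and the toy tables are all hypothetical and chosen only for illustration.

```python
def jaccard(a, b):
    """Jaccard overlap between two collections of cell values."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def align_columns(table_a, table_b, threshold=0.5):
    """Pair each column of table_a with its best-matching column of table_b.

    Tables are dicts mapping a column header to a list of cell values.
    A pair is kept only if its value overlap reaches the threshold.
    This overlap heuristic is an assumption made for this sketch.
    """
    alignment = {}
    for header_a, values_a in table_a.items():
        best_header, best_score = None, 0.0
        for header_b, values_b in table_b.items():
            score = jaccard(
                (v.strip().lower() for v in values_a),
                (v.strip().lower() for v in values_b),
            )
            if score > best_score:
                best_header, best_score = header_b, score
        if best_header is not None and best_score >= threshold:
            alignment[header_a] = best_header
    return alignment


# Two toy tables (made-up data) that use different headers for the same column.
table_a = {
    "Type": ["Type 1", "Type 2", "Gestational"],
    "Symptoms": ["thirst", "fatigue", "none"],
}
table_b = {
    "Sub-type": ["type 1", "type 2", "gestational"],
    "Onset": ["childhood", "adulthood", "pregnancy"],
}

alignment = align_columns(table_a, table_b)
print(alignment)
```

In this sketch, columns with no sufficiently similar counterpart are left unaligned rather than forced into a match, so a unified schema built from the result only merges columns for which there is evidence of equivalence.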
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Koutraki, M., Fetahu, B. (2020). MedTable: Extracting Disease Types from Web Tables. In: Harth, A., et al. The Semantic Web: ESWC 2020 Satellite Events. ESWC 2020. Lecture Notes in Computer Science, vol 12124. Springer, Cham. https://doi.org/10.1007/978-3-030-62327-2_26
DOI: https://doi.org/10.1007/978-3-030-62327-2_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-62326-5
Online ISBN: 978-3-030-62327-2
eBook Packages: Computer Science, Computer Science (R0)