demonstration

DIDO: a disease-determinants ontology from web sources

Authors:
Victoria Nebot Romero

Universitat Jaume I, Castellón, Spain

Min Ye

Max-Planck Institute Computer Science, Saarbruecken, Germany

Mario Albrecht

Max-Planck Institute Computer Science, Saarbruecken, Germany

Jae-Hong Eom

Max-Planck Institute Computer Science, Saarbruecken, Germany

Gerhard Weikum

Max-Planck Institute Computer Science, Saarbruecken, Germany


WWW '11: Proceedings of the 20th international conference companion on World wide web, March 2011, Pages 237–240. https://doi.org/10.1145/1963192.1963298

Published: 28 March 2011

ABSTRACT

This paper introduces DIDO, a system providing convenient access to knowledge about factors involved in human diseases, automatically extracted from textual Web sources. The knowledge base is bootstrapped by integrating entities from hand-crafted sources like MeSH and OMIM. As these are short on relationships between different types of biomedical entities, DIDO employs flexible and robust pattern learning and constraint-based reasoning methods to automatically extract new relational facts from textual sources. These facts can then be iteratively added to the knowledge base. The result is a semantic graph of typed entities and relations between diseases, their symptoms, and their factors, with emphasis on environmental factors but also covering molecular determinants. We demonstrate the value of DIDO for knowledge discovery about causal factors and properties of complex diseases, including factor-disease chains.
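
To make the idea of a typed semantic graph and factor-disease chains concrete, here is a minimal sketch in Python. It is not DIDO's implementation: the class `KnowledgeGraph`, the relation names `contributesTo` and `hasSymptom`, the type signatures, and the placeholder entities `F1`, `D1`, `D2`, `S1` are all assumptions made for illustration. A simple domain/range check stands in for the constraint-based reasoning mentioned in the abstract, and a breadth-first search enumerates chains that start at a factor and end at a disease.

```python
from collections import defaultdict, deque
from typing import NamedTuple


class Fact(NamedTuple):
    subject: str
    relation: str
    obj: str


# Hypothetical relation signatures: relation -> (allowed subject types, allowed object types).
SIGNATURES = {
    "contributesTo": ({"factor", "disease"}, {"disease"}),
    "hasSymptom": ({"disease"}, {"symptom"}),
}


class KnowledgeGraph:
    """Toy typed graph; a stand-in for illustration, not DIDO's data model."""

    def __init__(self):
        self.types = {}                      # entity name -> entity type
        self.out_edges = defaultdict(list)   # subject -> [(relation, object)]

    def add_entity(self, name, entity_type):
        self.types[name] = entity_type

    def add_fact(self, fact):
        """Accept a candidate fact only if it satisfies the type signature."""
        domain, range_ = SIGNATURES[fact.relation]
        if self.types.get(fact.subject) not in domain:
            return False
        if self.types.get(fact.obj) not in range_:
            return False
        self.out_edges[fact.subject].append((fact.relation, fact.obj))
        return True

    def chains(self, start, max_len=3):
        """Breadth-first enumeration of acyclic paths from start that end at a disease."""
        results = []
        queue = deque([[start]])
        while queue:
            path = queue.popleft()
            if len(path) - 1 >= max_len:
                continue
            for _, nxt in self.out_edges.get(path[-1], []):
                if nxt in path:              # skip cycles
                    continue
                new_path = path + [nxt]
                if self.types[nxt] == "disease":
                    results.append(new_path)
                queue.append(new_path)
        return results


kg = KnowledgeGraph()
for name, entity_type in [("F1", "factor"), ("D1", "disease"),
                          ("D2", "disease"), ("S1", "symptom")]:
    kg.add_entity(name, entity_type)

kg.add_fact(Fact("F1", "contributesTo", "D1"))
kg.add_fact(Fact("D1", "contributesTo", "D2"))
kg.add_fact(Fact("D1", "hasSymptom", "S1"))
# A symptom is not an allowed subject of contributesTo, so this candidate is rejected.
rejected = kg.add_fact(Fact("S1", "contributesTo", "D1"))

print(kg.chains("F1"))
```

In this toy setting, candidate facts produced by an extractor would be passed through `add_fact` one at a time, so that each accepted fact extends the graph used to discover longer chains.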

Index Terms

Information systems

Published in
WWW '11: Proceedings of the 20th international conference companion on World wide web
March 2011
552 pages
ISBN:9781450306379
DOI:10.1145/1963192
General Chairs: S. Sadagopan (IIIT-Bangalore, India), Krithi Ramamritham (IIT-Bombay, India), Arun Kumar (IBM Research, India), M. P. Ravindra (Infosys E & R, India)
Program Chairs: Elisa Bertino (Purdue University, USA), Ravi Kumar (Yahoo! Research, USA)
Copyright © 2011 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
biomedical knowledge base
disease factors
ontology
relation extraction
Qualifiers
- demonstration
Acceptance Rates
Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%