Abstract:
Diseases and chemicals play central roles in many areas of biomedical research and healthcare. Consequently, aggregating disease knowledge and treatment research reports has become a critical task, especially for rapidly growing knowledge bases (e.g., PubMed). A framework for disease/chemical named entity recognition and normalization has therefore become increasingly important for biomedical text mining. In this work, we define five types of diversity in disease names and develop a system for disease/chemical mention recognition and normalization in biomedical texts. Our system uses a second-order conditional random fields (CRFs) model for recognition and refines the results with several customized post-processing steps, including abbreviation resolution, consistency improvement, stopword filtering, and adjective reorganization. In our evaluation, the system obtained the best performance, with an F-score of 86.9% on disease normalization and a precision of 89.95% on chemical normalization. These results suggest that our system is a high-performing, state-of-the-art system for disease/chemical recognition and normalization in biomedical literature.
Date of Conference: 13-16 November 2017
Date Added to IEEE Xplore: 18 December 2017