
Title: Information Extraction from Cancer Pathology Reports with Graph Convolution Networks for Natural Language Texts

Conference

Graph-of-words is a flexible and efficient text representation that addresses well-known challenges in natural language processing, such as word ordering and variation of expressions. In this paper, we consider the latest graph-based convolutional neural network technique, the Text Graph Convolutional Network (Text GCN), for classification tasks on free-form natural language texts. To do this, we designed a study of multi-task information extraction from medical text documents. We implemented multi-task learning in the Text GCN, performed hyperparameter optimization, and measured performance on the clinical tasks. We evaluated micro- and macro-F1 scores on four information extraction tasks from cancer pathology reports: subsite, laterality, behavior, and histological grade. The Text GCN scores were significantly higher than those from our previous studies with convolutional neural networks, suggesting that the Text GCN model is superior to traditional models in task performance.
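
The abstract reports both micro- and macro-F1 without showing how the two differ. The following is a minimal sketch of the standard definitions for single-label classification, not the authors' evaluation code. The function name `f1_scores` and the laterality labels in the example are hypothetical and chosen only for illustration.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (micro_f1, macro_f1) for single-label multiclass predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Micro: pool the counts over all classes, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per class, then take the unweighted mean.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return micro, macro


# Hypothetical, imbalanced example: the model always predicts "left".
y_true = ["left", "left", "left", "left", "right", "bilateral"]
y_pred = ["left", "left", "left", "left", "left", "left"]
micro, macro = f1_scores(y_true, y_pred)
print(f"micro-F1 = {micro:.3f}, macro-F1 = {macro:.3f}")
```

In this toy case micro-F1 is about 0.667 while macro-F1 is about 0.267, because macro-F1 weights the two missed minority classes equally with the majority class. That sensitivity to rare classes is a common reason for reporting both scores on imbalanced label sets.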

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-00OR22725
OSTI ID:
1606856
Resource Relation:
Conference: 2019 IEEE International Conference on Big Data, Los Angeles, California, United States of America, 12/9/2019 to 12/12/2019
Country of Publication:
United States
Language:
English