Inverse Regression for Extraction of Tumor Site from Cancer Pathology Reports
- ORNL
Pathology reports are the primary source of information for cancer diagnosis for millions of cancer patients across the United States. Cancer registries label these reports every year. The coded labels capture pertinent information such as cancer location, behavior, and histology. This information, when combined with clinical information, medical imaging, and even genomic information, has great potential to fuel discoveries in cancer research. The coding process is manual and requires many human experts to label the large volume of pathology reports in a timely manner. In this study, we developed a supervised, inverse-regression-based auto-labeler to automate the task. The experiments were conducted on a set of 942 pathology reports, with human expert labels as the ground truth. We observed that the inverse-regression-based auto-labeler consistently performed better than, or comparably to, the best-performing state-of-the-art method. These results demonstrate the potential of inverse regression for reliable information extraction from pathology reports.
- Research Organization:
- Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
- Sponsoring Organization:
- USDOE
- DOE Contract Number:
- AC05-00OR22725
- OSTI ID:
- 1607215
- Resource Relation:
- Conference: IEEE EMBS International Conference on Biomedical & Health Informatics (IEEE-EMBS BHI 2019) - Chicago, Illinois, United States of America - 5/19/2019 4:00:00 PM-5/22/2019 4:00:00 PM
- Country of Publication:
- United States
- Language:
- English