poster

Evaluation of Five Sentence Similarity Models on Electronic Medical Records

Authors:
Qingyu Chen, National Institutes of Health, Bethesda, MD, USA
Jingcheng Du, National Institutes of Health & University of Texas Health, Bethesda, MD, USA
Sun Kim, National Institutes of Health, Bethesda, MD, USA
W. John Wilbur, National Institutes of Health, Bethesda, MD, USA
Zhiyong Lu, National Institutes of Health, Bethesda, MD, USA

BCB '19: Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics
September 2019
Pages 533
https://doi.org/10.1145/3307339.3343239

Published: 04 September 2019

ABSTRACT

Capturing the semantic similarity between sentences plays a vital role in several primary applications in the biomedical and clinical domains: biomedical sentence search, evidence attribution, question answering, and text summarization. In this pilot study, we evaluated the effectiveness of five representative sentence similarity models in the clinical domain, ranging from traditional machine learning methods to the latest bidirectional transformers. The evaluation was performed on a dataset of over 1K sentence pairs from electronic medical records (EMRs), by far the largest public dataset in this domain. The results show that embeddings trained on large biomedical corpora are the most effective method. They also demonstrate that CNN and BERT capture sentence similarity effectively even with relatively small datasets.
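
To make the evaluation setup concrete, the following is a minimal sketch of how a sentence similarity model might be scored against human-annotated sentence pairs. It is not the authors' method. It assumes a simple bag-of-words cosine similarity as a stand-in model and Pearson correlation against gold scores as the metric; the abstract does not name the metric used. The function names, the 0 to 5 gold scale, and the toy sentence pairs are all hypothetical and are not drawn from the dataset.

```python
import math
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words count vectors of two sentences."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty sentence has no direction to compare
    return dot / (norm_a * norm_b)


def pearson(xs, ys) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if std_x == 0.0 or std_y == 0.0:
        return 0.0  # correlation is undefined for constant input
    return cov / (std_x * std_y)


# Hypothetical toy pairs (sentence 1, sentence 2, gold score on an assumed 0-5 scale).
# These are invented for illustration only.
TOY_PAIRS = [
    ("patient denies chest pain", "patient reports no chest pain", 4.0),
    ("take one tablet by mouth daily", "take one tablet by mouth twice daily", 3.0),
    ("follow up in two weeks", "blood pressure was normal", 0.0),
]


def evaluate(pairs) -> float:
    """Correlate model similarity scores with gold scores over all pairs."""
    predicted = [cosine_similarity(s1, s2) for s1, s2, _ in pairs]
    gold = [score for _, _, score in pairs]
    return pearson(predicted, gold)


if __name__ == "__main__":
    print(f"Pearson correlation on toy pairs: {evaluate(TOY_PAIRS):.3f}")
```

Swapping `cosine_similarity` for any other scoring function with the same two-sentence signature would let different models be compared under the same metric, which is the general shape of the comparison the abstract describes.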

Index Terms

1. Applied computing
  1. Life and medical sciences
    1. Health informatics
2. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing

Published in
BCB '19: Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics
September 2019
716 pages
ISBN: 9781450366663
DOI: 10.1145/3307339
General Chairs: Xinghua (Mindy) Shi (Temple University, USA), Michael Buck (University of Buffalo, USA)
Program Chairs: Jian Ma (Carnegie Mellon University, USA), Pierangelo Veltri (University Magna Graecia of Catanzaro, Italy)
Copyright © 2019 Owner/Author
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
biomedicine
ehr
natural language processing
textual similarity
Qualifiers
- poster

Acceptance Rates
BCB '19 Paper Acceptance Rate: 42 of 157 submissions, 27%
Overall Acceptance Rate: 254 of 885 submissions, 29%