poster

A Text-Mining System for Concept Annotation in Biomedical Full Text Articles

Authors: Chih-Hsuan Wei, Alexis Allot, Robert Leaman, Zhiyong Lu

BCB '19: Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics

Page 540

https://doi.org/10.1145/3307339.3343246

Published: 04 September 2019

Abstract

PubTator Central (PTC; https://www.ncbi.nlm.nih.gov/research/pubtator/) [1] is a web service for exploring and retrieving bioconcept annotations in full-text biomedical articles. PTC provides automated annotations from state-of-the-art text mining systems for genes/proteins, genetic variants, diseases, chemicals, species and cell lines, all available for immediate download. PTC annotates PubMed (30 million abstracts) as well as the PMC Open Access Subset and the Author Manuscript Collection (3 million full-text articles). These full-text articles increase the total number of annotations nearly four-fold. The new PTC web interface features semantic search and faceted shortcuts to improve navigation in full text. A significantly redesigned back end that heavily exploits nonrelational data delivers higher throughput and speed despite the large increase in data volume. Updated entity identification methods and a new disambiguation module based on deep learning techniques provide increased accuracy. The PTC web interface allows users to navigate through the bioentities present in full-text articles, build full-text document collections and visualize the concept annotations in each document. Annotations are downloadable in multiple formats (XML, JSON and tab-delimited) via the online interface, a RESTful web service and bulk FTP. PTC is synchronized with PubMed and PubMed Central, with new articles added daily. The original PubTator [2] service has served annotated abstracts for ~300 million requests, enabling third-party research in use cases such as biocuration support, gene prioritization, genetic disease analysis, and literature-based knowledge discovery. We demonstrate that the full-text results in PTC significantly increase biomedical concept coverage, and we anticipate that this expansion will both enhance existing downstream applications and enable new use cases.
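As an illustration of working with the tab-delimited download mentioned above, the following is a minimal sketch of a parser. It assumes the commonly used PubTator layout, in which title and abstract lines take the form `<id>|t|<text>` and `<id>|a|<text>`, and each annotation line carries tab-separated fields (document ID, start offset, end offset, mention, concept type, concept identifier). The names `Annotation`, `parse_pubtator` and `SAMPLE` are hypothetical, the sample record uses a placeholder document ID, and the exact layout of the files served by PTC should be checked against the service documentation.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict


@dataclass
class Annotation:
    doc_id: str
    start: int          # character offset where the mention begins
    end: int            # character offset just past the mention
    mention: str
    concept_type: str   # e.g. Gene, Disease, Chemical
    identifier: str     # normalized concept ID, empty if absent


def parse_pubtator(text: str) -> Dict[str, dict]:
    """Parse PubTator-style tab-delimited text into a dict keyed by document ID."""
    docs: Dict[str, dict] = {}

    def get_doc(doc_id: str) -> dict:
        return docs.setdefault(
            doc_id, {"title": "", "abstract": "", "annotations": []}
        )

    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines separate documents
        # Title/abstract lines: "<id>|t|<text>" or "<id>|a|<text>".
        parts = line.split("|", 2)
        if len(parts) == 3 and parts[1] in ("t", "a") and "\t" not in parts[0]:
            key = "title" if parts[1] == "t" else "abstract"
            get_doc(parts[0])[key] = parts[2]
            continue
        # Annotation lines: id, start, end, mention, type, [identifier].
        fields = line.split("\t")
        if len(fields) >= 5:
            identifier = fields[5] if len(fields) > 5 else ""
            get_doc(fields[0])["annotations"].append(
                Annotation(fields[0], int(fields[1]), int(fields[2]),
                           fields[3], fields[4], identifier)
            )
    return docs


# Made-up sample record for illustration only; "12345" is a placeholder ID.
SAMPLE = (
    "12345|t|BRCA1 mutations in breast cancer\n"
    "12345|a|We studied BRCA1 in human patients.\n"
    "12345\t0\t5\tBRCA1\tGene\t672\n"
    "12345\t19\t32\tbreast cancer\tDisease\tMESH:D001943\n"
)

if __name__ == "__main__":
    docs = parse_pubtator(SAMPLE)
    for doc_id, doc in docs.items():
        counts = Counter(a.concept_type for a in doc["annotations"])
        print(doc_id, doc["title"], dict(counts))
```

Keeping the parser independent of any network call means the same function can be applied to text saved from the online interface, the web service or the bulk files, provided they follow the layout assumed here.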

References

[1] Wei, C.H., Allot, A., Leaman, R., and Lu, Z. (2019) PubTator Central: Automated Concept Annotation for Biomedical Full Text Articles. Nucleic Acids Res. (Web Server issue)

[2] Wei, C.H., Kao, H.Y., and Lu, Z. (2013) PubTator: a Web-based text mining tool for assisting Biocuration. Nucleic Acids Res., 41, W518-W522

Cited By

Wei C, Lee K, Leaman R, Lu Z (2019) Biomedical Mention Disambiguation using a Deep Learning Approach. Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics (eds. Shi X, Buck M, Ma J, Veltri P), 307-313. Online publication date: 7-Sep-2019
https://doi.org/10.1145/3307339.3342162

Index Terms

A Text-Mining System for Concept Annotation in Biomedical Full Text Articles
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing

Published In

BCB '19: Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics

September 2019

716 pages

ISBN:9781450366663

DOI:10.1145/3307339

General Chairs: Xinghua (Mindy) Shi (Temple University, USA), Michael Buck (University of Buffalo, USA)
Program Chairs: Jian Ma (Carnegie Mellon University, USA), Pierangelo Veltri (University Magna Graecia of Catanzaro, Italy)

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Poster

Conference

BCB '19

Sponsor:

SIGBio

BCB '19: 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics

September 7 - 10, 2019

Niagara Falls, NY, USA

Acceptance Rates

BCB '19 Paper Acceptance Rate 42 of 157 submissions, 27%;

Overall Acceptance Rate 254 of 885 submissions, 29%

Bibliometrics & Citations

Total Citations: 1
Total Downloads: 106
Downloads (Last 12 months): 3
Downloads (Last 6 weeks): 0

Reflects downloads up to 18 Jan 2025
