OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Unleashing the Power of Knowledge Extraction from Scientific Literature in Catalysis

Journal Article · Journal of Chemical Information and Modeling

Valuable knowledge about catalysis is often buried in a large body of scientific literature, and there is an urgent need to extract it to facilitate scientific discovery. This work takes the first step toward that goal in the field of catalysis. Specifically, we construct the first information extraction benchmark data set that covers the field of catalysis and develop a general extraction framework that extracts catalysis-related entities from scientific literature with 90% accuracy. We further demonstrate the feasibility of leveraging the extracted knowledge to help users better access relevant information in catalysis through an entity-aware search engine and a correlation analysis system.

Research Organization:
Univ. of Delaware, Newark, DE (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Basic Energy Sciences (BES)
Grant/Contract Number:
SC0021166
OSTI ID:
1977907
Journal Information:
Journal of Chemical Information and Modeling, Vol. 62, Issue 14; ISSN 1549-9596
Publisher:
American Chemical Society
Country of Publication:
United States
Language:
English

