Authors:
Vayianos Pertsas and Panos Constantopoulos
Affiliation:
Department of Informatics, Athens University of Economics and Business, Athens, Greece
Keyword(s):
Ontology-Driven Information Extraction, Information Extraction from Text, Transformer-Based Methods, Relation Extraction, Knowledge Graph Creation.
Abstract:
We present transformer-based methods for extracting information about research processes from scholarly publications. We developed a two-stage pipeline comprising a transformer-based text classifier, which predicts whether a sentence contains the entities sought, in tandem with a transformer-based entity recogniser, which finds the boundaries of the entities inside the sentences that contain them. This is applied to extracting two different types of entities: i) research activities, representing the acts performed by researchers, which are entities of complex lexico-syntactic structure, and ii) research methods, representing the procedures used in performing research activities, which are named entities of variable length. We also developed a system that assigns semantic context to the extracted entities by: i) linking them according to the relation employs(Activity,Method) using a transformer-based binary classifier for relation extraction; ii) associating them with information extracted from publication metadata; and iii) encoding the resulting contextualized information into an RDF Knowledge Graph. The entire workflow is ontology-driven, based on Scholarly Ontology, which is specifically designed for documenting scholarly work. Our methods are trained and evaluated on a dataset comprising 12,626 sentences, manually annotated for the task at hand, and are shown to surpass simpler transformer-based methods and baselines.
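The control flow of the two-stage pipeline described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (two_stage_extract, toy_classifier, toy_recogniser) are hypothetical, and the two toy functions are simple string-matching stand-ins for the transformer-based classifier and entity recogniser, used only so that the example runs without any model.

```python
from typing import Callable, List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def two_stage_extract(
    sentences: List[str],
    contains_entity: Callable[[str], bool],
    find_spans: Callable[[str], List[Span]],
) -> List[Tuple[str, str]]:
    """Stage 1 filters sentences; stage 2 locates entity boundaries."""
    results = []
    for sentence in sentences:
        # Stage 1: skip sentences the classifier says hold no entity.
        if not contains_entity(sentence):
            continue
        # Stage 2: find entity boundaries only in the retained sentences.
        for start, end in find_spans(sentence):
            results.append((sentence, sentence[start:end]))
    return results


# Hypothetical stand-in for the sentence classifier (assumption:
# a real system would call a fine-tuned transformer here).
def toy_classifier(sentence: str) -> bool:
    return "we used" in sentence.lower()


# Hypothetical stand-in for the entity recogniser: returns the text
# after "we used " up to the trailing full stop as a single span.
def toy_recogniser(sentence: str) -> List[Span]:
    marker = "we used "
    start = sentence.lower().find(marker)
    if start == -1:
        return []
    start += len(marker)
    end = len(sentence.rstrip("."))
    return [(start, end)] if end > start else []


if __name__ == "__main__":
    demo = ["We used logistic regression.", "The weather was fine."]
    for sentence, entity in two_stage_extract(demo, toy_classifier, toy_recogniser):
        print(f"{entity!r} found in {sentence!r}")
```

Passing the two stages in as callables keeps the filtering logic separate from the models, so either stage could be swapped for a real classifier or recogniser without changing the loop.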