"Linked data as background knowledge for information extraction on the web" by Ziqi Zhang, Anna Lisa Gentile and Isabelle Augenstein with Martin Vesely as coordinator

Published: 01 July 2014


Information Extraction (IE) is the technique of transforming textual data into a structured representation that machines can process. It is crucial to enabling the Semantic Web, which has attracted increasing interest in recent years. This article reports recent progress in the LODIE project (Linked Open Data for Information Extraction), which aims to advance Web IE to a new frontier by exploiting widely available, semantically annotated Linked Open Data as background knowledge. We cover wrapper induction, IE from semi-structured content such as tables and lists, and IE from free text. We describe the new research challenges and the methods proposed to address them, together with summaries of recent evaluations that show encouraging results.


