Authors: Dennis Medved; Fangyuan Jiang; Peter Exner; Magnus Oskarsson; Pierre Nugues and Kalle Åström
Affiliation: Lund University, Sweden
Keyword(s): Semantic Parsing, Relation Extraction from Images, Machine Learning.
Related Ontology Subjects/Areas/Topics: Applications; Artificial Intelligence; Classification; Computer Vision, Visualization and Computer Graphics; Image Understanding; Knowledge Engineering and Ontology Development; Knowledge-Based Systems; Natural Language Processing; Object Recognition; Pattern Recognition; Software Engineering; Symbolic Systems; Theory and Methods
Abstract:
In this paper, we describe a novel system that identifies relations between the objects extracted from an image.
We started from the idea that, in addition to the geometric and visual properties of the image objects, we could
exploit lexical and semantic information from the text accompanying the image. As an experimental setup, we
gathered a corpus of images from Wikipedia together with their associated articles. We extracted two types of
objects, human beings and horses, and we considered three relations that could hold between them: Ride,
Lead, or None. We used geometric features as a baseline to identify the relations between the entities, and we
describe the improvements brought by the addition of bag-of-words features and predicate–argument structures
derived from the text. The best semantic model resulted in a relative error reduction of more than 18%
over the baseline.
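
As an illustration of the kind of pipeline the abstract describes, the sketch below combines a few geometric features of an object pair with bag-of-words features from the accompanying text and trains a classifier to predict Ride, Lead, or None. It is not the authors' implementation: the feature values, example sentences, and the use of scikit-learn's CountVectorizer and LogisticRegression are assumptions made for illustration only.

    # Minimal sketch, not the authors' implementation: classify a (person, horse)
    # pair as Ride, Lead, or None from geometric features plus bag-of-words
    # features taken from the text accompanying the image. All feature values,
    # sentences, and model choices below are illustrative assumptions.
    import numpy as np
    from scipy.sparse import csr_matrix, hstack
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression

    # Hypothetical geometric features for three object pairs,
    # e.g. normalized distance, vertical overlap, bounding-box area ratio.
    geometric = np.array([
        [0.10, 0.80, 1.2],
        [0.35, 0.20, 0.9],
        [0.90, 0.05, 0.4],
    ])

    # Sentences from the articles associated with the images (invented examples).
    texts = [
        "The jockey rides the horse across the finish line.",
        "A groom leads the horse back to the stable.",
        "Horses graze in a field outside the village.",
    ]
    labels = ["Ride", "Lead", "None"]

    # Bag-of-words features extracted from the accompanying text.
    vectorizer = CountVectorizer()
    bow = vectorizer.fit_transform(texts)

    # Concatenate lexical and geometric features into one design matrix.
    X = hstack([bow, csr_matrix(geometric)])

    # Train a simple multiclass classifier on the combined features.
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X, labels)
    print(clf.predict(X))

In such a setup, the geometric features alone would play the role of the baseline model, and the gain from adding the text-derived features could be measured as a relative error reduction over that baseline.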