Towards Bridged Vision and Language: Learning Cross-Modal Knowledge Representation for Relation Extraction | IEEE Journals & Magazine | IEEE Xplore