DOI: 10.1145/3148011.3148026

Cross-modal Knowledge Transfer: Improving the Word Embedding of Apple by Looking at Oranges

Published: 04 December 2017

Abstract

Capturing knowledge via learned latent vector representations of words, images and knowledge graph (KG) entities has shown state-of-the-art performance in computer vision, computational linguistics and KG tasks. Recent results demonstrate that learning such representations across modalities can be beneficial, since each modality captures complementary information. However, those approaches are limited to concepts with cross-modal alignments in the training data, which exist for only a few concepts. For visual objects in particular, far fewer embeddings are available than for words or KG entities. We investigate whether a word embedding (e.g., for "apple") can still capture information from other modalities even if there is no matching concept within the other modalities (i.e., no images or KG entities of apples, but of oranges, as pictured in the title analogy). The empirical results of our knowledge transfer approach demonstrate that word embeddings do benefit from extrapolating information across modalities, even for concepts that are not represented in the other modalities. Interestingly, this applies most to concrete concepts (e.g., dragonfly), while abstract concepts (e.g., animal) benefit most if aligned concepts are available in the other modalities.
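
A minimal sketch of the general idea, not the authors' exact method: one simple way to extrapolate across modalities is to fit a linear map from word-embedding space to visual-embedding space on the concepts that are aligned in both modalities (e.g., "orange"), then apply that map to a word without any visual counterpart (e.g., "apple") and fuse the extrapolated visual vector with the original word vector. The Python/NumPy example below uses random toy vectors in place of real word2vec/GloVe and CNN embeddings; all names and dimensions are illustrative assumptions.

import numpy as np

# Toy stand-ins for pre-trained embeddings; in practice these would come from
# a word-embedding model (word2vec/GloVe) and an image model (CNN features).
rng = np.random.default_rng(0)
d_word, d_img = 300, 128
word_emb = {w: rng.normal(size=d_word) for w in ["orange", "banana", "apple"]}
img_emb = {w: rng.normal(size=d_img) for w in ["orange", "banana"]}   # no image for "apple"

# Fit a linear map W by least squares on the concepts aligned in both modalities.
aligned = [w for w in word_emb if w in img_emb]
X = np.stack([word_emb[w] for w in aligned])          # (n, d_word)
Y = np.stack([img_emb[w] for w in aligned])           # (n, d_img)
W, *_ = np.linalg.lstsq(X, Y, rcond=None)             # maps word space -> visual space

def fused_embedding(word):
    # Use the real visual vector if one exists, otherwise extrapolate it via W,
    # then concatenate it with the word vector to form a cross-modal embedding.
    visual = img_emb[word] if word in img_emb else word_emb[word] @ W
    return np.concatenate([word_emb[word], visual])

print(fused_embedding("apple").shape)                 # (428,) = 300 + 128

Here "apple" receives a pseudo-visual component purely by extrapolation from the aligned concepts, which is the kind of transfer the abstract describes for concepts missing from the other modalities.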


Cited By

  • (2022) Network embeddings from distributional thesauri for improving static word representations. Expert Systems with Applications 187 (2022), 115868. DOI: 10.1016/j.eswa.2021.115868


    Published In

    K-CAP '17: Proceedings of the 9th Knowledge Capture Conference
    December 2017
    271 pages
    ISBN:9781450355537
    DOI:10.1145/3148011

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. Knowledge Transfer
    2. Multi-Modality
    3. Word Similarity

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    K-CAP 2017: Knowledge Capture Conference
    December 4 - 6, 2017
    Austin, TX, USA

    Acceptance Rates

    Overall Acceptance Rate 55 of 198 submissions, 28%
