DOI: 10.1145/3148011.3148026

Cross-modal Knowledge Transfer: Improving the Word Embedding of Apple by Looking at Oranges

Published: 04 December 2017

Abstract

Capturing knowledge via learned latent vector representations of words, images and knowledge graph (KG) entities has shown state-of-the-art performance in computer vision, computational linguistics and KG tasks. Recent results demonstrate that learning such representations across modalities can be beneficial, since each modality captures complementary information. However, those approaches are limited to concepts with cross-modal alignments in the training data, which exist for only a few concepts. For visual objects in particular, far fewer embeddings are available than for words or KG entities. We investigate whether a word embedding (e.g., for "apple") can still capture information from other modalities even if there is no matching concept within the other modalities (i.e., no images or KG entities of apples, but of oranges, as pictured in the title analogy). The empirical results of our knowledge transfer approach demonstrate that word embeddings do benefit from extrapolating information across modalities, even for concepts that are not represented in the other modalities. Interestingly, this applies most to concrete concepts (e.g., dragonfly), while abstract concepts (e.g., animal) benefit most if aligned concepts are available in the other modalities.
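
A minimal sketch of the general idea, not the authors' exact method: one simple way to extrapolate across modalities is to fit a linear map from word-embedding space to visual-embedding space on the concepts that are aligned in both modalities (e.g., "orange"), then apply that map to a word without any visual counterpart (e.g., "apple") and fuse the extrapolated visual vector with the original word vector. The Python/NumPy example below uses random toy vectors in place of real word2vec/GloVe and CNN embeddings; all names and dimensions are illustrative assumptions.

import numpy as np

# Toy stand-ins for pre-trained embeddings; in practice these would come from
# a word-embedding model (word2vec/GloVe) and an image model (CNN features).
rng = np.random.default_rng(0)
d_word, d_img = 300, 128
word_emb = {w: rng.normal(size=d_word) for w in ["orange", "banana", "apple"]}
img_emb = {w: rng.normal(size=d_img) for w in ["orange", "banana"]}   # no image for "apple"

# Fit a linear map W by least squares on the concepts aligned in both modalities.
aligned = [w for w in word_emb if w in img_emb]
X = np.stack([word_emb[w] for w in aligned])          # (n, d_word)
Y = np.stack([img_emb[w] for w in aligned])           # (n, d_img)
W, *_ = np.linalg.lstsq(X, Y, rcond=None)             # maps word space -> visual space

def fused_embedding(word):
    # Use the real visual vector if one exists, otherwise extrapolate it via W,
    # then concatenate it with the word vector to form a cross-modal embedding.
    visual = img_emb[word] if word in img_emb else word_emb[word] @ W
    return np.concatenate([word_emb[word], visual])

print(fused_embedding("apple").shape)                 # (428,) = 300 + 128

Here "apple" receives a pseudo-visual component purely by extrapolation from the aligned concepts, which is the kind of transfer the abstract describes for concepts missing from the other modalities.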


Cited By

  • (2022) Network embeddings from distributional thesauri for improving static word representations. Expert Systems with Applications 187 (2022), 115868. DOI: 10.1016/j.eswa.2021.115868


    Published In

    K-CAP '17: Proceedings of the 9th Knowledge Capture Conference
    December 2017
    271 pages
    ISBN:9781450355537
    DOI:10.1145/3148011

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. Knowledge Transfer
    2. Multi-Modality
    3. Word Similarity

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    K-CAP 2017: Knowledge Capture Conference
    December 4 - 6, 2017
    Austin, TX, USA

    Acceptance Rates

    Overall Acceptance Rate 55 of 198 submissions, 28%
