Asymmetric graph based zero shot learning

Wang, Yinduo; Zhang, Haofeng; Zhang, Zheng; Long, Yang

doi:10.1007/s11042-019-7689-y

Published: 14 May 2019

Volume 79, pages 33689–33710, (2020)
Multimedia Tools and Applications

Yinduo Wang¹,
Haofeng Zhang ORCID: orcid.org/0000-0002-4039-7618¹,
Zheng Zhang² &
…
Yang Long³

451 Accesses
8 Citations

Abstract

Zero-shot learning (ZSL) has attracted a great deal of attention because it can recognize unseen categories after training only on samples from seen categories. Existing work has focused on learning a projection between the semantic space and the feature space, which has brought substantial progress in ZSL. However, simply establishing a projection often suffers from the visual-semantic ambiguity problem and the hubness problem. Specifically, visual patterns and semantic concepts often cannot be properly matched to each other, which leads to inaccurate recognition results. To this end, we propose a novel ZSL model, Asymmetric Graph-based Zero Shot Learning (AGZSL), which simultaneously preserves the class-level semantic manifold and the instance-level visual manifold in a latent space. In addition, to make the model more discriminative, we constrain the latent space to be orthogonal, meaning that projected visual features and semantic embeddings in the latent space are orthogonal when they belong to different categories. We test our approach on four benchmark datasets under both the standard zero-shot setting and the more realistic generalized zero-shot learning (GZSL) setting, and the results show that AGZSL significantly improves performance compared with state-of-the-art methods.
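The orthogonality idea described in the abstract can be illustrated with a small NumPy sketch. This is not the AGZSL objective itself, which the abstract does not spell out; it only assumes that visual features and class embeddings have already been projected into a shared latent space. The function names `orthogonality_penalty` and `predict`, the squared-inner-product penalty, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np

def orthogonality_penalty(latent_visual, latent_semantic, labels):
    """Sum of squared inner products between each projected visual feature
    and the latent embeddings of the classes it does NOT belong to.
    A value of zero means every sample is orthogonal to all other classes.
    (Illustrative assumption, not the loss used in the paper.)"""
    # inner[i, c] = <visual feature i, latent embedding of class c>
    inner = latent_visual @ latent_semantic.T
    # mask[i, c] is True where class c is not the label of sample i
    mask = np.ones_like(inner, dtype=bool)
    mask[np.arange(len(labels)), labels] = False
    return float(np.sum(inner[mask] ** 2))

def predict(latent_visual, latent_semantic):
    """Assign each sample to the class whose latent embedding has the
    largest inner product with the sample's latent feature."""
    return np.argmax(latent_visual @ latent_semantic.T, axis=1)

# Toy data: three classes with mutually orthogonal latent embeddings.
semantic = np.eye(3)
visual = np.array([[1.0, 0.0, 0.0],
                   [0.0, 0.9, 0.1],   # leaks slightly onto class 2
                   [0.0, 0.0, 1.0]])
labels = np.array([0, 1, 2])

penalty = orthogonality_penalty(visual, semantic, labels)
predicted = predict(visual, semantic)
```

In this toy case only the second sample has a nonzero component along a class other than its own, so it is the only contributor to the penalty, and inner-product matching still assigns every sample to its labeled class.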

Acknowledgments

This work was supported by the National Natural Science Foundation of China (No. 61872187).

Author information

Authors and Affiliations

School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China
Yinduo Wang & Haofeng Zhang
School of Information Technology and Electrical Engineering, The University of Queensland, Brisbane, QLD, 4072, Australia
Zheng Zhang
Open Laboratory, School of Computing, Newcastle University, Newcastle upon Tyne, UK
Yang Long

Corresponding author

Correspondence to Haofeng Zhang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wang, Y., Zhang, H., Zhang, Z. et al. Asymmetric graph based zero shot learning. Multimed Tools Appl 79, 33689–33710 (2020). https://doi.org/10.1007/s11042-019-7689-y

Received: 29 December 2018
Revised: 27 February 2019
Accepted: 24 April 2019
Published: 14 May 2019
Issue Date: December 2020
DOI: https://doi.org/10.1007/s11042-019-7689-y

Keywords
