Zero-shot leaning and hashing with binary visual similes

Zhang, Haofeng; Long, Yang; Shao, Ling

doi:10.1007/s11042-018-6842-3

Published: 16 November 2018

Volume 78, pages 24147–24165, (2019)

Multimedia Tools and Applications

Abstract

Conventional zero-shot learning methods usually learn mapping functions that project image features into a semantic embedding space, where classes are recognized by nearest-neighbor search against predefined attributes. These attributes, which cover both seen and unseen classes, are often annotated by experts with high-dimensional real values, which requires considerable human labor. In this paper, we propose a simple but effective method to reduce the annotation work. In our strategy, only the unseen classes need to be annotated, each with several binary codes, which reduces the annotation work to about one percent of the original. In addition, we design a Visual Similes Annotation System (ViSAS) to annotate the unseen classes, and we build both linear and deep mapping models and test them on four popular datasets. The experimental results show that our method can outperform state-of-the-art methods in most circumstances.
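
The recognition step described in the abstract can be illustrated with a minimal sketch. It assumes a mapping model has already produced real-valued scores for an image; those scores are thresholded into a binary code, and the image is assigned to the unseen class whose annotated binary code is closest in Hamming distance. The class names, codes, scores, and function names below are hypothetical and chosen only for illustration. This is not the authors' ViSAS pipeline or either of their mapping models.

```python
from typing import Dict, List, Sequence


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions at which two equal-length binary codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def binarize(scores: Sequence[float], threshold: float = 0.0) -> List[int]:
    """Turn real-valued mapping outputs into a binary code (assumed thresholding rule)."""
    return [1 if s > threshold else 0 for s in scores]


def nearest_class(code: Sequence[int], class_codes: Dict[str, Sequence[int]]) -> str:
    """Zero-shot classification: pick the class whose code is nearest in Hamming distance."""
    return min(class_codes, key=lambda name: hamming(code, class_codes[name]))


def hamming_ranking(query: Sequence[int], database: Sequence[Sequence[int]]) -> List[int]:
    """Hashing-style retrieval: database indices sorted by Hamming distance to the query."""
    return sorted(range(len(database)), key=lambda i: hamming(query, database[i]))


# Hypothetical binary annotations for three unseen classes (6 bits each).
unseen_class_codes = {
    "zebra": [1, 0, 1, 1, 0, 0],
    "dolphin": [0, 1, 0, 0, 1, 1],
    "bat": [0, 0, 1, 0, 1, 0],
}

# Hypothetical output of a mapping model for one test image.
scores = [0.8, -0.3, 0.4, -0.1, -0.9, -0.2]
query_code = binarize(scores)

predicted = nearest_class(query_code, unseen_class_codes)
ranking = hamming_ranking(query_code, list(unseen_class_codes.values()))
print(predicted, ranking)
```

Because both classification and retrieval reduce to Hamming distance over short codes, the same binary annotations can serve the zero-shot learning and the zero-shot hashing settings named in the title.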

Notes

https://uk.mathworks.com/help/matlab/ref/sylvester.html

Author information

Authors and Affiliations

School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China
Haofeng Zhang
Open Laboratory, School of Computing, Newcastle University, Newcastle upon Tyne, UK
Yang Long
Inception Institute of Artificial Intelligence (IIAI), Abu Dhabi, United Arab Emirates
Ling Shao

Corresponding author

Correspondence to Haofeng Zhang.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was supported by the National Natural Science Foundation of China (No. 61872187) and the Major Special Project of Core Electronic Devices, High-end Generic Chips and Basic Software (No. 2015ZX01041101).

About this article

Cite this article

Zhang, H., Long, Y. & Shao, L. Zero-shot leaning and hashing with binary visual similes. Multimed Tools Appl 78, 24147–24165 (2019). https://doi.org/10.1007/s11042-018-6842-3

Received: 31 May 2018
Revised: 19 October 2018
Accepted: 05 November 2018
Published: 16 November 2018
Issue Date: 15 September 2019
DOI: https://doi.org/10.1007/s11042-018-6842-3
