Zero-shot Recognition with Image Attributes Generation using Hierarchical Coupled Dictionary Learning

Authors:

Baocai Yin

MMAsia '21: Proceedings of the 3rd ACM International Conference on Multimedia in Asia

Article No.: 32, Pages 1 - 7

https://doi.org/10.1145/3469877.3490613

Published: 10 January 2022

Abstract

Zero-shot learning (ZSL) aims to recognize images from unseen (novel) classes using only training images from seen classes. The attributes of each class are exploited as auxiliary semantic information. Most recent ZSL approaches focus on learning visual-semantic embeddings to transfer knowledge from the seen classes to the unseen classes. However, few works study whether class-level auxiliary semantic information is extensive enough for the ZSL task. To tackle this problem, we propose a hierarchical coupled dictionary learning (HCDL) approach that hierarchically aligns the visual-semantic structures at both the class level and the image level. First, a class-level coupled dictionary is trained to establish a basic connection between the visual space and the semantic space. Then, image attributes are generated based on this basic connection. Finally, fine-grained information is embedded by training an image-level coupled dictionary. Zero-shot recognition is performed in multiple spaces by searching for the nearest-neighbor class of the unseen image. Experiments on two widely used benchmark datasets show the effectiveness of the proposed approach.
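
The class-level step described in the abstract can be illustrated with a minimal sketch. This is not the authors' HCDL implementation: it assumes a ridge (L2) penalty on the shared codes in place of whatever regularizer the paper uses, covers only the class level, and performs recognition in the visual space alone. All function names, parameters (`k`, `lam`, `gamma`) and array layouts below are hypothetical choices made for illustration.

```python
import numpy as np


def fit_coupled_dictionary(P, A, k=8, lam=1.0, gamma=0.1, iters=20, seed=0):
    """Learn a visual dictionary D_v and a semantic dictionary D_s that share codes Z.

    P: (d_v, C) visual prototypes of the seen classes (assumed: mean image features).
    A: (d_s, C) attribute vectors of the same seen classes.
    Objective (a simplification): ||P - D_v Z||^2 + lam * ||A - D_s Z||^2 + gamma * ||Z||^2,
    with the same ridge weight reused to keep the dictionary updates well conditioned.
    """
    rng = np.random.default_rng(seed)
    D_v = rng.standard_normal((P.shape[0], k))
    D_s = rng.standard_normal((A.shape[0], k))
    eye = np.eye(k)
    for _ in range(iters):
        # Update the shared codes with both dictionaries fixed (ridge regression).
        gram = D_v.T @ D_v + lam * (D_s.T @ D_s) + gamma * eye
        Z = np.linalg.solve(gram, D_v.T @ P + lam * (D_s.T @ A))
        # Update each dictionary with the codes fixed (regularized least squares).
        zzt = Z @ Z.T + gamma * eye
        D_v = np.linalg.solve(zzt, Z @ P.T).T
        D_s = np.linalg.solve(zzt, Z @ A.T).T
    return D_v, D_s


def synthesize_prototypes(D_v, D_s, A_unseen, gamma=0.1):
    """Code unseen-class attributes on D_s, then decode the codes with D_v."""
    k = D_s.shape[1]
    Z = np.linalg.solve(D_s.T @ D_s + gamma * np.eye(k), D_s.T @ A_unseen)
    return D_v @ Z


def nearest_class(x, prototypes):
    """Return the column index of the prototype most cosine-similar to feature x."""
    protos = prototypes / (np.linalg.norm(prototypes, axis=0, keepdims=True) + 1e-12)
    x = x / (np.linalg.norm(x) + 1e-12)
    return int(np.argmax(protos.T @ x))
```

In this sketch, a test image feature would be classified by calling `nearest_class` on prototypes returned by `synthesize_prototypes` for the unseen classes. The image-level dictionary and the multi-space search mentioned in the abstract are deliberately left out.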

Cited By

K. Uehara and T. Harada. 2024. Learning by Asking Questions for Knowledge-Based Novel Object Recognition. International Journal of Computer Vision 132, 6 (2024), 2290–2309. Online publication date: 12-Jan-2024.
https://dl.acm.org/doi/10.1007/s11263-023-01976-7

S. Li, L. Wang, S. Wang, D. Kong, and B. Yin. 2023. Hierarchical Coupled Discriminative Dictionary Learning for Zero-Shot Learning. IEEE Transactions on Circuits and Systems for Video Technology 33, 9 (2023), 4973–4984. Online publication date: 1-Sep-2023.
https://dl.acm.org/doi/10.1109/TCSVT.2023.3246475

Index Terms

Zero-shot Recognition with Image Attributes Generation using Hierarchical Coupled Dictionary Learning
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
  2. Machine learning
2. Information systems

Index terms have been assigned to the content through auto-classification.

Published In

MMAsia '21: Proceedings of the 3rd ACM International Conference on Multimedia in Asia

December 2021

508 pages

ISBN:9781450386074

DOI:10.1145/3469877

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

National Natural Science Foundation of China
Beijing Outstanding Young Scientists Projects
Beijing Natural Science Foundation

Conference

MMAsia '21

Sponsor:

SIGMM

MMAsia '21: ACM Multimedia Asia

December 1 - 3, 2021

Gold Coast, Australia

Acceptance Rates

Overall Acceptance Rate 59 of 204 submissions, 29%
