Zero-Shot Learning Based on Salient Region and Enhanced Semantics

Pan, Zongrong; Zhu, Anna

doi:10.1007/978-3-030-60639-8_32

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12306)

Included in the following conference series:

Chinese Conference on Pattern Recognition and Computer Vision (PRCV)

Abstract

Zero-shot learning (ZSL) refers to recognizing new classes for which no training samples are available. Traditionally, a projection function learned from visual features to semantic features is used for object recognition. However, few works focus on accurately representing the features of the objects to be recognized. Human-designed semantics are neither discriminative nor sufficient to distinguish different and new classes. In this paper, we propose using image reconstruction to extract enhanced semantics (ES) from the salient region of an image. The salient region is encoded into features corresponding to the predefined attributes and the ES features, and then decoded to reconstruct the original salient-region image. Lifted structured feature embedding (LSFE) is applied to make the extended features more discriminative. Softmax is used for classification, which makes the ES features more accurate. Experiments on two benchmark datasets, AwA2 and CUB, demonstrate the effectiveness of the proposed approach.
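
The abstract names lifted structured feature embedding as the term that makes the encoded features more discriminative, but the chapter's exact formulation is not visible in this preview. As a rough illustration only, the sketch below computes the lifted structured loss in the form commonly given in the deep metric learning literature, using NumPy. The function name, the `margin` default, and the toy inputs are assumptions made for this example and are not taken from the chapter.

```python
import numpy as np


def lifted_structured_loss(features, labels, margin=1.0):
    """Lifted structured loss over one batch (illustrative sketch).

    For every positive pair (i, j) the term is
        J_ij = log(sum_k exp(margin - D_ik) + sum_l exp(margin - D_jl)) + D_ij
    where k and l range over the negatives of i and j, and the batch
    loss is the mean of max(0, J_ij)^2 divided by 2.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)

    # Pairwise Euclidean distances D[i, j] between all embeddings.
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    same = labels[:, None] == labels[None, :]
    # For each sample, sum exp(margin - D) over its negatives only.
    neg_exp = np.where(~same, np.exp(margin - dist), 0.0).sum(axis=1)

    total, count = 0.0, 0
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            if not same[i, j]:
                continue
            s = neg_exp[i] + neg_exp[j]
            # With no negatives in the batch the hinge is inactive.
            j_ij = np.log(s) + dist[i, j] if s > 0 else -np.inf
            total += max(0.0, j_ij) ** 2
            count += 1
    return total / (2 * count) if count else 0.0


# Toy batch (hypothetical data): two classes, two samples each.
separated = lifted_structured_loss(
    [[0.0, 0.0], [0.0, 0.1], [10.0, 0.0], [10.0, 0.1]], [0, 0, 1, 1]
)
mixed = lifted_structured_loss(
    [[0.0, 0.0], [10.0, 0.0], [0.0, 0.1], [10.0, 0.1]], [0, 0, 1, 1]
)
```

In the first toy batch the classes are far apart, so every hinge is inactive; in the second, each sample sits next to a sample of the other class, so the loss becomes positive. In a trainable model this term would typically be computed with an autodiff framework and combined with the reconstruction and softmax objectives the abstract mentions; how those terms are weighted is not stated here.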

Author information

Authors and Affiliations

Wuhan University of Technology, Wuhan, China
Zongrong Pan & Anna Zhu

Corresponding author

Correspondence to Anna Zhu.

Editor information

Editors and Affiliations

Peking University, Beijing, China
Yuxin Peng
Nanjing University of Information Science and Technology, Nanjing, China
Qingshan Liu
Dalian University of Technology, Dalian, China
Huchuan Lu
Chinese Academy of Sciences, Beijing, China
Zhenan Sun
Chinese Academy of Sciences, Beijing, China
Chenglin Liu
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Xilin Chen
Peking University, Beijing, China
Hongbin Zha
Nanjing University of Science and Technology, Nanjing, China
Jian Yang

About this paper

Cite this paper

Pan, Z., Zhu, A. (2020). Zero-Shot Learning Based on Salient Region and Enhanced Semantics. In: Peng, Y., et al. Pattern Recognition and Computer Vision. PRCV 2020. Lecture Notes in Computer Science, vol 12306. Springer, Cham. https://doi.org/10.1007/978-3-030-60639-8_32

DOI: https://doi.org/10.1007/978-3-030-60639-8_32
Published: 15 October 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-60638-1
Online ISBN: 978-3-030-60639-8
eBook Packages: Computer Science, Computer Science (R0)
