ABSTRACT
Zero-shot learning (ZSL) is an effective paradigm for predicting labels when some classes have no training samples. Many ZSL algorithms have been proposed in recent years. Among them, the semantic autoencoder (SAE) is widely used because of its simplicity and good generalization ability. However, we found that most existing SAE-based methods rely on implicit constraints to guarantee the quality of the mapping between the feature space and the semantic space. Such implicit constraints are insufficient for minimizing the structural risk of the model and are prone to causing over-fitting. To address this problem, we propose a novel SAE algorithm with an L2-norm constraint (SAE-L2). SAE-L2 adds an L2 regularization term on the mapping parameters to its optimization objective, which explicitly enforces structural risk minimization. Extensive experiments on four benchmark datasets show that SAE-L2 achieves better performance than the original SAE model and other ZSL algorithms.
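The idea of adding an L2 term to the SAE objective can be illustrated with a minimal sketch. The abstract does not state the exact SAE-L2 objective, so the form used here is an assumption: the usual tied-weight SAE loss plus a Frobenius-norm penalty, ||X - WᵀS||² + lam·||WX - S||² + gamma·||W||². The names `fit_sae_l2`, `predict_unseen`, `lam`, and `gamma` are hypothetical. Setting the gradient to zero gives a Sylvester equation, which this sketch solves by eigendecomposition (valid because both coefficient matrices are symmetric) rather than with a general Sylvester solver.

```python
import numpy as np


def fit_sae_l2(X, S, lam=1.0, gamma=1.0):
    """Closed-form encoder W (k x d) for the ASSUMED objective
    ||X - W.T @ S||_F^2 + lam * ||W @ X - S||_F^2 + gamma * ||W||_F^2.

    X: (d, N) visual features, S: (k, N) semantic vectors, one column per sample.
    """
    k = S.shape[0]
    # Zero gradient gives the Sylvester equation A @ W + W @ B = C.
    A = S @ S.T + gamma * np.eye(k)   # (k, k), positive definite for gamma > 0
    B = lam * (X @ X.T)               # (d, d), positive semidefinite
    C = (1.0 + lam) * (S @ X.T)       # (k, d)
    # A and B are symmetric, so diagonalize both and solve entry by entry.
    a, U = np.linalg.eigh(A)
    b, V = np.linalg.eigh(B)
    Y = (U.T @ C @ V) / (a[:, None] + b[None, :])
    return U @ Y @ V.T


def predict_unseen(W, X_test, prototypes, eps=1e-12):
    """Project features into semantic space and return, for each sample,
    the index of the most cosine-similar class prototype.

    X_test: (d, M) features, prototypes: (k, C) one column per unseen class.
    """
    S_hat = W @ X_test
    S_hat = S_hat / (np.linalg.norm(S_hat, axis=0, keepdims=True) + eps)
    P = prototypes / (np.linalg.norm(prototypes, axis=0, keepdims=True) + eps)
    return np.argmax(P.T @ S_hat, axis=0)
```

In this sketch, `gamma` only shifts the eigenvalues of the semantic-side matrix, so the closed-form solution keeps the same cost as the unregularized one while shrinking the norm of W as `gamma` grows.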