Semantic Auto-Encoder with L2-norm Constraint for Zero-Shot Learning

Authors:
Yuhao Wu, Shenzhen University, China
Weipeng Cao, Shenzhen University, China
Ye Liu, Shenzhen University, China
Zhong Ming, Shenzhen University, China
Jianqiang Li, Shenzhen University, China
Bo Lu, Southwest Oil and Natural Gas Branch of Sinopec, China

ICMLC '21: Proceedings of the 2021 13th International Conference on Machine Learning and Computing, February 2021, Pages 101–105, https://doi.org/10.1145/3457682.3457699

Published: 21 June 2021

ABSTRACT

Zero-shot learning (ZSL) is an effective paradigm for predicting labels when some classes have no training samples. Many ZSL algorithms have been proposed in recent years. Among them, the semantic autoencoder (SAE) is widely used because of its simplicity and good generalization ability. However, we found that most existing SAE-based methods rely on implicit constraints to guarantee the quality of the mapping between the feature space and the semantic space. Such implicit constraints are insufficient for minimizing the structural risk of the model and can easily cause over-fitting. To address this problem, we propose a novel SAE algorithm with an L2-norm constraint (SAE-L2). SAE-L2 adds an L2 regularization term on the mapping parameters to its optimization objective, which explicitly enforces structural risk minimization. Extensive experiments on four benchmark datasets show that SAE-L2 achieves better performance than the original SAE model and other ZSL algorithms.
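
The abstract does not state the SAE-L2 objective itself, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes a linear encoder W trained with a reconstruction term, a projection term weighted by lam, and an added L2 penalty gamma * ||W||_F^2 on the mapping parameters. Setting the gradient of that assumed objective to zero gives a Sylvester equation, which the sketch solves through a Kronecker-product linear system (practical only for small dimensions; a dedicated Sylvester solver would likely be preferable for real feature sizes). The names fit_sae_l2, predict_unseen, lam, and gamma are hypothetical.

```python
import numpy as np


def fit_sae_l2(X, S, lam=0.5, gamma=1.0):
    """Fit a linear encoder W (k x d) under an ASSUMED SAE-L2 objective.

    Assumed objective (not taken from the paper):
        ||X - W.T @ S||_F^2 + lam * ||W @ X - S||_F^2 + gamma * ||W||_F^2
    X: (d, N) visual features, S: (k, N) semantic vectors.
    """
    d, k = X.shape[0], S.shape[0]
    # A zero gradient gives the Sylvester equation A @ W + W @ B = C.
    A = S @ S.T + gamma * np.eye(k)  # gamma * I comes from the L2 penalty
    B = lam * (X @ X.T)
    C = (1.0 + lam) * (S @ X.T)
    # Column-major vectorization: (I_d kron A + B.T kron I_k) vec(W) = vec(C).
    K = np.kron(np.eye(d), A) + np.kron(B.T, np.eye(k))
    w = np.linalg.solve(K, C.flatten(order="F"))
    return w.reshape((k, d), order="F")


def predict_unseen(W, X_test, prototypes):
    """Label each test column by its most cosine-similar class prototype.

    X_test: (d, M) features, prototypes: (k, C) unseen-class semantic vectors.
    """
    S_hat = W @ X_test  # project features into the semantic space
    S_hat = S_hat / np.linalg.norm(S_hat, axis=0, keepdims=True)
    P = prototypes / np.linalg.norm(prototypes, axis=0, keepdims=True)
    return np.argmax(P.T @ S_hat, axis=0)
```

In this sketch, gamma = 0 recovers the unregularized objective, and increasing gamma shrinks the Frobenius norm of W, which is the explicit control on model complexity that the abstract describes.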

Published in

ICMLC '21: Proceedings of the 2021 13th International Conference on Machine Learning and Computing
February 2021
601 pages
ISBN: 9781450389310
DOI: 10.1145/3457682

Copyright © 2021 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
L2-norm constraint
Zero-shot learning
semantic auto-encoder
Qualifiers
- research-article
- Research
- Refereed limited