Abstract
Most zero-shot learning (ZSL) algorithms currently use models pre-trained on ImageNet as their feature extractors, which is considered an effective way to improve the feature extraction ability of ZSL models. However, we found that this practice works poorly when the training data of the ZSL task differs greatly from ImageNet. Although the pre-trained models can be adapted to the ZSL task by fine-tuning, the resulting extractors are not guaranteed to be friendly to the unseen classes. To address these problems, we further study a biologically inspired feature enhancement framework for ZSL that we proposed earlier and refine its biological taxonomy-based method for selecting auxiliary datasets. Moreover, we propose, for the first time, a word2vec-based selection strategy as a supplement to the biologically inspired selection method, and we experimentally demonstrate the inherent unity of the two methods. Extensive experimental results show that the proposed method effectively improves the generalization ability of ZSL models and achieves state-of-the-art results on benchmarks. We also explain the experimental phenomena through feature visualization.
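The abstract does not say how the word2vec-based selection is computed. As a rough illustration only, one plausible reading is to rank candidate auxiliary datasets by the cosine similarity between the averaged word vectors of their class names and those of the ZSL task's class names. The function names, the averaging step, and the toy 3-dimensional vectors below are all assumptions made for this sketch and are not taken from the paper or its released code.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def rank_auxiliary(target_vecs, candidates):
    """Rank candidate datasets by similarity to the target classes.

    Hypothetical helper: each dataset is summarized by the mean of its
    class-name vectors, and candidates are sorted from most to least similar.
    """
    target = mean_vector(target_vecs)
    scores = {
        name: cosine(target, mean_vector(vecs))
        for name, vecs in candidates.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Toy vectors standing in for real word2vec embeddings (made up for illustration).
target_classes = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]]
candidates = {
    "aux_close": [[0.85, 0.15, 0.05], [0.9, 0.05, 0.0]],
    "aux_far": [[0.0, 0.2, 0.9], [0.1, 0.1, 0.8]],
}
ranking = rank_auxiliary(target_classes, candidates)
```

In practice the toy vectors would be replaced by embeddings looked up from a trained word2vec model, and the top-ranked candidate would be the auxiliary dataset considered semantically closest to the target task under this assumed criterion.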
Notes
Source codes for our method and related algorithms: https://github.com/Wepond/Biologically-Inspired-ZSL-framework.
Acknowledgements
The authors would like to thank Prof. Xizhao Wang of the College of Computer Science and Software Engineering, Shenzhen University, China, for his valuable suggestions, which greatly improved the manuscript. This work was supported by the National Natural Science Foundation of China (61836005, 61732011, and 61976141), the Basic Research Project of the Knowledge Innovation Program in Shenzhen (JCYJ20180305125850156), and the Opening Project of Shanghai Trusted Industrial Control Platform (TICPSH202003008-ZC).
Cite this article
Xie, Z., Cao, W. & Ming, Z. A further study on biologically inspired feature enhancement in zero-shot learning. Int. J. Mach. Learn. & Cyber. 12, 257–269 (2021). https://doi.org/10.1007/s13042-020-01170-y
DOI: https://doi.org/10.1007/s13042-020-01170-y