Abstract
In few-shot learning, a series of semantic-based methods have shown excellent performance by fusing the visual and semantic modalities. In single-shot learning, however, the fused visual modality fails to capture the class information comprehensively, because only one image per class is available. To address this issue, we propose a semantic-based single-shot method that considers both local and global perspectives. Specifically, we replace the image-level features traditionally used for modality fusion in semantic-based methods with local visual features. Moreover, we introduce a global classification loss that enlarges the encoding space, yielding accurate and distinguishable local embeddings. Through a series of experiments, we show that by exploiting local features from a global classification perspective, our model improves the performance of semantic-based approaches by a large margin on two different data sets, and that the global classification loss is effective on both metrics.
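The local-global idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the fusion rule, the matching measure, or the form of the loss, so the sketch assumes a fixed convex combination for fusion, a best-match cosine score between local descriptors, and a plain softmax cross-entropy over all training classes as the global classification loss. All function names, the parameter `lam`, and the toy vectors are hypothetical.

```python
import math

def fuse_local_with_semantic(local_feats, semantic, lam):
    """Blend every local descriptor with the class semantic vector.

    local_feats: list of d-dim local descriptors from the single support image
    semantic:    d-dim semantic embedding of the class (assumed already projected to d dims)
    lam:         mixing weight in [0, 1]; a fixed scalar here (assumption)
    """
    return [[lam * v + (1.0 - lam) * s for v, s in zip(desc, semantic)]
            for desc in local_feats]

def cosine(a, b):
    """Cosine similarity between two vectors, with a small epsilon for stability."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb + 1e-12)

def image_to_class_score(query_locals, class_locals):
    """Average, over query descriptors, of the best cosine match within the class."""
    best = [max(cosine(q, c) for c in class_locals) for q in query_locals]
    return sum(best) / len(best)

def global_classification_loss(logits, label):
    """Cross-entropy over all training classes, using a stable log-sum-exp."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

# Toy single-shot episode: one support image per class, two local descriptors each.
support = {"cat": [[1.0, 0.0], [0.8, 0.2]], "dog": [[0.0, 1.0], [0.1, 0.9]]}
semantics = {"cat": [1.0, 0.0], "dog": [0.0, 1.0]}
fused = {c: fuse_local_with_semantic(support[c], semantics[c], lam=0.5)
         for c in support}

query = [[0.9, 0.1], [0.7, 0.3]]
scores = {c: image_to_class_score(query, fused[c]) for c in fused}
predicted = max(scores, key=scores.get)

# Global loss for a sample whose logits over two training classes are equal.
loss = global_classification_loss([0.0, 0.0], 0)
```

In this sketch the semantic vector shifts every local descriptor of the support image toward its class, which is one way to compensate for having a single image, while the loss is computed over all training classes rather than only those in the episode.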
Acknowledgement
This research is partly supported by the Ministry of Science and Technology, China (No. 2019YFB1311503) and the Committee of Science and Technology, Shanghai, China (No. 19510711200).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Cai, J., Fang, K., Yu, W., Yang, J., Qiao, Y. (2023). Local-Global Semantic Fusion Single-shot Classification Method. In: Tanveer, M., Agarwal, S., Ozawa, S., Ekbal, A., Jatowt, A. (eds) Neural Information Processing. ICONIP 2022. Lecture Notes in Computer Science, vol 13623. Springer, Cham. https://doi.org/10.1007/978-3-031-30105-6_11
DOI: https://doi.org/10.1007/978-3-031-30105-6_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-30104-9
Online ISBN: 978-3-031-30105-6
eBook Packages: Computer Science, Computer Science (R0)