Fine-grained classification with few labeled samples is urgently needed in practice, since fine-grained samples are more difficult and expensive to collect and annotate. Standard few-shot learning (FSL) focuses on generalising from seen to unseen classes, where the classes are at the same level of granularity. Therefore, applying existing FSL methods to this problem requires large amounts of labeled samples for some fine-grained classes. Since samples of coarse-grained classes are much cheaper and easier to obtain, it is desirable to learn knowledge from coarse-grained categories that can be transferred to fine-grained classes with only a few samples. In this paper, we propose a novel learning problem called cross-granularity few-shot learning (CG-FSL), in which sufficient samples of coarse-grained classes are available for training, but the goal at test time is to classify the fine-grained subclasses. This learning paradigm follows the laws of cognitive neurology. We first analyse CG-FSL through a Structural Causal Model (SCM) and find that the standard FSL model learned at the coarse-grained level is actually a confounder. We thus perform backdoor adjustment to decouple the interference and consequently derive a causal CG-FSL model called Meta Attention-Generation Network (MAGN), which is trained in a bilevel optimization manner. We construct benchmarks for the CG-FSL problem from several fine-grained image datasets and empirically show that our model significantly outperforms standard FSL methods and baseline CG-FSL methods.
The CUB-200 dataset analysed during the current study is available at https://resolver.caltech.edu/CaltechAUTHORS:20111026-155425465. The Stanford Car dataset is available at http://ai.stanford.edu/jkrause/cars/car_dataset.html. The Stanford Dog dataset is available at http://vision.stanford.edu/aditya86/ImageNetDogs/main.html. The FGVC-Aircraft dataset is available at https://www.robots.ox.ac.uk/vgg/data/fgvc-aircraft/. The Oxford Flower dataset is available at https://www.robots.ox.ac.uk/vgg/data/flowers/102/. The Veg200 dataset is available at https://github.com/ustc-vim/vegfru. The Meta-iNat dataset is available at https://github.com/visipedia/inat-comp/tree/master/2017. The Meta-Dataset benchmark is available at https://github.com/google-research/meta-dataset. The tieredImageNet dataset is available at https://bair.berkeley.edu/blog/2017/07/18/learning-to-learn/. The miniImageNet dataset is available at https://github.com/twitter-research/meta-learning-lstm.
This work is supported in part by the National Natural Science Foundation of China No. 61976206 and No. 61832017, Beijing Outstanding Young Scientist Program No. BJJWZYJH012019100020098, Foshan HKUST Projects (FSUST21-FYTRI01A, FSUST21-FYTRI02A), Beijing Academy of Artificial Intelligence (BAAI), the Fundamental Research Funds for the Central Universities, the Research Funds of Renmin University of China 21XNLG05, and Public Computing Cloud, Renmin University of China.
