ABSTRACT
Few-shot object detection has drawn increasing attention in computer vision. One widely accepted task setting is to train a network to detect both base classes, which have abundant images, and novel classes, which have only a few. Under this setting, two classic pipelines for few-shot object detection have been developed. One is fine-tuning, which trains the detection network on images of base classes and then fine-tunes the last layer on images of both base and novel classes. The other is meta-learning, in which one pioneering model uses a meta-learner to transform support images into reweighting vectors; these vectors reweight the features that the feature extractor produces for the query images. A typical meta-learning method splits training into two phases, meta-training and meta-testing: the model is first trained on base classes in the meta-training phase, and then on both base and novel classes in the meta-testing phase. In this paper, we combine these two pipelines. For the network structure, we tailor Faster-RCNN to the reweighting module; for training, we follow the meta-training procedure and, during meta-testing, fine-tune only the reweighting module and the last layer of Faster-RCNN. Experiments on NWPU VHR-10 images show that our method improves mAP by about 10 to 20 percentage points over both the reweighting and fine-tuning methods.
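The reweighting step and the two-phase schedule described in the abstract can be illustrated with a minimal sketch. It assumes that reweighting is a channel-wise multiplication of the query feature map by one vector per class, as in feature-reweighting detectors generally; the abstract does not spell out the exact operation. The function names, the part names (`feature_extractor`, `reweighting_module`, `detection_head`, `last_layer`), the class labels, and the choice to train every part during meta-training are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List

# A feature map laid out as [channel][row][column].
FeatureMap = List[List[List[float]]]


def reweight_features(
    query: FeatureMap, class_vectors: Dict[str, List[float]]
) -> Dict[str, FeatureMap]:
    """Scale each channel of the query feature map by the matching entry
    of each class's reweighting vector (assumed channel-wise product)."""
    reweighted: Dict[str, FeatureMap] = {}
    for class_name, vector in class_vectors.items():
        if len(vector) != len(query):
            raise ValueError("reweighting vector length must equal the channel count")
        reweighted[class_name] = [
            [[weight * value for value in row] for row in channel]
            for weight, channel in zip(vector, query)
        ]
    return reweighted


def trainable_parts(phase: str) -> List[str]:
    """Return the (hypothetical) parts that receive gradient updates in a phase."""
    if phase == "meta-training":
        # Assumption: every part is trained on base classes in this phase.
        return ["feature_extractor", "reweighting_module", "detection_head"]
    if phase == "meta-testing":
        # From the abstract: only the reweighting module and the last layer
        # of the detector are fine-tuned on base and novel classes.
        return ["reweighting_module", "last_layer"]
    raise ValueError(f"unknown phase: {phase}")


# Toy query feature map with 2 channels, each 1 row by 2 columns.
query = [[[1.0, 2.0]], [[3.0, 4.0]]]
# One reweighting vector per class, as a meta-learner would produce
# from the support images (values here are made up).
vectors = {"class_a": [2.0, 0.5], "class_b": [0.0, 1.0]}

result = reweight_features(query, vectors)
print(result["class_a"])
print(trainable_parts("meta-testing"))
```

Each class gets its own reweighted copy of the query features, so a downstream detector can score every class separately. In a real framework the meta-testing phase would correspond to freezing all parameters outside the two listed parts.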
Index Terms
- Fine-tuning Faster-RCNN tailored to Feature Reweighting for Few-shot Object Detection