
Few-Shot Object Detection via Disentangling Class-Related Factors in Feature Distribution

  • Conference paper
  • First Online:
Pattern Recognition and Computer Vision (PRCV 2024)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15043)


Abstract

Few-Shot Object Detection (FSOD) suffers from the long-tailed distribution of data and the large gap in sample quantity between base and novel classes, which introduces evident data bias. As a result, the learned feature distribution struggles to represent class features effectively. When samples are scarce, class-irrelevant factors can exert a disproportionate influence on the feature distribution and may even dominate the feature representation. To obtain more compact and accurate class-specific feature representations, this paper introduces disentangled representation learning into few-shot object detection and proposes a semantic disentanglement representation meta-learning model, referred to as FSOD-SDR. First, in the feature extraction phase, a feature information aggregation module aggregates features from different scales of the backbone, yielding a more comprehensive representation of support features that carry limited information. Second, to handle highly coupled features, a semantic disentanglement representation module simultaneously disentangles background-relevant and label-relevant semantic factor distributions from the aggregated features; the label-relevant distribution represents class features more accurately. To achieve this disentanglement effectively, the Evidence Lower Bound (ELBO) loss is extended during model optimization. Finally, experiments on the PASCAL VOC and MS COCO datasets show that FSOD-SDR improves over previous state-of-the-art methods by an average of 5.7% across all metrics, achieving competitive detection performance.
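As a rough illustration of the disentanglement step described in the abstract, the PyTorch sketch below splits an aggregated support feature into a background-relevant and a label-relevant Gaussian latent and optimizes an extended, ELBO-style loss with a classification term on the label-relevant branch. This is a minimal sketch under our own assumptions: the module names (SemanticDisentangler, GaussianEncoder), dimensions, and the exact composition of the loss are hypothetical and are not taken from the paper or its released code.

```python
# Hypothetical sketch (not the authors' implementation): a VAE-style head that
# disentangles background-relevant (z_b) and label-relevant (z_l) factors from
# an aggregated support feature, trained with an extended ELBO-style loss.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GaussianEncoder(nn.Module):
    """Maps a feature vector to the mean and log-variance of a diagonal Gaussian."""
    def __init__(self, in_dim: int, z_dim: int):
        super().__init__()
        self.mu = nn.Linear(in_dim, z_dim)
        self.logvar = nn.Linear(in_dim, z_dim)

    def forward(self, x):
        return self.mu(x), self.logvar(x)


class SemanticDisentangler(nn.Module):
    """Disentangles background-relevant and label-relevant factor distributions."""
    def __init__(self, feat_dim: int = 1024, z_dim: int = 128, num_classes: int = 20):
        super().__init__()
        self.bg_enc = GaussianEncoder(feat_dim, z_dim)     # background-relevant branch
        self.label_enc = GaussianEncoder(feat_dim, z_dim)  # label-relevant branch
        self.decoder = nn.Linear(2 * z_dim, feat_dim)      # reconstructs the aggregated feature
        self.classifier = nn.Linear(z_dim, num_classes)    # supervises the label-relevant latent

    @staticmethod
    def reparameterize(mu, logvar):
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)

    def forward(self, feat, labels=None):
        mu_b, lv_b = self.bg_enc(feat)
        mu_l, lv_l = self.label_enc(feat)
        z_b = self.reparameterize(mu_b, lv_b)
        z_l = self.reparameterize(mu_l, lv_l)
        recon = self.decoder(torch.cat([z_b, z_l], dim=-1))

        # Extended ELBO: reconstruction + KL terms for both latents, plus a
        # cross-entropy term that pushes class semantics into z_l.
        kl = lambda mu, lv: -0.5 * torch.mean(1 + lv - mu.pow(2) - lv.exp())
        loss = F.mse_loss(recon, feat) + kl(mu_b, lv_b) + kl(mu_l, lv_l)
        if labels is not None:
            loss = loss + F.cross_entropy(self.classifier(z_l), labels)
        return z_l, loss


if __name__ == "__main__":
    feats = torch.randn(8, 1024)         # toy aggregated support features
    labels = torch.randint(0, 20, (8,))  # toy class labels
    z_label, loss = SemanticDisentangler()(feats, labels)
    print(z_label.shape, loss.item())
```

In this reading, the label-relevant latent z_l would serve as the compact class-specific representation used downstream, while z_b absorbs background factors; the real model additionally relies on multi-scale feature aggregation and a meta-learning detection pipeline not shown here.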



Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grant No. 61966029.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 258 KB)


Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Wei, L., Tang, X., Dang, J. (2025). Few-Shot Object Detection via Disentangling Class-Related Factors in Feature Distribution. In: Lin, Z., et al. Pattern Recognition and Computer Vision. PRCV 2024. Lecture Notes in Computer Science, vol 15043. Springer, Singapore. https://doi.org/10.1007/978-981-97-8493-6_6

  • DOI: https://doi.org/10.1007/978-981-97-8493-6_6

  • Published:

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-97-8492-9

  • Online ISBN: 978-981-97-8493-6

  • eBook Packages: Computer Science, Computer Science (R0)
