
Few-Shot Object Detection via Disentangling Class-Related Factors in Feature Distribution

  • Conference paper
  • First Online:
Pattern Recognition and Computer Vision (PRCV 2024)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15043)


Abstract

Few-Shot Object Detection (FSOD) suffers from the long-tailed distribution of data and the large gap in sample quantity between base and novel classes, which introduces evident data bias. As a result, the learned feature distribution struggles to represent class features effectively. When samples are scarce, class-irrelevant factors can exert a disproportionate influence on the feature distribution and may even dominate the feature representation. To obtain more compact and accurate class-specific feature representations, this paper introduces disentangled representation learning into few-shot object detection and proposes a semantic disentanglement representation meta-learning model, referred to as FSOD-SDR. First, in the feature extraction phase, a feature information aggregation module aggregates features from different scales of the backbone, yielding a more comprehensive representation of support features that carry limited information. Second, to handle highly coupled features, a semantic disentanglement representation module simultaneously disentangles background-relevant and label-relevant semantic factor distributions from the aggregated features; the label-relevant distribution represents class features more accurately. To achieve this disentanglement effectively, the Evidence Lower Bound (ELBO) loss is extended during model optimization. Finally, experiments on the PASCAL VOC and MS COCO datasets show that FSOD-SDR improves over previous state-of-the-art methods by an average of 5.7% across all metrics, achieving competitive detection performance.
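As a rough illustration of the disentanglement step described in the abstract, the PyTorch sketch below splits an aggregated support feature into a background-relevant and a label-relevant Gaussian latent and optimizes an extended, ELBO-style loss with a classification term on the label-relevant branch. This is a minimal sketch under our own assumptions: the module names (SemanticDisentangler, GaussianEncoder), dimensions, and the exact composition of the loss are hypothetical and are not taken from the paper or its released code.

```python
# Hypothetical sketch (not the authors' implementation): a VAE-style head that
# disentangles background-relevant (z_b) and label-relevant (z_l) factors from
# an aggregated support feature, trained with an extended ELBO-style loss.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GaussianEncoder(nn.Module):
    """Maps a feature vector to the mean and log-variance of a diagonal Gaussian."""
    def __init__(self, in_dim: int, z_dim: int):
        super().__init__()
        self.mu = nn.Linear(in_dim, z_dim)
        self.logvar = nn.Linear(in_dim, z_dim)

    def forward(self, x):
        return self.mu(x), self.logvar(x)


class SemanticDisentangler(nn.Module):
    """Disentangles background-relevant and label-relevant factor distributions."""
    def __init__(self, feat_dim: int = 1024, z_dim: int = 128, num_classes: int = 20):
        super().__init__()
        self.bg_enc = GaussianEncoder(feat_dim, z_dim)     # background-relevant branch
        self.label_enc = GaussianEncoder(feat_dim, z_dim)  # label-relevant branch
        self.decoder = nn.Linear(2 * z_dim, feat_dim)      # reconstructs the aggregated feature
        self.classifier = nn.Linear(z_dim, num_classes)    # supervises the label-relevant latent

    @staticmethod
    def reparameterize(mu, logvar):
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)

    def forward(self, feat, labels=None):
        mu_b, lv_b = self.bg_enc(feat)
        mu_l, lv_l = self.label_enc(feat)
        z_b = self.reparameterize(mu_b, lv_b)
        z_l = self.reparameterize(mu_l, lv_l)
        recon = self.decoder(torch.cat([z_b, z_l], dim=-1))

        # Extended ELBO: reconstruction + KL terms for both latents, plus a
        # cross-entropy term that pushes class semantics into z_l.
        kl = lambda mu, lv: -0.5 * torch.mean(1 + lv - mu.pow(2) - lv.exp())
        loss = F.mse_loss(recon, feat) + kl(mu_b, lv_b) + kl(mu_l, lv_l)
        if labels is not None:
            loss = loss + F.cross_entropy(self.classifier(z_l), labels)
        return z_l, loss


if __name__ == "__main__":
    feats = torch.randn(8, 1024)         # toy aggregated support features
    labels = torch.randint(0, 20, (8,))  # toy class labels
    z_label, loss = SemanticDisentangler()(feats, labels)
    print(z_label.shape, loss.item())
```

In this reading, the label-relevant latent z_l would serve as the compact class-specific representation used downstream, while z_b absorbs background factors; the real model additionally relies on multi-scale feature aggregation and a meta-learning detection pipeline not shown here.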



Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grant No. 61966029.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 258 KB)


Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Wei, L., Tang, X., Dang, J. (2025). Few-Shot Object Detection via Disentangling Class-Related Factors in Feature Distribution. In: Lin, Z., et al. Pattern Recognition and Computer Vision. PRCV 2024. Lecture Notes in Computer Science, vol 15043. Springer, Singapore. https://doi.org/10.1007/978-981-97-8493-6_6

  • DOI: https://doi.org/10.1007/978-981-97-8493-6_6

  • Published:

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-97-8492-9

  • Online ISBN: 978-981-97-8493-6

  • eBook Packages: Computer Science, Computer Science (R0)
