Abstract
Few-shot point cloud 3D object detection (FS3D) aims to identify and localise objects of novel classes from point clouds, using knowledge learnt from annotated base classes together with very few annotated samples of the novel classes. Thus far, this challenging task has been approached through prototype learning, but the performance remains far from satisfactory. We find that in existing methods, the prototypes are only loosely constrained and lack fine-grained awareness of the semantic and geometrical correlations embedded within the point cloud space. To mitigate these issues, we propose to leverage the inherent contrastive relationships within the semantic and geometrical subspaces to learn more refined and generalisable prototypical representations. To this end, we first introduce contrastive semantics mining, which enables the network to extract discriminative categorical features by constructing positive and negative pairs within training batches. Meanwhile, since point features representing local patterns can be clustered into geometric components, we further propose to impose a contrastive relationship at the primitive level. Through refined primitive geometric structures, the transferability of feature encoding from base to novel classes is significantly enhanced. The above designs and insights lead to our novel Contrastive Prototypical VoteNet (CP-VoteNet). Extensive experiments on two FS3D benchmarks, FS-ScanNet and FS-SUNRGBD, demonstrate that CP-VoteNet surpasses current state-of-the-art methods by considerable margins across different FS3D settings. Further ablation studies corroborate the rationale and effectiveness of our designs.
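The abstract does not spell out the concrete loss behind contrastive semantics mining. As a rough illustration of "constructing positive and negative pairs within training batches", the sketch below implements a generic supervised InfoNCE-style contrastive loss in PyTorch; the function name, the temperature value, and the assumption that each in-batch feature carries a class label are illustrative choices, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(features: torch.Tensor,
                                labels: torch.Tensor,
                                temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE-style loss over a batch: features sharing a class label
    form positive pairs; all remaining in-batch features are negatives."""
    feats = F.normalize(features, dim=1)                   # (N, D) unit vectors
    logits = feats @ feats.t() / temperature               # (N, N) similarities
    n = feats.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=feats.device)
    logits = logits.masked_fill(self_mask, float('-inf'))  # exclude self-pairs
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    # Average the log-likelihood over each anchor's positives (if it has any).
    pos_counts = pos_mask.sum(dim=1)
    valid = pos_counts > 0
    pos_log_prob = log_prob.masked_fill(~pos_mask, 0.0).sum(dim=1)
    return -(pos_log_prob[valid] / pos_counts[valid]).mean()

# Example: 8 hypothetical proposal features of dimension 128 with class labels.
feats = torch.randn(8, 128)
labels = torch.tensor([0, 0, 1, 1, 2, 2, 3, 3])
loss = supervised_contrastive_loss(feats, labels)
```

In the paper's setting one would expect such a loss to be applied to per-proposal or per-point semantic features, with an analogous formulation acting on clustered geometric primitives.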
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Li, X., Zhang, W., Ma, C. (2025). CP-VoteNet: Contrastive Prototypical VoteNet for Few-Shot Point Cloud Object Detection. In: Lin, Z., et al. Pattern Recognition and Computer Vision. PRCV 2024. Lecture Notes in Computer Science, vol 15036. Springer, Singapore. https://doi.org/10.1007/978-981-97-8508-7_32
DOI: https://doi.org/10.1007/978-981-97-8508-7_32
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-8507-0
Online ISBN: 978-981-97-8508-7
eBook Packages: Computer Science, Computer Science (R0)