DOI: 10.1145/3637843.3637848
research-article

Enhancing Few-Shot 3D Point Cloud Semantic Segmentation through Bidirectional Prototype Learning

Published: 06 March 2024

ABSTRACT

Point cloud semantic segmentation has advanced significantly in recent years, but performance degrades when training lacks sufficient densely annotated samples, especially when the model faces new classes unseen during training. With limited data and unfamiliar categories, learning efficiency becomes a major concern for the overall segmentation outcome. To improve segmentation performance under this few-shot training condition, we introduce a bidirectional learning method that allows mutual prototype learning between the support set and the query set. Specifically, we improve efficiency by exploiting the support and query sets more fully, extracting information and generating prototypes in two opposite learning directions. Models refined by our method achieve better performance on few-shot 3D semantic segmentation tasks without introducing additional parameters that would increase model complexity. To validate our method, we test different models in 1-shot and 5-shot settings on the S3DIS [23] dataset. The markedly improved IoU scores on unseen classes in the evaluation show the effectiveness of the proposed method.
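The abstract does not give the details of the bidirectional scheme, so the following is only a minimal sketch of the general idea under stated assumptions: prototypes are per-class mean features, points are assigned to the nearest prototype by squared Euclidean distance, and the reverse direction builds query prototypes from the query's predicted labels and uses them to segment the support set. The function names, the toy 2-D features, and the hard pseudo-labels are all hypothetical choices for illustration, not the authors' implementation. The last helper shows the per-class IoU metric mentioned in the abstract.

```python
import math


def class_prototypes(features, labels, num_classes):
    """Mean feature vector per class; None for classes with no points."""
    dim = len(features[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes
    for feat, label in zip(features, labels):
        counts[label] += 1
        for d in range(dim):
            sums[label][d] += feat[d]
    return [
        [s / counts[c] for s in sums[c]] if counts[c] else None
        for c in range(num_classes)
    ]


def nearest_prototype(features, prototypes):
    """Assign each point the index of its closest prototype."""
    preds = []
    for feat in features:
        best_class, best_dist = None, math.inf
        for c, proto in enumerate(prototypes):
            if proto is None:  # class absent, nothing to compare against
                continue
            dist = sum((a - b) ** 2 for a, b in zip(feat, proto))
            if dist < best_dist:
                best_class, best_dist = c, dist
        preds.append(best_class)
    return preds


def bidirectional_step(support_feats, support_labels, query_feats, num_classes):
    """Support -> query prediction, then query -> support prediction."""
    # Forward direction: prototypes from the labelled support set.
    support_protos = class_prototypes(support_feats, support_labels, num_classes)
    query_preds = nearest_prototype(query_feats, support_protos)
    # Reverse direction (assumed form): prototypes from the query's
    # predicted labels, used to segment the support set in turn.
    query_protos = class_prototypes(query_feats, query_preds, num_classes)
    support_preds = nearest_prototype(support_feats, query_protos)
    return query_preds, support_preds


def iou(preds, labels, cls):
    """Intersection over union for one class; NaN if the class never occurs."""
    inter = sum(p == cls and y == cls for p, y in zip(preds, labels))
    union = sum(p == cls or y == cls for p, y in zip(preds, labels))
    return inter / union if union else float("nan")


# Toy 1-shot episode with 2-D point features and two classes.
support_feats = [[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1]]
support_labels = [0, 0, 1, 1]
query_feats = [[0.1, 0.0], [1.0, 0.9], [0.8, 1.0]]

query_preds, support_preds = bidirectional_step(
    support_feats, support_labels, query_feats, num_classes=2
)
support_iou_class1 = iou(support_preds, support_labels, 1)
```

In a trained model, the disagreement between `support_preds` and `support_labels` could serve as an extra supervision signal at no parameter cost, which is consistent with the abstract's claim of not adding parameters; whether the paper uses it this way is not stated in the abstract.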

References

  1. Charles R. Qi, Hao Su, Kaichun Mo, and Leonidas J. Guibas. 2017. PointNet: Deep learning on point sets for 3D classification and segmentation. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 2017, 77-85. https://doi.org/10.1109/CVPR.2017.16.
  2. Charles R. Qi, Li Yi, Hao Su, and Leonidas J. Guibas. 2017. PointNet++: Deep hierarchical feature learning on point sets in a metric space. arXiv:1706.02413. https://doi.org/10.48550/arXiv.1706.02413.
  3. Jake Snell, Kevin Swersky, and Richard S. Zemel. 2017. Prototypical networks for few-shot learning. NIPS'17: Proceedings of the 31st International Conference on Neural Information Processing Systems, December 2017, 4080-4090. arXiv:1703.05175. https://doi.org/10.48550/arXiv.1703.05175.
  4. Na Zhao, Tat-Seng Chua, and Gim Hee Lee. 2021. Few-shot 3D point cloud semantic segmentation. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 2021, 8869-8878. https://doi.org/10.1109/CVPR46437.2021.00876.
  5. Xian Zhong, Cheng Gu, Wenxin Huang, Lin Li, Shuqin Chen, and Chia-Wen Lin. 2020. Complementing representation deficiency in few-shot image classification: A Meta-Learning approach. 25th International Conference on Pattern Recognition (ICPR). arXiv:2007.10778. https://doi.org/10.48550/arXiv.2007.10778.
  6. Ardhendu Shekhar Tripathi, Martin Danelljan, Luc Van Gool, and Radu Timofte. 2021. Fast Few-Shot classification by Few-Iteration Meta-Learning. IEEE International Conference on Robotics and Automation (ICRA). arXiv:2010.00511. https://doi.org/10.48550/arXiv.2010.00511.
  7. Sepp Hochreiter, Arthur S. Younger, and Peter R. Conwell. 2001. Learning to learn using gradient descent. International Conference on Artificial Neural Networks, Springer, 87-94. https://doi.org/10.1007/3-540-44668-0_13.
  8. Jane X. Wang, Zeb Kurth-Nelson, Dhruva Tirumala, Hubert Soyer, Joel Z. Leibo, Remi Munos, Charles Blundell, Dharshan Kumaran, and Matt Botvinick. 2016. Learning to reinforcement learn. arXiv:1611.05763. https://doi.org/10.48550/arXiv.1611.05763.
  9. Oriol Vinyals, Charles Blundell, Timothy Lillicrap, Koray Kavukcuoglu, and Daan Wierstra. 2016. Matching networks for one shot learning. Advances in Neural Information Processing Systems, June 2016, 3630-3638. arXiv:1606.04080. https://doi.org/10.48550/arXiv.1606.04080.
  10. Chi Zhang, Guosheng Lin, Fayao Liu, Rui Yao, and Chunhua Shen. 2019. CANet: Class-Agnostic segmentation networks with iterative refinement and attentive Few-Shot learning. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 2019, 5212-5221. https://doi.org/10.1109/CVPR.2019.00536.
  11. Rinu Boney and Alexander Ilin. 2018. Semi-Supervised and active Few-Shot learning with prototypical networks. arXiv:1711.10856. https://doi.org/10.48550/arXiv.1711.10856.
  12. Nanqing Dong and Eric P. Xing. 2018. Few-Shot semantic segmentation with prototype learning. British Machine Vision Conference, September 2018.
  13. Gen Li, Varun Jampani, Laura Sevilla-Lara, Deqing Sun, Jonghyun Kim, and Joongkyu Kim. 2021. Adaptive prototype learning and allocation for Few-Shot segmentation. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 2021, 8330-8339. https://doi.org/10.1109/CVPR46437.2021.00823.
  14. Sachin Ravi and Hugo Larochelle. 2017. Optimization as a model for few-shot learning. International Conference on Learning Representations (ICLR).
  15. Kaixin Wang, Jun Hao Liew, Yingtian Zou, Daquan Zhou, and Jiashi Feng. 2019. PANet: Few-Shot image semantic segmentation with prototype alignment. 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Korea (South), 2019, 9196-9205. https://doi.org/10.1109/ICCV.2019.00929.
  16. Xudong Li, Li Feng, Lei Li, and Chen Wang. 2021. Few-shot Meta-learning on Point Cloud for Semantic Segmentation. arXiv:2104.02979. https://doi.org/10.48550/arXiv.2104.02979.
  17. Kate Rakelly, Evan Shelhamer, Trevor Darrell, Alexei A. Efros, and Sergey Levine. 2018. Few-Shot segmentation propagation with guided networks. arXiv:1806.07373. https://doi.org/10.48550/arXiv.1806.07373.
  18. Yue Wang, Yongbin Sun, Ziwei Liu, Sanjay E. Sarma, Michael M. Bronstein, and Justin M. Solomon. 2018. Dynamic graph CNN for learning on point clouds. ACM Transactions on Graphics 38(5), January 2018. arXiv:1801.07829. https://doi.org/10.48550/arXiv.1801.07829.
  19. Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. Computation and Language (cs.CL); Machine Learning (cs.LG). arXiv:1706.03762. https://doi.org/10.48550/arXiv.1706.03762.
  20. Mingtao Feng, Liang Zhang, Xuefei Lin, Syed Zulqarnain Gilani, and Ajmal Mian. 2019. Point attention network for semantic segmentation of 3D point clouds. arXiv:1909.12663. https://doi.org/10.48550/arXiv.1909.12663.
  21. Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. 2021. An image is worth 16x16 words: Transformers for image recognition at scale. International Conference on Learning Representations (ICLR). arXiv:2010.11929. https://doi.org/10.48550/arXiv.2010.11929.
  22. Jinlu Liu and Yongqiang Qin. 2020. Prototype refinement network for Few-Shot segmentation. arXiv:2002.03579. https://doi.org/10.48550/arXiv.2002.03579.
  23. Iro Armeni, Ozan Sener, Amir R. Zamir, Helen Jiang, Ioannis Brilakis, Martin Fischer, and Silvio Savarese. 2016. 3D semantic parsing of Large-Scale indoor spaces. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 2016, 1534-1543. https://doi.org/10.1109/CVPR.2016.170.

  • Published in

    ICRAI '23: Proceedings of the 2023 9th International Conference on Robotics and Artificial Intelligence
    November 2023
    72 pages
    ISBN: 9798400708282
    DOI: 10.1145/3637843

    Copyright © 2023 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • research-article
    • Research
    • Refereed limited