Abstract
Semantic analyses of object point clouds are largely driven by the release of benchmark datasets, including synthetic ones whose instances are sampled from object CAD models. However, models learned from synthetic data may not generalize to practical scenarios, where point clouds are typically incomplete, non-uniformly distributed, and noisy. This Simulation-to-Reality (Sim2Real) domain gap can be mitigated by domain-adaptation algorithms; we argue, however, that generating synthetic point clouds through more physically realistic rendering is a powerful alternative, since it captures systematic patterns of non-uniform noise. To this end, we propose an integrated scheme consisting of (1) physically realistic synthesis of object point clouds, which renders stereo images by projecting speckle patterns onto CAD models, and (2) a novel quasi-balanced self-training that achieves a more balanced data distribution through sparsity-driven selection of pseudo-labeled samples for long-tailed classes. Experimental results verify the effectiveness of our method and of both of its modules for unsupervised domain adaptation on point cloud classification, achieving state-of-the-art performance. Source code and the SpeckleNet synthetic dataset are available at https://github.com/Gorilla-Lab-SCUT/QS3.
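The quasi-balanced self-training described above can be illustrated with a minimal sketch: rarer (sparser) pseudo-label classes receive a larger per-class selection quota, so the selected training set is more class-balanced than plain confidence thresholding would produce. The function name, the `base_ratio` parameter, and the inverse-frequency quota rule below are illustrative assumptions, not the paper's exact criterion.

```python
import numpy as np

def quasi_balanced_select(confidences, pseudo_labels, num_classes, base_ratio=0.2):
    """Sketch of sparsity-driven pseudo-label selection: classes with
    fewer pseudo-labeled samples get a proportionally larger keep
    ratio, counteracting the long-tailed class distribution."""
    counts = np.bincount(pseudo_labels, minlength=num_classes).astype(float)
    freq = counts / max(counts.sum(), 1.0)
    # Inverse-frequency quota: the sparsest class keeps the most (up to 100%).
    keep_ratio = np.clip(base_ratio * (freq.max() / np.maximum(freq, 1e-8)), 0.0, 1.0)
    selected = []
    for c in range(num_classes):
        idx = np.where(pseudo_labels == c)[0]
        if idx.size == 0:
            continue
        k = max(1, int(round(keep_ratio[c] * idx.size)))
        # Within each class, keep the k most confident samples.
        top = idx[np.argsort(confidences[idx])[::-1][:k]]
        selected.extend(top.tolist())
    return np.array(sorted(selected))
```

With `base_ratio=0.25`, a class holding two-thirds of the pseudo labels keeps roughly a quarter of its samples, while a class half as frequent keeps roughly twice that fraction, yielding a near-balanced selected set.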
Y. Chen and Z. Wang contributed equally.
Acknowledgements
This work is supported in part by the National Natural Science Foundation of China (Grant No. 61902131), the Program for Guangdong Introducing Innovative and Entrepreneurial Teams (Grant No. 2017ZT07X183), and the Guangdong Provincial Key Laboratory of Human Digital Twin (Grant No. 2022B1212010004).
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Chen, Y., Wang, Z., Zou, L., Chen, K., Jia, K. (2022). Quasi-Balanced Self-Training on Noise-Aware Synthesis of Object Point Clouds for Closing Domain Gap. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13693. Springer, Cham. https://doi.org/10.1007/978-3-031-19827-4_42
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-19826-7
Online ISBN: 978-3-031-19827-4