Optimization over Disentangled Encoding: Unsupervised Cross-Domain Point Cloud Completion via Occlusion Factor Manipulation

Gong, Jingyu; Liu, Fengqi; Xu, Jiachen; Wang, Min; Tan, Xin; Zhang, Zhizhong; Yi, Ran; Song, Haichuan; Xie, Yuan; Ma, Lizhuang

doi:10.1007/978-3-031-20086-1_30

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13662)

Included in the following conference series:

European Conference on Computer Vision

Abstract

Studies that consider domain gaps in shape completion have recently attracted more attention because supervised methods perform poorly on real scans. These studies address the gap in the input scans but ignore the gap in the output prediction, which is specific to completion. In this paper, we disentangle partial scans into three factors (domain, shape, and occlusion) to handle the output gap in cross-domain completion. For factor learning, we design self-supervised view-point prediction and domain classification tasks and introduce a factor permutation consistency regularization to ensure factor independence. Scans can then be completed by simply manipulating occlusion factors while preserving domain and shape information. To further adapt to instances in the target domain, we introduce an optimization stage that maximizes the consistency between completed shapes and input scans. Extensive experiments on real scans and synthetic datasets show that our method outperforms previous methods by a large margin and is encouraging for future work. Code is available at https://github.com/azuki-miho/OptDE.
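The ideas in the abstract can be illustrated with a toy sketch. This is not the authors' implementation (that is in the repository linked above); the names `FactorCode`, `set_occlusion`, `swap_factor`, and `one_sided_chamfer` are hypothetical, and using a one-sided Chamfer distance to measure consistency between a completed shape and its input scan is an assumption made here for illustration only.

```python
from dataclasses import dataclass, replace
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


@dataclass(frozen=True)
class FactorCode:
    """Hypothetical latent code split into the three factors named in the abstract."""
    domain: Tuple[float, ...]
    shape: Tuple[float, ...]
    occlusion: Tuple[float, ...]


def set_occlusion(code: FactorCode, occlusion: Sequence[float]) -> FactorCode:
    """Replace only the occlusion factor; domain and shape are left unchanged."""
    return replace(code, occlusion=tuple(occlusion))


def swap_factor(a: FactorCode, b: FactorCode, name: str):
    """Exchange one named factor between two codes (a simple factor permutation)."""
    return (replace(a, **{name: getattr(b, name)}),
            replace(b, **{name: getattr(a, name)}))


def one_sided_chamfer(partial: Sequence[Point], completed: Sequence[Point]) -> float:
    """Mean squared distance from each partial-scan point to its nearest completed point.

    Assumed consistency measure: a low value means the completed shape
    still covers the points observed in the input scan.
    """
    if not partial or not completed:
        raise ValueError("both point sets must be non-empty")
    total = 0.0
    for p in partial:
        total += min(sum((pi - qi) ** 2 for pi, qi in zip(p, q)) for q in completed)
    return total / len(partial)


if __name__ == "__main__":
    scan_code = FactorCode(domain=(1.0,), shape=(0.2, 0.7), occlusion=(0.9, 0.1))
    # Hypothetical "no occlusion" code; a real system would learn or define this.
    completed_code = set_occlusion(scan_code, (0.0, 0.0))
    print(completed_code)
    print(one_sided_chamfer([(0.0, 0.0, 0.0)], [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]))
```

In a real pipeline the edited code would be passed to a decoder to produce the completed point cloud, and a consistency term of this kind could be minimized with respect to the latent factors during the optimization stage. The sketch only shows the bookkeeping: which factor changes, which factors are preserved, and how agreement with the input scan might be scored.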

J. Gong and F. Liu—Equal Contribution.

Acknowledgments

This work is sponsored by the National Key Research and Development Program of China (No. 2019YFC1521104), the National Natural Science Foundation of China (Nos. 61972157, 72192821), Shanghai Municipal Science and Technology Major Project (2021SHZDZX0102), Shanghai Sailing Program (22YF1420300), Shanghai Science and Technology Commission (21511101200) and SenseTime Collaborative Research Grant.

Author information

Authors and Affiliations

Shanghai Jiao Tong University, Shanghai, China
Jingyu Gong, Fengqi Liu, Jiachen Xu, Ran Yi & Lizhuang Ma
SenseTime Research, Shanghai, China
Min Wang
East China Normal University, Shanghai, China
Xin Tan, Zhizhong Zhang, Haichuan Song, Yuan Xie & Lizhuang Ma
Qing Yuan Research Institute, SJTU, Shanghai, China
Lizhuang Ma

Corresponding authors

Correspondence to Yuan Xie or Lizhuang Ma.

Editor information

Editors and Affiliations

Tel Aviv University, Tel Aviv, Israel
Shai Avidan
University College London, London, UK
Gabriel Brostow
Google AI, Accra, Ghana
Moustapha Cissé
University of Catania, Catania, Italy
Giovanni Maria Farinella
Facebook (United States), Menlo Park, CA, USA
Tal Hassner

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (zip 11410 KB)

About this paper

Cite this paper

Gong, J. et al. (2022). Optimization over Disentangled Encoding: Unsupervised Cross-Domain Point Cloud Completion via Occlusion Factor Manipulation. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13662. Springer, Cham. https://doi.org/10.1007/978-3-031-20086-1_30

DOI: https://doi.org/10.1007/978-3-031-20086-1_30
Published: 11 November 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20085-4
Online ISBN: 978-3-031-20086-1
eBook Packages: Computer Science (R0)
