research-article

Pixelwise Adaptive Discretization with Uncertainty Sampling for Depth Completion

Authors:
Rui Peng (Bytedance Inc., Shenzhen, China)
Tao Zhang (Bytedance Inc., Shenzhen, China)
Bing Li (King Abdullah University of Science and Technology, Jeddah, Saudi Arabia)
Yitong Wang (Bytedance Inc., Shenzhen, China)

MM '22: Proceedings of the 30th ACM International Conference on Multimedia, October 2022, Pages 3926–3935, https://doi.org/10.1145/3503161.3548019

Published: 10 October 2022

ABSTRACT

Image-guided depth completion is an extensively studied multi-modal task that takes sparse depth measurements and RGB images as input and recovers dense depth maps. While the common practice is to regress the depth value over an unbounded range, some recent methods achieve breakthrough performance by discretizing the regression range into a number of discrete depth values, called depth hypotheses, and recasting scalar regression as distribution estimation. However, existing methods employ handcrafted or image-level adaptive discretization strategies whose depth hypotheses are shared by all pixels; such hypotheses cannot adapt to every pixel and are inefficient. In this paper, we are the first to consider the differences between pixels, and we propose Pixelwise Adaptive Discretization to generate tailored depth hypotheses for each pixel. In addition, we introduce Uncertainty Sampling to generate compact depth hypotheses for easy pixels and looser ones for hard pixels. This per-pixel divide-and-conquer strategy concentrates the discrete depth hypotheses around each pixel's ground truth as much as possible, which is the core of discretization methods. Extensive experiments on the outdoor KITTI and indoor NYU Depth V2 datasets show that our model, PADNet, surpasses previous state-of-the-art methods even with limited parameters and computational cost.
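The abstract describes the idea only at a high level, so the following NumPy sketch is an illustration of per-pixel depth hypotheses in general, not PADNet's actual architecture. It assumes that a coarse per-pixel depth estimate (`center`) and a per-pixel uncertainty (`sigma`) are already available, and that hypotheses are placed symmetrically around the estimate with a spread proportional to the uncertainty. The function names, the `scale` factor, and the 0 to 80 depth range are all hypothetical choices made for this example.

```python
import numpy as np

def pixelwise_hypotheses(center, sigma, num_hyp=8, scale=1.0, d_min=0.0, d_max=80.0):
    """Return a (num_hyp, H, W) stack of depth hypotheses, one set per pixel.

    center: (H, W) coarse per-pixel depth estimate (assumed to be given).
    sigma:  (H, W) per-pixel uncertainty; larger values widen the sampled range.
    All parameters are illustrative assumptions, not values from the paper.
    """
    # Symmetric offsets in [-1, 1]; broadcasting scales them per pixel below.
    offsets = np.linspace(-1.0, 1.0, num_hyp).reshape(num_hyp, 1, 1)
    hyp = center[None] + offsets * scale * sigma[None]
    return np.clip(hyp, d_min, d_max)

def expected_depth(logits, hyp):
    """Softmax over the hypothesis axis, then the probability-weighted depth."""
    logits = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    prob = np.exp(logits)
    prob /= prob.sum(axis=0, keepdims=True)
    return (prob * hyp).sum(axis=0)

# Toy 1x2 "image": an easy pixel (low uncertainty) and a hard pixel (high uncertainty).
center = np.array([[10.0, 40.0]])
sigma = np.array([[0.5, 5.0]])
hyp = pixelwise_hypotheses(center, sigma, num_hyp=5)

# Uniform logits stand in for a network's per-hypothesis scores.
depth = expected_depth(np.zeros_like(hyp), hyp)
```

In this toy setup the easy pixel receives hypotheses packed into a 1-unit interval around its estimate, while the hard pixel's hypotheses span 10 units, which mirrors the "compact for easy pixels, looser for hard pixels" behaviour described in the abstract. A pixel-shared scheme would instead reuse one set of depth values for both pixels.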

Supplemental Material

MM22-fp1148.mp4 (mp4, 128 MB)

Index Terms

Computing methodologies → Artificial intelligence → Computer vision → Computer vision tasks → Vision for robotics

Published in
MM '22: Proceedings of the 30th ACM International Conference on Multimedia
October 2022
7537 pages
ISBN: 9781450392037
DOI: 10.1145/3503161
General Chairs: João Magalhães (NOVA University of Lisbon, Portugal), Alberto del Bimbo (University of Florence, Italy), Shin'ichi Satoh (National Institute of Informatics, Japan), Nicu Sebe (University of Trento, Italy)
Program Chairs: Xavier Alameda-Pineda (Inria, Grenoble, France), Qin Jin (Renmin University of China, China), Vincent Oria (New Jersey Institute of Technology, USA), Laura Toni (University College London, UK)
Copyright © 2022 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
depth completion
pixelwise adaptive
uncertainty sampling
Acceptance Rates
Overall Acceptance Rate: 995 of 4,171 submissions, 24%