Abstract
With the continuous development of sensor technology, a number of inexpensive yet effective depth cameras have emerged, greatly advancing autonomous driving and 3D reconstruction. However, the depth images captured by low-cost depth cameras have low resolution, which makes it difficult to meet the needs of practical applications. We propose a two-branch network for depth map super-resolution guided by a high-resolution image, which serves as a prior that helps the low-resolution depth map recover the missing high-frequency structural details. To emphasize the guiding role of the high-resolution image, we replace the standard convolution kernels with spatially-variant kernels derived from the guidance feature map. In addition, to extract feature maps from the depth images more effectively, we add a channel attention mechanism between convolution layers. Our network is trained end to end and supports inputs of various sizes because the backbone is fully convolutional, with no fully connected layers. The proposed model is trained on a single dataset under three super-resolution factors and applied directly to other datasets without fine-tuning. We demonstrate the effectiveness of our model by comparing it with state-of-the-art methods.
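The two mechanisms highlighted in the abstract, spatially-variant kernels predicted from the guidance branch and channel attention between convolution layers, can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the function names, the 3x3 kernel size, and the squeeze-and-excitation-style gating weights `w1`/`w2` are illustrative choices.

```python
import numpy as np

def channel_attention(feat, w1, w2):
    """Squeeze-and-excitation-style channel attention (illustrative sketch).
    feat: (C, H, W) feature map; w1: (C//r, C), w2: (C, C//r) gating weights."""
    # Squeeze: global average pooling over the spatial dimensions -> (C,)
    s = feat.mean(axis=(1, 2))
    # Excitation: small bottleneck with ReLU, then a sigmoid gate per channel
    z = np.maximum(w1 @ s, 0.0)
    g = 1.0 / (1.0 + np.exp(-(w2 @ z)))          # (C,) gates in (0, 1)
    # Rescale each channel by its learned gate
    return feat * g[:, None, None]

def spatially_variant_conv(depth_feat, guide_kernels):
    """Convolve with a different 3x3 kernel at every pixel, where the kernels
    would be predicted from the guidance feature map.
    depth_feat: (H, W); guide_kernels: (H, W, 3, 3), one kernel per pixel."""
    H, W = depth_feat.shape
    pad = np.pad(depth_feat, 1, mode="edge")     # replicate-pad the border
    out = np.empty_like(depth_feat)
    for i in range(H):
        for j in range(W):
            patch = pad[i:i + 3, j:j + 3]        # 3x3 window centred at (i, j)
            out[i, j] = (patch * guide_kernels[i, j]).sum()
    return out
```

Unlike an ordinary convolution, whose kernel is shared across all spatial positions, each output pixel here uses its own kernel, which is what lets the high-resolution guidance modulate the depth branch per location.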
This work was supported by the National Key Research and Development Program of China under Grant 2018AAA0103001, the National Natural Science Foundation of China (Grants U1813208, 62173319, 62063006), the Guangdong Basic and Applied Basic Research Foundation (2020B1515120054), and the Shenzhen Fundamental Research Program (JCYJ20200109115610172).
© 2022 Springer Nature Singapore Pte Ltd.
Cite this paper
Guo, J., Xiong, R., Ou, Y., Wang, L., Liu, C. (2022). Depth Image Super-resolution via Two-Branch Network. In: Sun, F., Hu, D., Wermter, S., Yang, L., Liu, H., Fang, B. (eds) Cognitive Systems and Information Processing. ICCSIP 2021. Communications in Computer and Information Science, vol 1515. Springer, Singapore. https://doi.org/10.1007/978-981-16-9247-5_15