Depth map upsampling using compressive sensing based model
Introduction
In recent years, a wide range of devices have been developed to measure the 3D information in the real world, such as laser scanners, structured-light systems, time-of-flight cameras and passive stereo systems. The depth maps (range images) captured with most active sensors usually suffer from relatively low resolution, limited precision and significant sensor noise. Therefore, effective depth map post-processing techniques are essential for practical applications such as scene reconstruction and 3D video production, especially for 3D face recognition [1] and 3D object recognition [2], [3].
In this paper, we present a method to enhance the spatial resolution of a depth map with a registered high-resolution color image. Our method is based on two key assumptions: first, neighboring pixels with similar colors are likely to have similar depth values; second, just like most natural images, an ideal depth map without noise corruption has large smooth regions and relatively few discontinuities, and therefore can be approximated with a sparse representation in some transform domain such as multiscale wavelets. Although the first assumption has been extensively explored in recent depth post-processing work [4], [5], [6], [7], [8], [9], relatively little attention has been given to the second assumption [10], [11].
Inspired by the theory of compressive sensing (CS) [12], [13], we recover the upsampled depth map through a sparse signal reconstruction process. We first compute a set of measurement data from the low-resolution depth map. The measurement data near depth discontinuities are generated with a cellular automaton algorithm, and no filtering techniques are involved in the process. We then reconstruct the depth signal with an optimization model that imposes constraints on measurements, smoothness and representation sparseness. An efficient numerical method is provided to solve the model with complexity linear in the number of image pixels. Experimental results show that, by solving the problem in a CS-based framework, our algorithm can produce high-quality depth results from relatively low-resolution depth maps, and that it performs stably under noisy conditions.
The rest of the paper is organized as follows. Related work is reviewed in Section 2. Section 3 provides a brief introduction to the CS theory, whereas our CS-based upsampling model is presented in Section 4. After that, in Section 5, we describe how to generate the sampling data for the model, and we provide a numerical solution in Section 6. Section 7 reports the experimental results and discusses how to register a low-resolution depth map with its companion high-resolution color image, as well as the influence of the sampling pattern. Finally, conclusions are given in Section 8.
Related work
As stated in Section 1, the idea of enhancing a depth map with a coupled color image is not new. Existing methods can be roughly classified as either filtering-based methods [5], [6], [7], [8] or optimization-based methods [4], [9].
Filtering-based methods employ color information with various edge-preserving filters [14], [15]. Kopf et al. [5] use a joint bilateral filter to refine the upsampled depth results. Yang et al. [6] instead initialize a cost volume and iteratively smooth each cost
CS theory and underdetermined linear system
CS theory finds an optimal solution from the observed data by reducing the problem to solving an underdetermined linear system. In mathematical terms, the observed data $y \in \mathbb{R}^m$ is connected to the signal of interest $x \in \mathbb{R}^n$ via $y = Ax$, where $m < n$, $x$ is the s-sparse vector, which has only $s$ nonzero components, and the measurement matrix $A \in \mathbb{R}^{m \times n}$ models the linear measurement process. Traditional wisdom of linear algebra suggests that the number m of measurements must be at least as large as the signal
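To make the idea of recovering a sparse vector from an underdetermined system concrete, the following is a minimal sketch using iterative soft-thresholding (ISTA) on the l1-regularized least-squares problem. It is not the solver used in this paper. The notation y = A x, the random Gaussian matrix, the problem sizes, the weight `lam` and all function names are assumptions chosen for illustration.

```python
import random

def matvec(A, x):
    """Compute A @ x for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def rmatvec(A, r):
    """Compute A.T @ r."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]

def soft_threshold(v, t):
    """Shrink each entry toward zero by t (proximal operator of t * l1 norm)."""
    return [max(abs(x) - t, 0.0) * (1.0 if x > 0 else -1.0) for x in v]

def objective(A, y, x, lam):
    """0.5 * ||A x - y||^2 + lam * ||x||_1."""
    residual = [p - q for p, q in zip(matvec(A, x), y)]
    return 0.5 * sum(r * r for r in residual) + lam * sum(abs(v) for v in x)

def ista(A, y, lam, iters=500):
    """Iterative soft-thresholding starting from x = 0."""
    # The squared Frobenius norm is a safe upper bound on the
    # Lipschitz constant of the gradient of the quadratic term.
    lipschitz = sum(a * a for row in A for a in row)
    x = [0.0] * len(A[0])
    for _ in range(iters):
        residual = [p - q for p, q in zip(matvec(A, x), y)]
        grad = rmatvec(A, residual)
        step = [xi - g / lipschitz for xi, g in zip(x, grad)]
        x = soft_threshold(step, lam / lipschitz)
    return x

# Hypothetical toy problem: m = 10 measurements of an n = 20 signal
# that has only s = 3 nonzero components.
random.seed(0)
m, n = 10, 20
A = [[random.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
x_true = [0.0] * n
x_true[3], x_true[11], x_true[17] = 1.5, -2.0, 1.0
y = matvec(A, x_true)
x_hat = ista(A, y, lam=0.05)
```

With the conservative step size used here, each iteration cannot increase the objective, but convergence is slow. How closely `x_hat` matches `x_true` depends on the matrix, the sparsity level and the number of iterations.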
CS-based upsampling model
We build our upsampling model upon the fundamental fact that many signals can be represented or approximated with only a few coefficients in a suitable basis. Consider a high-resolution depth map $d \in \mathbb{R}^n$ in column vector form. It can be linearly represented with an orthonormal basis $\Psi \in \mathbb{R}^{n \times n}$ and a set of coefficients $x \in \mathbb{R}^n$: $d = \Psi x$. The map d is linearly measured m times ($m < n$), which leads to a set of measurements $y \in \mathbb{R}^m$ with a measurement matrix $\Phi \in \mathbb{R}^{m \times n}$: $y = \Phi d$. The CS theory tries to recover depth map
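The claim that a piecewise-smooth signal needs only a few coefficients in a suitable orthonormal basis can be checked on a toy example. The sketch below uses the orthonormal Haar wavelet, chosen here only because it is the simplest multiscale wavelet; this section does not state which basis the paper uses. The 1-D "depth scanline" and the function names are hypothetical.

```python
import math

def haar_forward(signal):
    """Full orthonormal Haar decomposition of a list whose length is a power of two."""
    coeffs = list(signal)
    n = len(coeffs)
    while n > 1:
        half = n // 2
        approx = [(coeffs[2 * i] + coeffs[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        detail = [(coeffs[2 * i] - coeffs[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        # Store coarse approximations first, then the details of this level.
        coeffs[:n] = approx + detail
        n = half
    return coeffs

def haar_inverse(coeffs):
    """Invert haar_forward."""
    signal = list(coeffs)
    n = 1
    while n < len(signal):
        merged = []
        for a, d in zip(signal[:n], signal[n:2 * n]):
            merged.append((a + d) / math.sqrt(2))
            merged.append((a - d) / math.sqrt(2))
        signal[:2 * n] = merged
        n *= 2
    return signal

# A scanline with one depth discontinuity: two flat surfaces.
scanline = [2.0, 2.0, 2.0, 2.0, 5.0, 5.0, 5.0, 5.0]
coeffs = haar_forward(scanline)
nonzero = sum(1 for c in coeffs if abs(c) > 1e-12)
restored = haar_inverse(coeffs)
```

Here the eight samples are described by two nonzero coefficients (the overall average and the single step), and `haar_inverse` restores the scanline up to rounding error.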
Sampling data generation
This section describes how to construct the measurement matrix used for depth upsampling, i.e. how to generate the sampling data from a low-resolution depth map $D_l$ and a registered high-resolution color image $I_h$. The measurement matrix is a specific matrix that should satisfy the minimum measurement requirements. In the following paragraphs, the sampling position information is denoted as a mask image $M_h$. If the pixel $(i,j)$ is selected as a sampling point, $M_h(i,j) = 1$; otherwise $M_h(i,j) = 0$.
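As an illustration of the mask image, the sketch below places every low-resolution depth sample on a regular grid of the high-resolution image and marks those positions in a binary mask. It assumes an integer upsampling factor, perfect registration, and a mask that is 1 at sampling points and 0 elsewhere. It does not reproduce the cellular automaton step that the paper applies near depth discontinuities, and the function name `build_sampling_mask` is hypothetical.

```python
def build_sampling_mask(low_depth, factor):
    """Scatter low-resolution depth samples onto a high-resolution grid.

    Returns (mask, samples): mask[i][j] is 1 where a measurement exists,
    and samples[i][j] holds the measured depth at those positions.
    """
    rows, cols = len(low_depth), len(low_depth[0])
    height, width = rows * factor, cols * factor
    mask = [[0] * width for _ in range(height)]
    samples = [[0.0] * width for _ in range(height)]
    for i in range(rows):
        for j in range(cols):
            # Assumed convention: sample (i, j) maps to the top-left
            # pixel of its factor-by-factor block.
            mask[i * factor][j * factor] = 1
            samples[i * factor][j * factor] = low_depth[i][j]
    return mask, samples

low_depth = [[1.0, 2.0],
             [3.0, 4.0]]
mask, samples = build_sampling_mask(low_depth, factor=2)
```

In this toy case 4 of the 16 high-resolution pixels carry a measurement, so a reconstruction step has to fill in the remaining 12.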
Numerical solution
In this section, we provide a first-order numerical solution for the optimization problem defined in Eq. (10). A major difficulty in minimizing Eq. (10) is that both the TV term and the sparseness term are non-differentiable $\ell_1$ regularizers. We decompose the original problem into three subproblems with variable-splitting and quadratic penalty techniques. An efficient solution is available for each subproblem, so the original problem can be solved in an alternating minimization framework [30].
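The variable-splitting and quadratic-penalty idea can be sketched on a much simpler problem than Eq. (10), which is not reproduced in this section. The example below denoises a 1-D signal with a single TV term, using an auxiliary variable w for the forward differences, so it has two subproblems rather than the three used in the paper. The weights `lam` and `beta`, the iteration count and all function names are assumptions.

```python
def shrink(v, t):
    """Entrywise minimizer of t * |w| + 0.5 * (w - v)^2 (soft-thresholding)."""
    return [max(abs(x) - t, 0.0) * (1.0 if x > 0 else -1.0) for x in v]

def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[0] and upper[-1] are ignored."""
    n = len(diag)
    c = [0.0] * n
    r = [0.0] * n
    c[0] = upper[0] / diag[0]
    r[0] = rhs[0] / diag[0]
    for i in range(1, n):
        pivot = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / pivot
        r[i] = (rhs[i] - lower[i] * r[i - 1]) / pivot
    x = [0.0] * n
    x[-1] = r[-1]
    for i in range(n - 2, -1, -1):
        x[i] = r[i] - c[i] * x[i + 1]
    return x

def tv_denoise_1d(y, lam, beta=10.0, iters=50):
    """Alternately minimize 0.5*||d - y||^2 + lam*||w||_1 + (beta/2)*||w - D d||^2,
    where (D d)[i] = d[i + 1] - d[i]."""
    n = len(y)
    d = list(y)
    # I + beta * D^T D is tridiagonal and does not change between iterations.
    diag = [1.0 + beta * (1.0 if i in (0, n - 1) else 2.0) for i in range(n)]
    off = [-beta] * n
    for _ in range(iters):
        # w-subproblem: closed-form shrinkage of the forward differences.
        w = shrink([d[i + 1] - d[i] for i in range(n - 1)], lam / beta)
        # d-subproblem: solve (I + beta * D^T D) d = y + beta * D^T w.
        dtw = [(w[i - 1] if i > 0 else 0.0) - (w[i] if i < n - 1 else 0.0)
               for i in range(n)]
        rhs = [y[i] + beta * dtw[i] for i in range(n)]
        d = solve_tridiagonal(off, diag, off, rhs)
    return d

noisy = [1.0, 1.2, 0.9, 1.1, 4.0, 4.1, 3.9, 4.2]
smoothed = tv_denoise_1d(noisy, lam=0.3)
flat = tv_denoise_1d([3.0] * 6, lam=0.3)
```

Each subproblem has a cheap exact solution: an entrywise shrinkage and a tridiagonal solve. Both cost linear time in the signal length, which is why this kind of splitting is attractive.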
Experiments
In this section, we first describe a preprocessing step that registers the depth camera with the conventional camera, which is essential for the experiments that follow. Second, we present the parameter configuration and the evaluation index used in the experiments. After that, we compare the upsampling results of different sensing patterns and discuss the functions of different terms in Eq. (10). Last but not least, we conduct extensive experiments, including the synthesized data and the real
Conclusion
We have presented a new method for depth map upsampling. Based on the theory of compressive sensing, our method converts the low-resolution depth map into a set of measurements, and then formulates the upsampling task as a constrained optimization problem with data, smoothness and representation sparseness constraints. We validated our method with the Middlebury data sets, demonstrating that our method clearly outperforms previous methods under large upsampling factors and noisy inputs.
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Nos. 61332017, 61331018, 91338202, and 61271430). The authors would like to thank the anonymous reviewers for their constructive and valuable suggestions for improving the overall quality of this paper.
Longquan Dai received his B.S. degree in Electronic Engineering from Henan University of Technology, China, in 2006. He received his M.S. degree in Electronic Engineering from Shantou University, China, in 2010. Currently, he is working toward the Ph.D. degree in Computer Science at the Institute of Automation, Chinese Academy of Sciences, China. His research interests lie in computer graphics, computer vision and optimization-based techniques for image analysis and synthesis.
References (35)
- et al., An efficient 3D face recognition approach using local geometrical signatures, Pattern Recognit. (2014)
- et al., Fusion of range and color images for denoising and resolution enhancement with a non-local filter, Comput. Vis. Image Underst. (2010)
- et al., Sparse coding with an overcomplete basis set: a strategy employed by V1?, Vis. Res. (1997)
- et al., A scale independent selection process for 3D object recognition in cluttered scenes, Int. J. Comput. Vis. (2013)
- et al., Rotational projection statistics for 3D local surface description and object recognition, Int. J. Comput. Vis. (2013)
- J. Diebel, S. Thrun, An application of Markov random fields to range sensing, in: Proceedings of Conference on Neural...
- J. Kopf, M.F. Cohen, D. Lischinski, M. Uyttendaele, Joint bilateral upsampling, ACM Trans. Graph. 26 (3), ISSN...
- Q. Yang, R. Yang, J. Davis, D. Nistér, Spatial-depth super resolution for range images, in: IEEE Conference on...
- D. Chan, H. Buisman, C. Theobalt, S. Thrun, A noise aware filter for real-time depth upsampling, in: Workshop on...
- J. Park, H. Kim, Y. Tai, M.S. Brown, I. Kweon, High quality depth map upsampling for 3D-TOF cameras, in: Proceedings of...
- Learning sparse representations of depth, IEEE J. Sel. Top. Signal Process.
- Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information, IEEE Trans. Inf. Theory
- Compressive sensing, IEEE Trans. Inf. Theory
- Gradient projection for sparse reconstruction: application to compressed sensing and other inverse problems, IEEE J. Sel. Top. Signal Process.
Haoxing Wang is a Ph.D. candidate in the Sino-French Laboratory (LIAMA) and National Laboratory of Pattern Recognition (NLPR) at Institute of Automation, Chinese Academy of Sciences. He received his B.S. and M.S. degrees in geographic information system from Wuhan University of Technology, China, in 2006, and Beijing Normal University, China, in 2010, respectively. His research interests include photogrammetry, structure from motion techniques, and optimization-based techniques for image analysis and synthesis.
Xiaopeng Zhang received his M.Sc. degree in Mathematics from Northwest University in 1987, and the Ph.D. degree in Computer Science from Institute of Software, Chinese Academy of Sciences (CAS), in 1999. He is a Professor in the Sino-French Laboratory (LIAMA) and the National Laboratory of Pattern Recognition (NLPR) at Institute of Automation, CAS. His main research interests are computer graphics and pattern recognition. He received the National Scientific and Technological Progress Prize (Second Class) in 2004. Xiaopeng Zhang is a member of ACM and IEEE.