Image and Vision Computing

Volume 28, Issue 9, September 2010, Pages 1377-1385

Depth reconstruction uncertainty analysis and improvement – The dithering approach

https://doi.org/10.1016/j.imavis.2010.03.003

Abstract

Depth spatial quantization uncertainty, caused by a discrete sensor, is one of the factors that influence depth reconstruction accuracy. This paper discusses the quantization uncertainty distribution, introduces a mathematical model of the uncertainty interval range, and analyzes the movements of the sensors in an Intelligent Vision Agent System. Such a system makes use of multiple sensors and controls their deployment and autonomous servoing. This paper proposes a dithering algorithm which reduces the depth reconstruction uncertainty. The algorithm achieves high accuracy from only a few images taken by low-resolution sensors. The dither signal is estimated and then generated through an analysis of the iso-disparity planes; this signal controls the camera movement. The proposed approach is validated and compared with a direct triangulation method. The simulation results are reported in terms of depth reconstruction error statistics. The physical experiment shows that the dithering method reduces the depth reconstruction error.

Introduction

The human ability to process visual information may be extended with the help of advanced technologies. The Intelligent Vision Agent System, IVAS, is one such high-performance autonomous distributed vision and information processing system [1]. The system collects data in order to reconstruct 3D information for security, health care, medical and surveillance applications, among others. It focuses on the important and informative parts of a visual scene by dynamically controlling the pan–tilt–zoom of a stereo pair. For such a system, the critical problem is to find the optimal configuration of the sensors and to achieve the required reconstruction accuracy. By adjusting the stereo pair’s profile, such as baseline, convergence angle, focal lengths and pixel size, the depth reconstruction accuracy can be improved [2], [3]. Depth spatial quantization is one of the most influential factors in determining the accuracy of 3D reconstruction, and this uncertainty cannot be reduced simply by taking more accurate measurements. For instance, increasing the image resolution by reducing the sensor pixel size is of limited use, since the signal-to-noise ratio, SNR, is then reduced, and also because of the restricted sensitivity of the sensor itself. The selection of an optimal sensor pixel size is discussed by Chen et al. [4].

How to reconstruct a super-resolution image from low-resolution images has been the focus of much research in recent years. To overcome the digital camera sensor pixel size limitation, attempts have been made to combine the information from a set of slightly different low-resolution images of the same scene and use them to construct a higher-resolution image. The methods used can be categorized into frequency domain analysis, statistical analysis, and geometrical interpolation approaches [5], [6]. Francisco and Bergholm introduced a method whereby the stereo vergence can be changed by shifting the camera sensor chip [7]. Ben-Ezra et al. proposed a jitter camera to minimize motion blur in video super-resolution reconstruction [8]. The quantization uncertainty of depth reconstruction for parallel and convergent stereo cameras is discussed by Chen et al. [9]. However, there has been little work on reducing the depth reconstruction uncertainty by combining low-resolution images. In this paper, the proposed algorithm combines dithering and iso-disparity analysis to find the optimal number of low-resolution images needed to reduce the depth reconstruction uncertainty.

Dithering is a well-known technique that is applied in analogue-to-digital converters, ADCs. To control the output error, a proper dither signal is added to the sequence before the quantizer. Mathematical models of dithered quantization were introduced by Wannamaker et al. [10], [11], Wagdy [12], and Carbone and Petri [13], who demonstrated that the effective resolution can be extended below the least significant bit. Although the ADC analysis is performed in the time domain, our case requires that the depth reconstruction is analyzed, and the dither signal added, in the space domain.
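The ADC idea referred to above can be illustrated with a few lines of code: adding a small random dither before a uniform quantizer and averaging many quantized samples recovers information below one least significant bit. The following is a minimal one-dimensional sketch of nonsubtractive dithered quantization; all names and numerical values are illustrative, and it is not the paper's space-domain formulation.

```python
import numpy as np

# Minimal 1-D illustration of nonsubtractive dithered quantization (the ADC
# analogue referred to above); the paper works in the space domain instead.
rng = np.random.default_rng(0)

lsb = 1.0          # quantizer step (one least significant bit)
x_true = 0.3       # constant input lying between two quantization levels
n_samples = 1000   # number of dithered measurements to average

def quantize(x, step=lsb):
    """Mid-tread uniform quantizer: round to the nearest multiple of `step`."""
    return step * np.round(x / step)

# Without dither every measurement quantizes to the same level, so averaging
# cannot remove the quantization error.
err_plain = abs(quantize(x_true) - x_true)

# With a uniform dither of +/- LSB/2 added before the quantizer, the mean of
# the quantized outputs converges towards the true value (sub-LSB resolution).
dither = rng.uniform(-lsb / 2, lsb / 2, n_samples)
err_dithered = abs(np.mean(quantize(x_true + dither)) - x_true)

print(f"error without dither: {err_plain:.3f}")
print(f"error with dither (averaged): {err_dithered:.3f}")
```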

Many astronomical cameras also use the dithering method, which in this case involves multiple exposures of the same field with small shifts between the exposures. This makes it possible to recover some of the information lost because of the digital camera’s discretization [14]. In relation to this, Yang [15] has addressed the problem that the quantization error becomes serious when the pixel size is significant compared to the allowable measurement tolerance; in his paper, error analysis and sensor position optimization for edge line inspection are explored. Liu and Ehrich [16] also use dithering to locate a subpixel edge in a binary image. Klarquist and Bovik [17] presented a vergent active stereo vision system that recovers high depth resolution by accumulating and integrating a multiresolution map of the surface depth over multiple successive fixations.

In general, the pixels of a camera sensor are uniformly distributed in a two-dimensional array. The projection of each 3D point in the scene is approximated to the center of the nearest pixel; the resulting error is referred to as a quantization error. In stereo, the quantization error generates an uncertainty in the depth estimation at each 3D point. Basu and Shabi [18] have introduced a model using stereo cameras with a non-uniform resolution sensor, based on an optimal estimation of the 3D points’ locations. Furthermore, Kil et al. [19] have used a laser scanner to reconstruct a high resolution 3D image of the target surface using hundreds of lower resolution scans as inputs. The lower resolution scans are randomly shifted, so that each of them contributes information to the final model. The limitation of this approach is that a huge number of scans is required as input, and the improvement in accuracy cannot be controlled. Unlike Kil et al.’s approach, the algorithm proposed in this paper can control the depth reconstruction uncertainty. Farid [20] introduced a method whereby depth can be determined by placing a variable opacity optical attenuation mask directly in front of a camera lens. A small continuum of viewpoint changes can be achieved by the choice of non-uniform masks, as opposed to the discrete views obtained by shifting the cameras. The fast computation and the avoidance of correspondence point matching that characterize the proposed shifting-camera approach make it amenable to a real-time implementation in future work.

Section snippets

Problem statement

At least two images are needed to obtain a depth map of the world, the accuracy of which is limited by the camera sensor resolution. When a digital camera takes an image, the scene perspective is projected onto a sensor plane. The sensor elements are arranged into two-dimensional arrays to represent the scene. The coordinates of the image are discrete and the resolution is in pixels. This leads to a depth reconstruction quantization uncertainty in the representation of the spatial position of a
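A small numerical sketch of the quantization effect described in this section is given below: a single 3D point is projected into a parallel pinhole stereo pair, each projection is rounded to the nearest pixel center, and the depth is triangulated back. The focal length, baseline, pixel size and target position are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

# How rounding stereo projections to pixel centres produces a depth
# quantization error.  A parallel pinhole stereo pair is assumed; all
# parameter values are illustrative, not those used in the paper.
f = 0.008          # focal length [m]
B = 0.10           # baseline [m]
delta = 10e-6      # pixel size [m]
X, Z = 0.017, 2.3  # target point (lateral offset, depth) [m]

# Ideal projections onto the left and right image planes (left camera at the
# origin, right camera shifted by the baseline along X).
x_left = f * X / Z
x_right = f * (X - B) / Z

# Quantize each projection to the centre of the nearest pixel.
xq_left = delta * np.round(x_left / delta)
xq_right = delta * np.round(x_right / delta)

# Triangulate depth from the exact and from the quantized disparities.
z_exact = f * B / (x_left - x_right)
z_quant = f * B / (xq_left - xq_right)

print(f"true depth: {z_exact:.4f} m, reconstructed: {z_quant:.4f} m, "
      f"error: {abs(z_quant - z_exact) * 1000:.1f} mm")
```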

Problem analysis

Depth reconstruction may be calculated from stereo pair cameras with an accuracy determined by the system configuration, which is defined by the sensor pixel resolution Δ (a square pixel of the size Δ × Δ), the focal length f, and the baseline length, B, for a general parallel stereo pair. To get a more accurate depth reconstruction, the stereo configuration can be adjusted within its limits. Since the quantization uncertainty is caused by the properties of a digital sensor, a dithering method is
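For orientation, the standard pinhole relations for a parallel stereo pair give a first-order estimate of this quantization uncertainty in terms of the parameters named above (pixel size Δ, focal length f, baseline B); the disparity symbol d and the derivation below are generic, not quoted from the paper:

\[
Z = \frac{fB}{d}, \qquad
|\delta Z| \approx \left|\frac{\partial Z}{\partial d}\right| \Delta
           = \frac{fB}{d^{2}}\,\Delta
           = \frac{Z^{2}}{fB}\,\Delta .
\]

The uncertainty therefore grows quadratically with depth, which is consistent with the iso-disparity planes becoming more widely spaced as the distance from the cameras increases.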

Implementation

From Fig. 5, it can be seen that the dither signals d_li and d_ri control the positions of the left and right cameras. The target point projections x_li and x_ri correspond to the i-th dither position of the left and right camera respectively, and the quantized signals are x_Qli and x_Qri for the left and right image respectively. Furthermore, we can now calculate the target depth information by averaging the depths of all possible disparities d_i of the stereo pairs. The arithmetic average of all the
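A minimal sketch of this averaging step for a parallel stereo pair is given below. Each dither signal translates a camera along the baseline by a small known amount before an exposure, and the final depth estimate is the arithmetic mean of the depths triangulated from the quantized projections. Random uniform dither is used here purely for illustration; the paper derives the dither signal from an iso-disparity analysis, and all symbols and values below are assumptions, not the paper's exact formulation.

```python
import numpy as np

# Sketch of the dithered depth averaging for a parallel stereo pair.
f, B, delta = 0.008, 0.10, 10e-6   # focal length, baseline, pixel size [m]
X, Z = 0.017, 2.3                  # target point (lateral offset, depth) [m]
Z_nom = 2.0                        # assumed nominal working depth [m]
n = 8                              # number of dithered exposures

rng = np.random.default_rng(1)
amp = delta * Z_nom / (2 * f)      # camera shift that moves the projection by
                                   # about half a pixel at the nominal depth
d_l = rng.uniform(-amp, amp, n)    # left-camera dither positions
d_r = rng.uniform(-amp, amp, n)    # right-camera dither positions

depths = []
for i in range(n):
    # Projections onto the dithered left and right cameras (each camera is
    # translated along the baseline by its known dither offset).
    x_l = f * (X - d_l[i]) / Z
    x_r = f * (X - B - d_r[i]) / Z
    # Pixel quantization of each projection.
    xq_l = delta * np.round(x_l / delta)
    xq_r = delta * np.round(x_r / delta)
    # Triangulate using the known, dithered baseline of this exposure.
    depths.append(f * (B + d_r[i] - d_l[i]) / (xq_l - xq_r))

print(f"single-exposure depth: {depths[0]:.4f} m")
print(f"dithered average depth: {np.mean(depths):.4f} m (true depth {Z} m)")
```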

Results

In this section, we describe two case studies in which we test the mathematical models outlined above. The first study is a simulation experiment and the second is a physical experiment. Both case studies illustrate how the depth reconstruction uncertainty in stereo coverage is reduced by the dithering algorithm. The results of the studies are compared with a conventional direct triangulation method [23]. The simulation experiment was performed in MATLAB 7.0. The Epipolar Geometry Toolbox [24]

Conclusion

One way of resolving the limitations of sensor sensitivity and image resolution due to SNR is to acquire a super-resolution image from a low-resolution sensor. Through this process, it is also possible to reduce the spatial quantization uncertainty caused by a discrete sensor. Super-resolution can be obtained by combining the information from a set of slightly different low-resolution images of the same scene. The dithering method can be applied to the model and control the movement of the cameras used to

Acknowledgements

The authors wish to acknowledge Prof. Stefan Andersson-Engels, Pontus Svenmarker and Haiyan Xie at the Division of Atomic Physics, Lund University, Sweden for lending their expertise regarding the necessary laboratory equipment. The authors would also like to thank Dr. Benny Lövström at Blekinge Institute of Technology, Sweden and Dr. Fredrik Bergholm for valuable discussion and support. Finally, we would like to thank Dr. Johan Höglund for his comments.

References (24)

  • P. Carbone et al., Mean value and variance of noisy quantized data, Measurement (1998)
  • W. Kulesza et al., Arrangement of a multi stereo visual sensor system for a human activities space
  • J. Chen, S. Khatibi, W. Kulesza, Planning of a multi stereo visual sensor system for a human activities space, in:...
  • J. Chen, S. Khatibi, W. Kulesza, Planning of a multi stereo visual sensor system depth accuracy and variable baseline...
  • T. Chen, P. Catrysse, A. Gamal, B. Wandell, How small should pixel size be? in: Proc. of SPIE on Sensors and Camera...
  • S.C. Park et al., Super-resolution image reconstruction: a technical overview, IEEE Signal Processing Magazine (2003)
  • P. Vandewalle et al., A frequency domain approach to registration of aliased images with application to super-resolution, EURASIP Journal on Applied Signal Processing (2006)
  • A. Francisco et al., On the importance of being asymmetric in stereopsis—or why we should use skewed parallel cameras, International Journal of Computer Vision (1998)
  • M. Ben-Ezra et al., Video super-resolution using controlled subpixel detector shifts, IEEE Transactions on Pattern Analysis and Machine Intelligence (2005)
  • J. Chen, S. Khatibi, J. Wirandi, W. Kulesza, Planning of a multi stereo visual sensor system for a human activities...
  • R.A. Wannamaker, The Theory of Dithered Quantization, PhD Thesis, The University of Waterloo, Canada,...
  • R.A. Wannamaker et al., A theory of nonsubtractive dither, IEEE Transactions on Signal Processing (2000)