
1 Introduction

Due to their increased depth of field and light-gathering ability relative to traditional cameras, plenoptic cameras [3] have attracted more and more attention. Moreover, plenoptic cameras play an important role in computer vision applications such as refocusing, occlusion removal, and closed-form visual odometry [4, 6, 7, 13, 22]. However, little attention has been paid to the calibration of the light field captured by a plenoptic camera, which is the most basic and essential step.

Plenoptic cameras come in several types: microlens-array cameras, mask-based cameras, and freeform collections of cameras [9, 16, 18,19,20,21]. Ng [18] proposes a hand-held light field camera built by placing a micro-lens array in front of the camera sensor. Each micro-lens forms a tiny sharp image of the lens aperture, from which the directional distribution of the incoming rays can be estimated. Based on Ng’s model, Georgiev and Lumsdaine [9] make a modification: the micro-lens array is interpreted as an imaging system lying in the focal plane of the main camera lens. Compared with Ng’s model, this system captures a light field image with higher spatial resolution and lower angular resolution. Based on these micro-lens-array designs, commercial light-field cameras such as Lytro [1] and Raytrix [2] have been released.

There is a substantial body of work on the calibration of grids or freeform collections of multiple cameras [19]. However, these methods introduce more degrees of freedom than are necessary to describe a microlens-based plenoptic camera. Based on ray transfer matrix analysis, Georgiev et al. [10] derive a plenoptic camera model. For this model, Johannsen et al. [15] propose a metric calibration method using a dot pattern with a known grid size, together with a depth distortion correction for focused light-field cameras. In more detail, Dansereau et al. [8] describe a decoding, calibration and rectification procedure for lenselet-based plenoptic cameras. They derive a homogeneous 5D intrinsic matrix relating each recorded pixel to its corresponding ray in 3D space, which can be applied to the commercial Lytro light field camera. Bok et al. [5] present a novel method of geometric calibration of micro-lens-based light-field cameras using line features; instead of using sub-aperture images, they utilize the raw images directly for calibration. Inspired by these works [8, 22], and based on image formation for the pinhole camera and the theory of light propagation, we model the imaging process of the Lytro light field camera and derive a novel intrinsic matrix.

In this paper, we derive an imaging model of the Lytro light field camera and present a calibration method using a checkerboard grid. In theory, the microlens array can be modeled as an array of pinholes and the main lens can be modeled as a thin lens. Under these assumptions, the imaging model and the intrinsic matrix of the Lytro light field camera are derived, which represent the relationship between each pixel and its corresponding light field. Then, based on the derived intrinsic matrix, we propose a calibration algorithm using the checkerboard grid for the Lytro light field camera. Finally, experiments validate the correctness of our imaging model and the effectiveness of the proposed calibration algorithm.

This paper is organized as follows: Sect. 2 reviews image formation for the pinhole camera and the representation of the 4D light field. Section 3 presents an imaging model for the Lytro light field camera and derives an intrinsic matrix. In Sect. 4, the calibration method based on the checkerboard grid is described in detail. Experimental results are shown in Sect. 5. Finally, Sect. 6 presents some concluding remarks.

2 Preliminaries

A bold letter denotes a vector or a matrix. Unless stated otherwise, vectors are expressed in homogeneous coordinates. In the following, we briefly review image formation for the pinhole camera [14] and the representation of the 4D light field [11, 17].

2.1 Pinhole Camera

Let the intrinsic parameter matrix of the pinhole camera be

$$\begin{aligned} \mathbf {K}_{c}= \left( \begin{array}{ccc} r_{c}f_{c} & s & u_{0} \\ 0 & f_{c} & v_{0} \\ 0 & 0 & 1 \end{array}\right) , \end{aligned}$$

where \(r_c\) is the aspect ratio, \(f_c\) is the focal length, \(\left( u_0,v_0,1\right) ^T\) denoted as \(\mathbf {p}\) is the principal point, and s is the skew factor. Under the pinhole camera model, a space point \(\mathbf {M}\) is projected to its image point \(\mathbf {m}\) by [14]

$$\begin{aligned} \mathbf {m} = \lambda \mathbf {K}_c \left[ \mathbf {R}, \mathbf {t}\right] \mathbf {M}, \end{aligned}$$
(1)

where \(\lambda \) is a scalar, \(\left[ \mathbf {R}, \mathbf {t}\right] \) comprises a rotation matrix \(\mathbf {R}\) and a translation vector \(\mathbf {t}\), and \(\mathbf {K}_c \) is the intrinsic matrix.
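To make Eq. (1) concrete, the following minimal NumPy sketch projects a space point to its image point; the intrinsic and extrinsic values below are illustrative assumptions, not calibrated parameters.

```python
import numpy as np

def project_pinhole(K, R, t, M):
    """Project a 3D point M (world coordinates, inhomogeneous) to a
    pixel via Eq. (1): m ~ K [R, t] M, dividing out the scalar."""
    M_h = np.append(M, 1.0)                       # homogeneous 4-vector
    m = K @ np.hstack([R, t.reshape(3, 1)]) @ M_h
    return m[:2] / m[2]

# Illustrative parameters (assumed, not calibrated):
f_c, r_c, s = 800.0, 1.0, 0.0                     # focal length, aspect ratio, skew
u0, v0 = 320.0, 240.0                             # principal point
K_c = np.array([[r_c * f_c, s,   u0],
                [0.0,       f_c, v0],
                [0.0,       0.0, 1.0]])
R, t = np.eye(3), np.zeros(3)                     # camera aligned with the world
print(project_pinhole(K_c, R, t, np.array([0.1, 0.2, 2.0])))
```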

2.2 The Representation of 4D Light Field

The light field is a vector function that describes the amount of light flowing in every direction through every point in space. The direction of each ray is given by the 5D plenoptic function, and the magnitude of each ray is given by its radiance. Instead of the 5D plenoptic function, Gortler et al. [11] sample and reconstruct a 4D function, called a Lumigraph, which is a subset of the complete plenoptic function.

Levoy and Hanrahan [17] propose a new representation of the light field, the radiance as a function of position and direction, in regions of space free of occluders (free space). In free space, the light field is a 4D rather than a 5D function. As shown in Fig. 1, the coordinate system on the first plane is (u, v) and on the second plane is (s, t). An oriented line is defined by connecting a point on the uv-plane to a point on the st-plane. Thus, the light field can be represented by two parallel planes, i.e., by a 4D vector \((u,v,s,t)^{T}\): when the light passes through the first plane, its position \((u,v,1)^{T}\) is recorded; when it passes through the second plane, its direction \((s,t)^{T}\) is recorded.

Fig. 1. The representation of the 4D light field.
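A minimal sketch of this two-plane parameterization is given below; the plane separation and the sample ray are illustrative assumptions, using the common convention that the two planes sit at z = 0 and z = plane_sep.

```python
import numpy as np

def ray_from_two_planes(u, v, s, t, plane_sep=1.0):
    """Encode a ray by its intersections (u, v) and (s, t) with two
    parallel planes at z = 0 and z = plane_sep."""
    origin = np.array([u, v, 0.0])                   # point on the uv-plane
    direction = np.array([s - u, t - v, plane_sep])  # towards the st-plane
    return origin, direction / np.linalg.norm(direction)

def point_on_ray(origin, direction, z):
    """Intersect the ray with the plane at depth z (direction[2] != 0)."""
    return origin + direction * (z - origin[2]) / direction[2]

o, d = ray_from_two_planes(u=0.2, v=0.1, s=0.5, t=0.3)
print(point_on_ray(o, d, z=2.0))   # where the ray pierces the plane z = 2
```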

3 Imaging Model of Lytro Light Field Camera

In theory, although each pixel of a plenoptic camera integrates light over a volume, it can be approximated as integrating along a single ray [12]. In addition, the microlens array is modeled as an array of pinholes. Based on these assumptions and on the theory of light propagation, in this section we present the imaging model of the Lytro light field camera and derive its intrinsic matrix, which represents the relationship between each pixel and its corresponding light field.

Fig. 2. The image formation for the Lytro light field camera.

In the Lytro light field camera, the microlens array is hexagonally packed. As shown in Fig. 2, the microlens array is placed on the imaging plane of the main lens of a conventional camera. The optically sensitive device (CCD or CMOS) is parallel to the microlens array, and the distance between them equals the focal length of the microlenses. Finally, the microlens array maps the light onto the corresponding area of the optically sensitive device.

Assume that the microlens array contains \(W \times W^{'}\) microlenses. Denote the image plane of one microlens \((k,l)\) by \(\mathbf {U}_{mic(k,l)}\). We set up coordinate systems on the microlens \((k,l)\) and on its image plane \(\mathbf {U}_{mic(k,l)}\), as shown in Fig. 3. In the image coordinate system, the image center \(\mathbf {o}_{mic(k,l)}\) serves as the origin, and the line through the origin orthogonal to the image plane serves as \(z_{mic(k,l)}\), pointing towards the optically sensitive device. The microlens coordinate system is parallel to the image coordinate system, with its origin at the optical center \(\mathbf {O}_{mic(k,l)}\).

Fig. 3. The image formation for the microlens \((k,l)\).

Thanks to modern camera manufacturing technology, all microlenses in the array can be assumed to share the same parameters. Moreover, the aspect ratio is 1, the skew factor is 0, and the principal point is close to the image center, so the principal point can be estimated by the image center. Therefore, let the intrinsic matrix of each microlens be

$$\begin{aligned} \mathbf {K}_{mic} = \left( \begin{array}{ccc} f_{mic} & 0 & u^{0}_{mic}\\ 0 & f_{mic} & v^{0}_{mic}\\ 0 & 0 & 1 \end{array}\right) . \end{aligned}$$

According to the pinhole camera imaging model (Eq. (1)), under the microlens coordinate system, the image point \(\mathbf {m}^{i}_{mic(k,l)}\) corresponds to the point

$$\begin{aligned} \mathbf {M}^{i}_{mic(k,l)} = \mathbf {K}_{mic}^{-1}\mathbf {m}^{i}_{mic(k,l)} =\left( \begin{array}{ccc} \frac{1}{f_{mic}} & 0 & -\frac{u^{0}_{mic}}{f_{mic}}\\ 0 & \frac{1}{f_{mic}} & -\frac{v^{0}_{mic}}{f_{mic}}\\ 0 & 0 & 1 \end{array}\right) \left( \begin{array}{c} u_{i}\\ v_{i}\\ 1 \end{array}\right) , \end{aligned}$$

where \((u_{i},v_{i},1)^{T}\) is the coordinate of the image point \(\mathbf {m}^{i}_{mic(k,l)}\) in the image coordinate system \(\{\mathbf {o}_{mic(k,l)}-x_{mic(k,l)},y_{mic(k,l)},z_{mic(k,l)}\}\).
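As a small numerical illustration of this back-projection (the microlens parameter values below are assumed purely for the example):

```python
import numpy as np

def back_project(K_mic, m):
    """Back-project pixel m = (u_i, v_i, 1)^T through the microlens
    intrinsics: M = K_mic^{-1} m, a point on the z = 1 plane of the
    microlens coordinate system."""
    return np.linalg.solve(K_mic, m)   # more stable than forming the inverse

# Illustrative microlens intrinsics (assumed values):
f_mic, u0, v0 = 0.025, 5.0, 5.0
K_mic = np.array([[f_mic, 0.0,   u0],
                  [0.0,   f_mic, v0],
                  [0.0,   0.0,   1.0]])
print(back_project(K_mic, np.array([4.0, 6.0, 1.0])))  # -> (-40, 40, 1)
```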

Because the microlens \((k,l)\) is parallel to its image plane \(\mathbf {U}_{mic(k,l)}\), by the representation of the 4D light field, the homogeneous representation of the light field corresponding to the image point \(\mathbf {m}^{i}_{mic(k,l)}\) is

$$\begin{aligned} L_{m^{i}_{mic(k,l)}} = (0_{mic(k,l)},0_{mic(k,l)},\frac{u_{i}-u^{0}_{mic}}{f_{mic}},\frac{v_{i}-v^{0}_{mic}}{f_{mic}},1)^{T}, \end{aligned}$$
(2)

where \(\mathbf {O}_{mic(k,l)}=(0_{mic(k,l)},0_{mic(k,l)},1)^{T}\), the optical center of the microlens \((k,l)\), records the position of the ray, and \(\mathbf {M}^{i}_{mic(k,l)} \) records its direction.

The light field \(L_{m^{i}_{mic(k,l)}}\) (Eq. (2)) is equivalent to

$$\begin{aligned} L_{m^{i}_{mic(k,l)}} =\left( \begin{array}{ccccc} 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0\\ 0 & 0 & \frac{1}{f_{mic}} & 0 & -\frac{u^{0}_{mic}}{f_{mic}}\\ 0 & 0 & 0 & \frac{1}{f_{mic}} & -\frac{v^{0}_{mic}}{f_{mic}}\\ 0 & 0 & 0 & 0 & 1 \end{array}\right) \left( \begin{array}{c} 0_{mic(k,l)}\\ 0_{mic(k,l)}\\ u_{i}\\ v_{i}\\ 1 \end{array}\right) =\mathbf {L}(\mathbf {K}_{mic})\times \mathbf {n}^{i}_{mic(k,l)}. \end{aligned}$$
(3)

In the following, we set up a microlens array coordinate system, as shown in Fig. 4. Denote the light field on the microlens array by \(L_{mic}\) and the light field on the microlens \((k,l)\) by \(L_{mic(k,l)}\). Assume that the distance between the optical centers of two adjacent microlenses along \(X_{mic}\) is d; then the optical center \(\mathbf {O}_{(k,l)}\) of the microlens \((k,l)\) under the microlens array coordinate system is

$$\begin{aligned} \mathbf {O}_{(k,l)}=\left( \begin{array}{c} (k-\frac{W+1}{2})d\\ (l-\frac{W^{'}+1}{2})d\\ 1 \end{array}\right) =\left( \begin{array}{ccc} d & 0 & -\frac{W+1}{2}d\\ 0 & d & -\frac{W^{'}+1}{2}d\\ 0 & 0 & 1\end{array}\right) \left( \begin{array}{c} k\\ l\\ 1 \end{array}\right) , \end{aligned}$$
(4)
since \(k = 1,2,\ldots ,W\) and \(l = 1,2,\ldots ,W^{'}\) (cf. Sect. 4).

Fig. 4. The coordinate system on the microlens array.

Therefore, in the light field \(L_{mic}\), the light field corresponding to the image point \(\mathbf {m}^{i}_{mic(k,l)}\) becomes

$$\begin{aligned} L_{m^{i}_{mic}} = \mathbf {V}_{mic}(\mathbf {O}_{mic(k,l)})\times \mathbf {L}(\mathbf {K}_{mic})\times \mathbf {n}^{i}_{mic(k,l)}, \end{aligned}$$
(5)

where

$$\begin{aligned} \mathbf {V}_{mic}(\mathbf {O}_{mic(k,l)}) = \left( \begin{array}{ccccc} 1 & 0 & 0 & 0 & (k-\frac{W+1}{2})d\\ 0 & 1 & 0 & 0 & (l-\frac{W^{'}+1}{2})d\\ 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 & 1 \end{array}\right) . \end{aligned}$$

Denoting \( \hat{\mathbf {n}}^{i}_{mic(k,l)}= (k,l,u_{i},v_{i},1)^{T}\), by Eqs. (3) and (4), Eq. (5) is equivalent to

$$\begin{aligned} L_{m^{i}_{mic}}= \left( \begin{array}{ccccc} d & 0 & 0 & 0 & -\frac{W+1}{2}d\\ 0 & d & 0 & 0 & -\frac{W^{'}+1}{2}d\\ 0 & 0 & \frac{1}{f_{mic}} & 0 & -\frac{u^{0}_{mic}}{f_{mic}}\\ 0 & 0 & 0 & \frac{1}{f_{mic}} & -\frac{v^{0}_{mic}}{f_{mic}}\\ 0 & 0 & 0 & 0 & 1 \end{array}\right) \left( \begin{array}{c} k\\ l\\ u_{i}\\ v_{i}\\ 1\end{array}\right) =\mathbf {K}_{light}\times \hat{\mathbf {n}}^{i}_{mic(k,l)}, \end{aligned}$$
(6)

where k and l are the indices of the microlens through which the ray passes, and \(\mathbf {K}_{light}\) is called the intrinsic matrix of the Lytro light field camera.
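The following sketch assembles \(\mathbf {K}_{light}\) of Eq. (6) and decodes a raw pixel \((k, l, u_i, v_i)\) into its light field coordinates; the pitch, focal length, principal point and array size are illustrative assumptions.

```python
import numpy as np

def build_K_light(d, f_mic, u0, v0, W, W_prime):
    """Intrinsic matrix of Eq. (6): maps (k, l, u_i, v_i, 1)^T to the
    homogeneous light field of the corresponding ray."""
    return np.array([
        [d,   0.0, 0.0,         0.0,         -(W + 1) / 2 * d],
        [0.0, d,   0.0,         0.0,         -(W_prime + 1) / 2 * d],
        [0.0, 0.0, 1.0 / f_mic, 0.0,         -u0 / f_mic],
        [0.0, 0.0, 0.0,         1.0 / f_mic, -v0 / f_mic],
        [0.0, 0.0, 0.0,         0.0,          1.0],
    ])

# Illustrative parameters (assumed): 10x10 lens array, 0.05 mm pitch,
# 0.025 mm microlens focal length, principal point at pixel (5, 5).
K_light = build_K_light(d=0.05, f_mic=0.025, u0=5.0, v0=5.0, W=10, W_prime=10)
n_hat = np.array([3, 7, 4.0, 6.0, 1.0])   # (k, l, u_i, v_i, 1)^T
L = K_light @ n_hat                       # position (x, y) and direction (s, t)
print(L)
```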

4 Calibration of Lytro Light Field

In this section, based on the intrinsic matrix \(\mathbf {K}_{light}\) of the Lytro light field camera, we propose a calibration method using a checkerboard pattern with known dimensions.

For the microlens \((k,l)\), let \(\mathbf {m}^{ij}_{mic(k,l)}\) be the extracted corner points on the checkerboard image plane \(\mathbf {U}_{mic(k,l)}\), where \(i = 1,2,\ldots ,N_{mic(k,l)}\), \(j = 1,2,\ldots ,\hat{N}\), \(k = 1,2,\ldots ,W\) and \(l=1,2,\ldots ,W^{'}\). Here \(N_{mic(k,l)}\) is the number of extracted corner points on the checkerboard image plane \(\mathbf {U}_{mic(k,l)}\), and \(\hat{N}\) is the number of checkerboard images taken by the Lytro light field camera. According to the camera calibration method proposed in [23], we can obtain the \(W \times W^{'} \times \hat{N}\) homography matrices \(\mathbf {H}^{j}_{mic(k,l)}\) and the corresponding \(W \times W^{'} \times \hat{N}\) sets of equations on the intrinsic parameters \(\mathbf {K}_{mic}\). Then, via an SVD, the intrinsic matrix \(\mathbf {K}_{mic}\) can be estimated.
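The homography-constraint step of [23] can be sketched as follows: each homography contributes two linear constraints on the entries of \(\mathbf {B} = \mathbf {K}_{mic}^{-T}\mathbf {K}_{mic}^{-1}\), the stacked system is solved by an SVD, and the intrinsic parameters are read off in closed form. This is a generic sketch of Zhang's method under the zero-skew assumption stated above, not the paper's exact implementation.

```python
import numpy as np

def v_ij(H, i, j):
    """Constraint vector built from columns i and j of a homography."""
    h_i, h_j = H[:, i], H[:, j]
    return np.array([h_i[0]*h_j[0],
                     h_i[0]*h_j[1] + h_i[1]*h_j[0],
                     h_i[1]*h_j[1],
                     h_i[2]*h_j[0] + h_i[0]*h_j[2],
                     h_i[2]*h_j[1] + h_i[1]*h_j[2],
                     h_i[2]*h_j[2]])

def intrinsics_from_homographies(Hs):
    """Estimate K from >= 3 homographies by solving Vb = 0 with an SVD."""
    V = np.array([row for H in Hs
                  for row in (v_ij(H, 0, 1), v_ij(H, 0, 0) - v_ij(H, 1, 1))])
    b = np.linalg.svd(V)[2][-1]        # null-space direction of V
    if b[0] < 0:                       # fix the SVD sign ambiguity
        b = -b
    B11, B12, B22, B13, B23, B33 = b
    v0 = (B12*B13 - B11*B23) / (B11*B22 - B12**2)
    lam = B33 - (B13**2 + v0*(B12*B13 - B11*B23)) / B11
    alpha = np.sqrt(lam / B11)
    beta = np.sqrt(lam * B11 / (B11*B22 - B12**2))
    u0 = -B13 * alpha**2 / lam         # skew assumed 0, as in the text
    return np.array([[alpha, 0.0,  u0],
                     [0.0,   beta, v0],
                     [0.0,   0.0,  1.0]])
```

In practice the per-view homographies could be obtained, e.g., with cv2.findHomography from the extracted corners before calling this routine.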

Furthermore, based on the estimated intrinsic matrix \(\mathbf {K}_{mic}\), we can compute the rotation matrices \(\mathbf {R}^{j}_{mic(k,l)}\) and the translation vectors \(\mathbf {t}^{j}_{mic(k,l)}\). The distance d between the optical centers of two adjacent microlenses along \(X_{mic}\) can then be estimated as

$$\begin{aligned} d = \frac{1}{\hat{N} \times (W-1) \times W^{'}}\sum ^{\hat{N}}_{j=1}\sum ^{W^{'}}_{l=1}\sum ^{W-1}_{k=1} \Vert \mathbf {t}^{j}_{mic(k,l)}-\mathbf {t}^{j}_{mic(k+1,l)} \Vert . \end{aligned}$$
(7)
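Assuming the per-microlens translations have been stacked into a single array, Eq. (7) reduces to one vectorized line; the array layout below is an assumption made for the sketch.

```python
import numpy as np

def estimate_pitch(t_mic):
    """Eq. (7): average distance between optical centers of horizontally
    adjacent microlenses. t_mic has shape (N_hat, W, W_prime, 3), where
    t_mic[j, k-1, l-1] is t^j_mic(k,l)."""
    diffs = np.linalg.norm(t_mic[:, 1:] - t_mic[:, :-1], axis=-1)
    return diffs.mean()   # divides by N_hat * (W-1) * W_prime, as in Eq. (7)
```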

The method to calibrate the Lytro light field camera is outlined as follows:

  • Step 1: Take \(\hat{N}\) images of the checkerboard under different orientations by moving either the plane or the Lytro light field camera;

  • Step 2: For each checkerboard image, extract the corner points \(\mathbf {m}^{ij}_{mic(k,l)}\), where \(i = 1,2,\ldots ,N_{mic(k,l)}\), \(j = 1,2,\ldots ,\hat{N}\), \(k = 1,2,\ldots ,W\) and \(l=1,2,\ldots ,W^{'}\);

  • Step 3: Estimate the intrinsic matrix \(\mathbf {K}_{mic}\), the rotation matrix \(\mathbf {R}^{j}_{mic(k,l)}\) and the translation vector \(\mathbf {t}^{j}_{mic(k,l)}\) by the method in [23];

  • Step 4: Decode the image points \(\mathbf {m}^{ij}_{mic(k,l)}\) into the 4D light field coordinates \(\hat{\mathbf {n}}^{i}_{mic(k,l)}\) by the method in [8];

  • Step 5: Obtain the distance d between the optical centers of two adjacent microlenses along \(X_{mic}\) by Eq. (7);

  • Step 6: Assemble the intrinsic matrix \(\mathbf {K}_{light}\) of the Lytro light field camera by Eq. (6);

  • Step 7: Initialize the light field camera pose \(\mathbf {T}^{j} = (\mathbf {R}^{j},\mathbf {t}^{j})\) as follows:

    $$\begin{aligned} \mathbf {R}^{j} = \frac{1}{W \times W^{'}} \sum ^{W^{'}}_{l=1}\sum ^{W}_{k=1}\mathbf {R}^{j}_{mic(k,l)}, \mathbf {t}^{j} = \frac{1}{W \times W^{'}} \sum ^{W^{'}}_{l=1}\sum ^{W}_{k=1}\mathbf {t}^{j}_{mic(k,l)}; \end{aligned}$$
    (8)
  • Step 8: Optimize the intrinsic matrix \(\mathbf {K}_{light}\) and the camera poses \(\mathbf {T}^{j}\), \(j = 1,2,\ldots ,\hat{N}\), by minimizing the following objective function (a sketch of this optimization is given after the list):

    $$\begin{aligned} \mathop {\mathrm {arg\,min}}\limits _{\mathbf {K}_{light},\mathbf {T}} \sum _{j=1}^{\hat{N}}\sum _{i=1}^{N_{mic(k,l)}} \Vert \phi _{c}^{i}(\mathbf {K}_{light},\mathbf {T}^{j}),\mathbf {P}^{ij}\Vert ^{ray}, \end{aligned}$$

    where \(\mathbf {P}^{ij}\) is the known 3D feature location on the checkerboard grid, \(\mathbf {T} = (\mathbf {T}^{1}, \mathbf {T}^{2},\ldots ,\mathbf {T}^{\hat{N}})^{T}\) collects the light field camera poses, and \(\Vert \cdot \Vert ^{ray}\) is the ray reprojection error described in [8].
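A SciPy sketch of Step 8 is given below; `unpack` is a hypothetical helper that rebuilds \(\mathbf {K}_{light}\) and the poses from a flat parameter vector, and the residual uses a point-to-ray distance as a stand-in for the exact ray reprojection error of [8].

```python
import numpy as np
from scipy.optimize import least_squares

def ray_point_error(L, R, t, P):
    """Perpendicular distance from checkerboard point P (world frame,
    mapped into the camera frame by pose (R, t)) to the decoded ray L."""
    origin = np.array([L[0], L[1], 0.0])         # (x, y) on the lens plane
    direction = np.array([L[2], L[3], 1.0])
    direction /= np.linalg.norm(direction)
    w = R @ P + t - origin
    return np.linalg.norm(w - np.dot(w, direction) * direction)

def residuals(params, observations, unpack):
    """One residual per observed corner; `unpack` (a hypothetical helper)
    rebuilds K_light and the poses T^j from the flat parameter vector."""
    K_light, poses = unpack(params)
    return np.array([ray_point_error(K_light @ n_hat, *poses[j], P)
                     for j, n_hat, P in observations])

# sol = least_squares(residuals, x0, args=(observations, unpack))
```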

5 Experiments

In this section, we perform a number of experiments with real images to evaluate the correctness of the imaging model of the Lytro light field camera and the performance of our calibration algorithm.

First, a checkerboard grid is prepared, whose cell size is \(7.22\,\mathrm{mm} \times 7.22\,\mathrm{mm}\) and whose board size is \(19 \times 19\). As shown in Fig. 5, the checkerboard images are captured under different orientations by moving the plane or the Lytro light field camera; they are downloaded from http://www-personal.acfr.usyd.edu.au/ddan1654/PlenCalCVPR2013DatasetC.zip. Twelve checkerboard images are used to calibrate the intrinsic matrix of the Lytro light field camera.

Fig. 5. Twelve real images of the checkerboard captured by the Lytro light field camera.

Next, using the calibration algorithm proposed in Sect. 4, the intrinsic matrix \(\mathbf {K}_{light}\) of the Lytro light field camera and the camera poses \(\mathbf {T}^{j}\), \(j = 1,2,\ldots ,\hat{N}\), are estimated. Figure 6 shows the estimated light field camera poses: panel (a) shows the initial estimation by Eq. (8), and panel (b) shows the optimized estimation obtained by minimizing the ray reprojection error.

Fig. 6. (a) The gray denotes the initial estimation of the Lytro camera pose; (b) the green denotes the optimized estimation of the Lytro camera pose. (Color figure online)

Finally, integrating the lens distortion model proposed in [8], the estimated intrinsic matrix \(\mathbf {K}_{light}\) is used to rectify the light field images, as shown in Figs. 7 and 8. Figure 7(a) shows the initial image of the checkerboard grid captured by the Lytro light field camera, and Fig. 7(b) shows the rectified image based on the estimated intrinsic matrix \(\mathbf {K}_{light}\). Figure 8(a) is the initial light field image captured in a natural environment, and Fig. 8(b) is the corresponding rectified image. From Figs. 7 and 8, and especially from the enlarged part of Fig. 8, it can be seen that the initial light field images are well rectified. Therefore, our imaging model of the Lytro light field camera is correct and the proposed calibration algorithm is effective.

Fig. 7. (a) The initial image of the checkerboard grid captured by the Lytro light field camera; (b) the rectified image of the checkerboard grid using the estimated intrinsic matrix.

Fig. 8. (a) The initial light field image captured in a natural environment; (b) the rectified image obtained through the estimated intrinsic matrix.

As noted above, all microlenses in the array can be assumed to share the same parameters, and some of the intrinsic parameters are known in advance: the aspect ratio \(r_{mic}\) is 1, the skew factor \(s_{mic}\) is 0, and the principal point \(\mathbf {p}_{mic} = (u^{0}_{mic},v^{0}_{mic},1)^{T}\) can be estimated by the image center. Under these assumptions, our calibration algorithm is compared with the method proposed in [8]. The SSE (sum of squared errors) and RMSE (root mean square error) of the ray reprojection errors are shown in Table 1. The comparison verifies the correctness of our imaging model of the Lytro light field camera and the effectiveness of the proposed calibration algorithm. Although our SSE and RMSE ray reprojection errors are almost the same as those of [8], our imaging model and calibration method have two advantages: the imaging model is easy to understand, and the intrinsic matrix has fewer parameters than that of [8]. When the aspect ratio is 1 and the skew factor is 0, our derived intrinsic matrix has only 4 parameters, whereas the intrinsic matrix of the Lytro light field camera proposed in [8] has 12 parameters.

Table 1. Comparison of the SSE and RMSE ray reprojection errors of our method and [8].

6 Conclusions

In this paper, we model the imaging process of the Lytro light field camera and propose a calibration algorithm based on the checkerboard grid. Firstly, according to the imaging process of the Lytro light field camera, the relationship between each pixel and its corresponding light field is derived. Secondly, the homogeneous intrinsic matrix of the Lytro light field camera is established. Thirdly, based on this intrinsic matrix, a calibration algorithm using the checkerboard grid is proposed. Then, 12 checkerboard images under different orientations are used to estimate the intrinsic matrix. Finally, the camera poses and the rectified images obtained from the estimated intrinsic matrix are used to evaluate the correctness of our imaging model and the effectiveness of our calibration method. The experimental results show that our imaging model of the Lytro light field camera is correct and the proposed calibration algorithm is effective.