1 Introduction

Multimedia technology has undergone an unprecedented evolution in the new century, especially in the field of content display. Stereoscopic 3D (S3D), with its immersive 3D experience, has become a mainstream consumer product in both cinema and the home [1]. However, the indispensable glasses remain an unbridgeable gap that restricts its further popularization in daily life. Multiview autostereoscopic display (MAD) supports motion-parallax viewing within a limited range and has the advantage of being glasses-free [2].

Considering existing device capabilities, it is unrealistic to directly capture, store, and transmit the huge amount of data required by multiple views. Based on the multiview-plus-depth (MVD) 3D format, C. Fehn proposed depth-image-based rendering (DIBR) as a view synthesis method that can generate N views from M views (M < N) [3]. The final projection location is obtained by mapping each pixel from the image coordinate system to the world coordinate system and back, using the camera intrinsic and extrinsic parameters together with the depth maps. However, inaccurate depth estimation lowers the synthesis quality. Moreover, this discrete projection can easily produce occlusion and disocclusion regions.

To address the problems exposed by DIBR, Disney Research developed a view synthesis scheme called Image-Domain-Warping (IDW) [4], based on a nonlinear disparity mapping algorithm [5], which performs the mapping directly in image space. The original image is covered by a regular grid mesh \( G(V,E,Q) \) with vertices \( V \), edges \( E \), and quads \( Q \) [6]. A global optimization yields a warping function that preserves the spatial structure of salient scene content by concentrating the deformation in homogeneous areas, a strategy widely used in image retargeting [8]. Because the mapping is continuous, IDW does not produce the holes that DIBR does. Nonetheless, the Lucas-Kanade (L-K) method [11] used for stereo matching of feature points is time-consuming, and an obvious stretch effect appears on the border of the synthesized view when the disparity is large.

Inspired by IDW, this paper proposes a simpler warping scheme that aims to reduce the computational complexity of the energy equation and to remove the stretch effect on the image border. SIFT [12] feature points are used only for stereo matching, and moving least squares (MLS) propagates every feature point to every grid vertex with a distance-based weight, making full use of the SIFT information. We also introduce a grid-line constraint [8] instead of extracting additional vertical edge-points. Finally, a novel blending algorithm is proposed to remove the stretch effect on the left or right image border. The resulting view synthesis method achieves suitable visual quality with high efficiency.

2 Proposed Method

To begin, a brief explanation of the basic IDW theory is necessary. It is worth noting that IDW covers the original image with a regular mesh \( G \), reducing the solution space to the grid vertices \( V \) so that the resulting system of linear equations does not become too large. Unlike DIBR, which relies on dense depth estimation, IDW uses sparse feature points \( P \) extracted from the reference image; these points specify where the corresponding content should appear in the synthesized image. A saliency map \( S \) is also used to steer the deformation into non-salient regions. An energy equation \( E(w) \) is defined that consists of three constraints: a data term \( E_{d} \), a spatial smoothness term \( E_{s} \), and a temporal smoothness term \( E_{t} \), where \( w \) denotes the warp function defined at each grid vertex. Each term is weighted by a parameter:

$$ E(w) = \lambda_{d} E_{d} (w) + \lambda_{s} E_{s} (w) + \lambda_{t} E_{t} (w) $$
(1)

By minimizing the energy function, the warp defined at the regular grid vertices is computed. For non-grid positions, the warp is obtained by bilinear interpolation. Finally, the synthesized image is rendered using the computed warp function.
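
For illustration, the following sketch (in Python with NumPy; the grid layout, array shapes, and the quad size `cell` are assumptions of this example rather than details specified by IDW) shows how a warp stored at regular grid vertices can be bilinearly interpolated at an arbitrary pixel.

```python
import numpy as np

def interpolate_warp(x, y, grid_w, cell):
    """Bilinearly interpolate the warp at a non-grid pixel (x, y).

    grid_w: (rows, cols, 2) array of warped positions of the regular grid vertices.
    cell:   width/height of one grid quad in pixels (assumed square here).
    """
    rows, cols = grid_w.shape[:2]
    c, r = x / cell, y / cell
    c0 = min(int(c), cols - 2)           # clamp so the 2x2 neighborhood stays in range
    r0 = min(int(r), rows - 2)
    a, b = c - c0, r - r0                # fractional position inside the quad
    top = (1 - a) * grid_w[r0, c0] + a * grid_w[r0, c0 + 1]
    bot = (1 - a) * grid_w[r0 + 1, c0] + a * grid_w[r0 + 1, c0 + 1]
    return (1 - b) * top + b * bot       # warped position of (x, y)
```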

In this paper, building on the IDW framework described above, we propose an improved view synthesis scheme, as shown in Fig. 1. After extracting the feature points and the saliency map from the input images, we use moving least squares (MLS) to obtain the initial warp position of each grid vertex, making full use of the feature points. We also add a grid-line constraint to keep the grid lines from over-bending. Iterative optimization is then applied to obtain the final warp. Image blending is essential to remove the stretch effect at the image border. Each step is described in the following sections.

Fig. 1. Overview of the proposed method

2.1 Image Warping

MLS is a classical algorithm widely used in image deformation [13, 14]. A set of control points and their positions after deformation is specified in advance, and the task is to find the deformed location of every other point in the image.

In our scheme, with the left and right views as input images, our goal is to obtain suitable warps \( w_{l} \) and \( w_{r} \) for the two reference images, respectively (hereinafter we use \( w \) as a unified notation). First, a sparse SIFT point set \( P \) is obtained, with outliers removed by RANSAC [15]; the locations of the matched pairs indicate the disparity between corresponding points. We use this point set \( P(p_{l} , \, p_{r} ) \) as the control points of the MLS algorithm. For a view located midway between the two input views, its disparity with respect to the left or right view is:

$$ d = \frac{{p_{r} - p_{l} }}{2} $$
(2)

So the deformation location \( Q(q_{l} , \, q_{r} ) \) can be calculated as:

$$ q_{l} = p_{l} + d $$
(3)
$$ q_{r} = p_{r} - d $$
(4)

Because our warp \( w \) is defined at the grid vertices, we need to propagate the information carried by the feature points to the mesh vertices in order to obtain the final warp function. Following MLS, we solve for the best \( w \) at every grid vertex such that every \( p \) is warped to its \( q \). The data term \( E_{d} \) can be formulated as:

$$ E_{d} (w) = \sum\limits_{i = 1}^{V} {\sum\limits_{j = 1}^{P} {f_{ij} } } \left\| {w_{i} (p_{j} ) - q_{j} } \right\|^{2} $$
(5)

where \( p_{j} \) and \( q_{j} \) are the original and target feature points, and \( f_{ij} \) is the weight with which each \( p_{j} \) contributes to vertex \( v_{i} \):

$$ f_{ij} = \frac{1}{{\left\| {p_{j} - v_{i} } \right\|^{2} }} $$
(6)

In particular, we apply the MLS-based image deformation algorithm to obtain the initial warp function instead of adding it to the energy equation alongside the other constraints. This choice not only makes full use of the efficiency of MLS, but also keeps the system of linear equations small. The rigorous mathematical derivation is given in [14].
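
The sketch below illustrates this initialization under a simplifying assumption: instead of the full affine/similarity MLS deformation of [14], each vertex simply takes the weighted average displacement of all control points, using the inverse-square-distance weights of Eq. (6).

```python
import numpy as np

def initial_vertex_warp(vertices, p, q, eps=1e-8):
    """Propagate feature-point displacements to grid vertices (translation-only MLS sketch).

    vertices: (V, 2) regular grid vertex positions.
    p, q:     (N, 2) control points and their target positions from Eqs. (2)-(4).
    """
    vertices = np.asarray(vertices, dtype=float)
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    warped = vertices.copy()
    for i, v in enumerate(vertices):
        f = 1.0 / (np.sum((p - v) ** 2, axis=1) + eps)        # Eq. (6): inverse-square-distance weights
        warped[i] = v + (f[:, None] * (q - p)).sum(axis=0) / f.sum()  # weighted average displacement
    return warped
```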

In [2, 4, 16], besides SIFT points, vertical edge-points are introduced because they represent the image structure well; L-K can estimate their disparities accurately but is time-consuming. To protect the spatial structure of the image without excessive computation, we instead introduce a grid-line energy term that prevents the grid lines from bending severely, since salient objects may occupy multiple connected quads [8]. This term preserves edge orientations well. Figure 2 gives a comparative result: Fig. 2(a) suffers from serious edge deformation, especially around the holder of the second ball, whereas Fig. 2(b) keeps the holder free of bending deformation thanks to the added grid-line constraint.

Fig. 2. Comparison of whether to add the grid line constraint; (a) Without grid line constraint; (b) Grid line constraint added.

We formulate the grid line constraint term as

$$ E_{l} (w) = \sum\limits_{E} {S_{ij} } \left\| {w(v_{i} ) - w(v_{j} ) - l_{ij} (v_{i} - v_{j} )} \right\|^{2} $$
(7)
$$ l_{ij} = \frac{{\left\| {v_{i}^{{\prime }} - v_{j}^{{\prime }} } \right\|}}{{\left\| {v_{i} - v_{j} } \right\|}} $$
(8)
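
A small sketch of how Eq. (7) can be evaluated, under the reading of Eq. (8) in which \( l_{ij} \) is the length ratio between the warped edge (taken from the current, e.g. previous-iteration, vertex estimate) and the original edge, so that edges may scale but are penalized for bending:

```python
import numpy as np

def grid_line_energy(w_v, v, edges, saliency, eps=1e-8):
    """Grid-line term of Eq. (7).

    w_v:      (V, 2) current warped vertex positions w(v).
    v:        (V, 2) original vertex positions.
    edges:    iterable of (i, j) index pairs, the grid edges E.
    saliency: per-edge saliency weights S_ij.
    """
    energy = 0.0
    for (i, j), s in zip(edges, saliency):
        e_orig = v[i] - v[j]
        e_warp = w_v[i] - w_v[j]
        l_ij = np.linalg.norm(e_warp) / (np.linalg.norm(e_orig) + eps)  # Eq. (8): length ratio
        r = e_warp - l_ij * e_orig                                      # bending residual of Eq. (7)
        energy += s * float(np.dot(r, r))
    return energy
```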

After the energy equation has been constructed, \( E(w) \) is minimized by solving the resulting linear equations iteratively. The iteration terminates when the vertex movements with respect to the previous iteration are all smaller than 0.5.
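
The outer loop can be sketched as follows; `solve_once` is a hypothetical helper (not part of the method's description) that re-linearizes the energy around the current vertices (e.g., recomputing \( l_{ij} \)) and solves the resulting sparse least-squares system.

```python
import numpy as np

def optimize_warp(v_init, solve_once, tol=0.5, max_iter=50):
    """Iteratively minimize E(w); stop when every vertex moves less than `tol` (0.5 in the paper)."""
    v_cur = np.asarray(v_init, dtype=float)
    for _ in range(max_iter):
        v_new = solve_once(v_cur)                                # one linearized least-squares solve
        if np.max(np.linalg.norm(v_new - v_cur, axis=1)) < tol:
            return v_new                                         # converged
        v_cur = v_new
    return v_cur
```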

2.2 Image Blending

Because the adjacent camera positions have different fields of view, a stretch effect can occur on the right or left border if only the left or only the right image is used for warping. This paper introduces a novel blending method that uses the border information of one synthesized image to cover the stretched region in the other. The basic idea of the blending algorithm is illustrated in Fig. 3.

Fig. 3. Image blending process

First, two blocks are cut from the matched and matching images, respectively, based on the relevant disparity relations. Then a cell \( C \) with the maximal gradient value is found in block \( B \). Using \( C \) as a template slid over \( B^{{\prime }} \), we search for the region \( C^{{\prime }} \) with the maximal correlation with \( C \). The correlation level \( Corrs \) is calculated as:

$$ Corrs = \frac{{\sum\limits_{i = 1}^{n} {\sum\limits_{j = 1}^{n} {C(i,j) \times C'\left( {i,j} \right)} } }}{{\left( {\sum\limits_{i = 1}^{n} {\sum\limits_{j = 1}^{n} {C(i,j)^{2} } } \sum\limits_{i = 1}^{n} {\sum\limits_{j = 1}^{n} {C'(i,j)^{2} } } } \right)^{1/2} }} $$
(9)

After finding the best matching block, we stitch it into the original synthesized result according to the positional relations between the two blocks and their images.
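
A minimal sketch of this template search (Python with NumPy; the exhaustive sliding-window loop is an illustrative choice, not necessarily how the method is implemented):

```python
import numpy as np

def best_match(C, B_prime, eps=1e-8):
    """Slide template cell C over block B' and return the top-left corner maximizing Eq. (9)."""
    C = np.asarray(C, dtype=float)
    B_prime = np.asarray(B_prime, dtype=float)
    n = C.shape[0]
    H, W = B_prime.shape
    c_norm = np.sqrt(np.sum(C ** 2))
    best_score, best_pos = -1.0, (0, 0)
    for y in range(H - n + 1):
        for x in range(W - n + 1):
            Cp = B_prime[y:y + n, x:x + n]
            corrs = np.sum(C * Cp) / (c_norm * np.sqrt(np.sum(Cp ** 2)) + eps)  # Eq. (9)
            if corrs > best_score:
                best_score, best_pos = corrs, (y, x)
    return best_pos, best_score
```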

3 Experimental Results

In the experiment, we test two video sequences, one with a resolution of 1024 × 768 and the other 1280 × 960. The corresponding mesh resolutions are 30 × 40 and 40 × 50, which are much smaller than IDW's 180 × 100. In the blending process, the cell size is set to 10 × 10. Since no public implementation of IDW is available, we implemented the scheme according to our own understanding. To evaluate the performance, we use the traditional DIBR method for comparison.

The synthesized results are shown in Figs. 4 and 5. It can be seen that traditional DIBR suffers from a blurring effect caused by overlaying the left and right synthesized views, whereas our results achieve better visual quality thanks to the optimized energy equation, especially in the regions marked by red rectangles.

Fig. 4. The Balloons results. The first three are from the DIBR method: (a) Left mapping result; (b) Right mapping result; (c) Overlay of (a) and (b) at the pixel level. The following two are from our proposed method: (d) Left warping result; (e) Right warping result. (f) Ground truth; (g) Partial enlarged region of (c); (h) Partial enlarged region of (d); (i) Partial enlarged region of (e). (Color figure online)

Fig. 5. The Champagne Tower results. The first three are from the DIBR method: (a) Left mapping result; (b) Right mapping result; (c) Overlay of (a) and (b) at the pixel level. The following two are from our proposed method: (d) Left warping result; (e) Right warping result. (f) Ground truth; (g) Partial enlarged region of (c); (h) Partial enlarged region of (d); (i) Partial enlarged region of (e). (Color figure online)

4 Conclusion

In this paper, we have proposed a simple but efficient method for synthesizing a middle view from S3D inputs based on IDW. The first contribution is obtaining the initial warp positions by MLS-based image deformation. The second is the introduction of a grid-line energy term into the energy equation. Finally, we apply a novel image blending algorithm to remove the border stretch deformation. The experimental results demonstrate that the proposed method can generate synthesized images that meet human visual comfort. In the future, we plan to study IDW further and to introduce depth maps into our method, since a depth map strongly represents the spatial structure of the image.