DOI: 10.1145/3394171.3413974

All-in-depth via Cross-baseline Light Field Camera

Published: 12 October 2020

Abstract

Light-field (LF) cameras hold great promise for passive, general-purpose depth estimation thanks to their high angular resolution, yet their small baseline limits accuracy in distant regions. Conversely, a stereo solution with a large baseline handles distant scenes well, but its limited angular resolution becomes problematic for near objects. Aiming at an all-in-depth solution, we propose a cross-baseline LF camera that pairs a commercial LF camera with a monocular camera, which together naturally form a 'stereo camera' providing a compensating baseline for the LF camera. The idea is simple yet non-trivial, owing to the significant gaps in angular resolution and baseline between the LF and stereo cameras.
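
Why baseline governs far-range accuracy follows from standard triangulation geometry; the relation below is the textbook error model, not a formula taken from the paper. For a rectified pair with baseline $B$ and focal length $f$ (in pixels), depth $z$ and disparity $d$ satisfy

$$z = \frac{fB}{d}, \qquad |\delta z| \approx \left|\frac{\partial z}{\partial d}\right|\,\delta d = \frac{z^{2}}{fB}\,\delta d,$$

so a fixed disparity error $\delta d$ yields a depth error that grows quadratically with $z$ and shrinks with $B$: a plenoptic camera's millimeter-scale baseline is quickly overwhelmed at tens of meters, which is exactly what the added monocular camera's large cross-baseline compensates.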
Fusing the two depth maps from the LF and stereo modules in the spatial domain is unreliable: it relies on the imprecisely predicted depth both to distinguish near from far range and to determine the fusion weights. Instead, adopting a unified representation of the LF sub-aperture views and the monocular view in the epipolar plane image (EPI) domain, we show that for each pixel, the minimum variance across different shearing degrees of the EPI estimates its depth with the highest fidelity. By minimizing this minimum variance, the depth error is minimized accordingly. The insight is that the minimum variance computed in the EPI domain has higher fidelity than the depth predicted in the spatial domain. Extensive experiments demonstrate the superiority of our cross-baseline LF camera in providing a high-quality all-in-depth map from 0.2 m to 100 m.
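
To make the minimum-variance cue concrete, here is a minimal NumPy sketch of depth-from-shear on a single horizontal EPI. It is an illustration under simplifying assumptions (Lambertian, occlusion-free pixels; uniformly spaced views), not the authors' implementation, and the function and parameter names are ours:

```python
import numpy as np

def depth_from_epi_variance(epi, shear_candidates):
    """For each pixel of the center view, pick the shear (disparity)
    that minimizes intensity variance across the views of one EPI.

    epi              : (V, X) epipolar-plane image (V views x X pixels).
    shear_candidates : iterable of candidate disparities (pixels/view).
    Returns (disparity, min_variance); the minimum variance is the
    fidelity score used to rank depth hypotheses.
    """
    V, X = epi.shape
    v = np.arange(V) - (V - 1) / 2.0      # signed view offsets from center
    xs = np.arange(X, dtype=float)
    best_var = np.full(X, np.inf)
    best_disp = np.zeros(X)
    for d in shear_candidates:
        # Resample each view along the line x + d*v: at the correct
        # disparity, a Lambertian point looks identical in every view.
        sheared = np.stack([
            np.interp(xs + d * vi, xs, epi[i], left=np.nan, right=np.nan)
            for i, vi in enumerate(v)
        ])
        var = np.nanvar(sheared, axis=0)  # cross-view variance per pixel
        better = var < best_var
        best_var[better] = var[better]
        best_disp[better] = d
    return best_disp, best_var
```

Because the LF sub-aperture views and the rectified monocular view share this EPI representation, the same minimum-variance score can rank depth hypotheses drawn from both the small-baseline and large-baseline samples.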

Supplementary Material

ZIP File (mmfp0446aux.zip)
Contains a single PDF file of supplementary material.
MP4 File (3394171.3413974.mp4)
Aiming at an all-in-depth solution, we propose a cross-baseline LF camera using a commercial LF camera and a monocular camera. Based on this camera system, we also propose an algorithm that computes the final all-in-depth map in the EPI domain. This video shows only an outline; see the full paper for details.


Published In

cover image ACM Conferences
MM '20: Proceedings of the 28th ACM International Conference on Multimedia
October 2020
4889 pages
ISBN:9781450379885
DOI:10.1145/3394171
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]


Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 12 October 2020


Author Tags

  1. EPI domain
  2. cross-baseline
  3. depth map
  4. light field

Qualifiers

  • Research-article

Funding Sources

  • Natural Science Foundation of China (NSFC)
  • Shenzhen Science and Technology Research and Development Funds

Conference

MM '20

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
