Abstract
In this paper, we propose a taxonomy of depth map creation methods used in multiview video compression. The focus is on approaches used in 3D video compression. 3D videos provide a more realistic and extended view, which yields more detail about the environment under study. Depth map based methods produce the best results in standard evaluations. We first review the process of creating three-dimensional video and then propose our taxonomy by classifying depth map based methods into three categories: a) hardware- or software-based depth estimation methods, b) level of algorithms and c) 2D/3D depth map production. Finally, software-based methods are compared using a video quality measure to evaluate the performance of some well-known approaches in this area. The most popular benchmarks used in the literature are also described in detail.
Notes
The authors can send details to interested readers via email.
Cite this article
Ayatollahi, S.M., Moghadam, A.M.E. & Hosseini, M.S. A taxonomy of depth map creation methods used in multiview video compression. Multimed Tools Appl 72, 1887–1909 (2014). https://doi.org/10.1007/s11042-013-1474-0
DOI: https://doi.org/10.1007/s11042-013-1474-0