A taxonomy of depth map creation methods used in multiview video compression

Multimedia Tools and Applications

Abstract

In this paper, we propose a taxonomy of depth map creation methods used in multiview video compression. The focus is on approaches used in 3D video compression. 3D video provides a more realistic and extended view, which yields more detail about the environment under study. Depth-map-based methods produce the best results in standard evaluations. We first review the process of creating three-dimensional video and then propose our taxonomy, classifying depth-map-based methods into three categories: a) hardware- or software-based depth estimation methods, b) level of algorithms, and c) 2D/3D depth map production. Finally, software-based methods are compared using a video quality measure to evaluate the performance of some well-known approaches in this area. The most popular benchmarks used in the literature are also described in detail.

Notes

  1. The authors can send details to interested readers via email.

Author information

Corresponding author

Correspondence to Seyed Morteza Ayatollahi.

About this article

Cite this article

Ayatollahi, S.M., Moghadam, A.M.E. & Hosseini, M.S. A taxonomy of depth map creation methods used in multiview video compression. Multimed Tools Appl 72, 1887–1909 (2014). https://doi.org/10.1007/s11042-013-1474-0

  • DOI: https://doi.org/10.1007/s11042-013-1474-0
