DOI: 10.1145/3281505.3281513
research-article

MR video fusion: interactive 3D modeling and stitching on wide-baseline videos

Published: 28 November 2018

Abstract

A major challenge facing camera networks today is how to effectively organize and visualize videos in the presence of complicated network connections and an overwhelming, ever-increasing amount of data. Previous work focuses on 2D stitching or dynamic projection onto 3D models, such as panoramas and the Augmented Virtual Environment (AVE), and has not provided an ideal solution. We present a novel method for fusing multiple videos in a 3D environment, which produces highly comprehensive imagery and yields a spatio-temporally consistent scene. Users first interact with a newly designed background model, called the video model, to register and stitch the videos' background frames offline. The method then fuses the offline results to render the videos in real time. We demonstrate our system on three real scenes, each of which contains dozens of wide-baseline videos. The experimental results show that our 3D modeling interface, built on the presented model and method, helps users integrate videos seamlessly and efficiently, with less operating complexity and a more accurate 3D environment than commercial off-the-shelf software. The proposed stitching method is much more robust to differences in position, orientation, and attributes among videos than state-of-the-art methods. More importantly, this study sheds light on how to use 3D techniques to solve 2D problems in realistic settings, and we validate its feasibility.
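
The abstract does not spell out how the offline registration feeds the real-time rendering stage. As a rough illustration only, and not the authors' implementation, the sketch below assumes a standard pinhole camera and projective texture lookup: an offline step caches texture coordinates for the vertices of a background-model quad, and an online step reuses them to sample each incoming frame. The names `Camera`, `project`, `register_quad`, and `sample` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Mat3 = Tuple[Vec3, Vec3, Vec3]

IDENTITY: Mat3 = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))


@dataclass(frozen=True)
class Camera:
    """Hypothetical registered camera (assumed pinhole model)."""
    K: Mat3  # intrinsics
    R: Mat3  # world-to-camera rotation
    t: Vec3  # world-to-camera translation


def mat_vec(m: Mat3, v: Vec3) -> Vec3:
    """Multiply a 3x3 matrix by a 3-vector."""
    return tuple(sum(m[i][j] * v[j] for j in range(3)) for i in range(3))


def project(cam: Camera, point: Vec3) -> Optional[Tuple[float, float]]:
    """Project a world point to pixel coordinates; None if behind the camera."""
    rc = mat_vec(cam.R, point)
    pc = (rc[0] + cam.t[0], rc[1] + cam.t[1], rc[2] + cam.t[2])
    if pc[2] <= 0.0:
        return None
    u, v, w = mat_vec(cam.K, pc)
    return (u / w, v / w)


def register_quad(cam: Camera, quad: Sequence[Vec3],
                  width: int, height: int) -> List[Tuple[float, float]]:
    """Offline step: cache normalized texture coordinates per model vertex."""
    coords = []
    for vertex in quad:
        pixel = project(cam, vertex)
        if pixel is None:
            raise ValueError("vertex is behind the camera")
        coords.append((pixel[0] / width, pixel[1] / height))
    return coords


def sample(frame: Sequence[Sequence[int]], uv: Tuple[float, float]) -> int:
    """Online step: nearest-neighbour lookup of a frame at cached coordinates."""
    rows, cols = len(frame), len(frame[0])
    col = min(cols - 1, max(0, int(uv[0] * cols)))
    row = min(rows - 1, max(0, int(uv[1] * rows)))
    return frame[row][col]


# Example with assumed intrinsics for a 640x480 camera at the world origin.
cam = Camera(K=((800.0, 0.0, 320.0), (0.0, 800.0, 240.0), (0.0, 0.0, 1.0)),
             R=IDENTITY, t=(0.0, 0.0, 0.0))
quad = [(-1.0, -1.0, 4.0), (1.0, -1.0, 4.0), (1.0, 1.0, 4.0), (-1.0, 1.0, 4.0)]
cached_uvs = register_quad(cam, quad, 640, 480)
```

Because the cameras are fixed, the cached coordinates only need to be computed once per camera; the per-frame work then reduces to texture lookups, which is one plausible reason an offline/online split makes real-time rendering feasible.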

Supplementary Material

ZIP File (a17-zhou.zip)
Supplemental files.
MP4 File (a17-zhou.mp4)


Information

Published In

VRST '18: Proceedings of the 24th ACM Symposium on Virtual Reality Software and Technology
November 2018
570 pages
ISBN:9781450360869
DOI:10.1145/3281505
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. augmented virtual environment
  2. immersive videos
  3. video fusion
  4. video popup
  5. video tourism

Qualifiers

  • Research-article

Conference

VRST '18

Acceptance Rates

Overall Acceptance Rate 66 of 254 submissions, 26%

Article Metrics

  • Downloads (Last 12 months): 24
  • Downloads (Last 6 weeks): 7
Reflects downloads up to 08 Mar 2025

Cited By

  • (2023) Web-based Mixed Reality Video Fusion with Remote Rendering. Virtual Reality & Intelligent Hardware 5:2 (188-199). DOI: 10.1016/j.vrih.2022.03.005. Online publication date: Apr-2023
  • (2023) Geometric-driven structure recovery from a single omnidirectional image based on planar depth map learning. Neural Computing and Applications 35:34 (24407-24433). DOI: 10.1007/s00521-023-09025-7. Online publication date: 3-Oct-2023
  • (2023) Mirror world: creating digital twins of the space and persons from video streamings. The Visual Computer 40:9 (6689-6704). DOI: 10.1007/s00371-023-03193-2. Online publication date: 28-Dec-2023
  • (2022) Real‐time fusion of multiple videos and 3D real scenes based on optimal viewpoint selection. Transactions in GIS 27:1 (198-223). DOI: 10.1111/tgis.13019. Online publication date: 26-Dec-2022
  • (2022) Model-guided 3D stitching for augmented virtual environment. Science China Information Sciences 66:1. DOI: 10.1007/s11432-021-3323-2. Online publication date: 29-Dec-2022
  • (2022) Fusing surveillance videos and three‐dimensional scene: A mixed reality system. Computer Animation and Virtual Worlds 34:1. DOI: 10.1002/cav.2129. Online publication date: 5-Dec-2022
  • (2021) Distortion-Aware Room Layout Estimation from A Single Fisheye Image. 2021 IEEE International Symposium on Mixed and Augmented Reality (ISMAR) (441-449). DOI: 10.1109/ISMAR52148.2021.00061. Online publication date: Oct-2021
