Abstract
We introduce a new representation for video which facilitates a number of common editing tasks. The representation has some of the power of a full reconstruction of 3D surface models from video, but is designed to be easy to recover from a priori unseen and uncalibrated footage. By modelling the image-formation process as a 2D-to-2D transformation from an object's texture map to the image, modulated by an object-space occlusion mask, we can recover a representation which we term the "unwrap mosaic". Many editing operations can be performed on the unwrap mosaic, and then re-composited into the original sequence, for example resizing objects, repainting textures, copying/cutting/pasting objects, and attaching effects layers to deforming objects.
Supplemental Material
- 2d3 Ltd., 2008. Boujou 4: The virtual interchangeable with the real. http://www.2d3.com.Google Scholar
- Agarwala, A., Dontcheva, M., Agrawala, M., Drucker, S. M., Colburn, A., Curless, B., Salesin, D., and Cohen, M. F. 2004. Interactive digital photomontage. ACM Trans. Graph. (Proc. of SIGGRAPH) 23, 3, 294--302. Google ScholarDigital Library
- Baker, S., Scharstein, D., Lewis, J. P., Roth, S., Black, M., and Szeliski, R. 2007. A database and evaluation methodology for optical flow. In Proc. ICCV.Google Scholar
- Bhat, P., Zitnick, C. L., Snavely, N., Agarwala, A., Agrawala, M., Cohen, M., Curless, B., and Kang, S. B. 2007. Using photographs to enhance videos of a static scene. In Eurographics Symposium on Rendering. Google ScholarDigital Library
- Black, M. J., and Anandan, P. 1993. A framework for the robust estimation of optical flow. In Proc. ICCV, 231--236.Google Scholar
- Blake, A., and Zisserman, A. 1987. Visual Reconstruction. MIT Press. Google ScholarDigital Library
- Boykov, Y., and Jolly, M.-P. 2001. Interactive graph cuts for optimal boundary and region segmentation of objects in n-D images. In Proc. ICCV, 105--112.Google Scholar
- Brand, M. 2001. Morphable 3D models from video. In Proc. CVPR, vol. 2, 456--463.Google Scholar
- Bregler, C., Hertzmann, A., and Biermann, H. 2000. Recovering non-rigid 3D shape from image streams. In Proc. CVPR, 690--696.Google Scholar
- Brown, M., and Lowe, D. G. 2007. Automatic panoramic image stitching using invariant features. Intl. J. Comput. Vision 74, 1, 59--73. Google ScholarDigital Library
- Brox, T., Bruhn, A., Papenberg, N., and Weickert, J. 2004. High accuracy optical flow estimation based on a theory for warping. In Proc. ECCV, 25--36.Google Scholar
- Bruhn, A., Weickert, J., and Schnörr, C. 2005. Lucas/Kanade meets Horn/Schunck: Combining local and global optic flow methods. Intl. J. of Computer Vision 61, 3, 211--231. Google ScholarDigital Library
- Costeira, J. P., and Kanade, T. 1998. A multibody factorization method for independently moving objects. Intl. J. of Computer Vision 29, 3, 159--179. Google ScholarDigital Library
- Cox, M., and Cox, M. A. A. 2001. Multidimensional Scaling. Chapman and Hall.Google Scholar
- Debevec, P. E., Taylor, C. J., and Malik, J. 1996. Modeling and rendering architecture from photographs. In Proc. ACM Siggraph. Google ScholarDigital Library
- Fleet, D., Jepson, A., and Black, M. 2002. A layered motion representation with occlusion and compact spatial support. In Proc. ECCV, 692--706. Google ScholarDigital Library
- Frey, B. J., Jojic, N., and Kannan, A. 2003. Learning appearance and transparency manifolds of occluded objects in layers. In Proc. CVPR. Google ScholarDigital Library
- Gay-Bellile, V., Bartoli, A., and Sayd, P. 2007. Direct estimation of non-rigid registrations with image-based self-occlusion reasoning. In Proc. ICCV.Google Scholar
- Gu, X., Gortler, S. J., and Hoppe, H. 2002. Geometry images. ACM Trans. Graph. (Proc. of SIGGRAPH), 355--361. Google ScholarDigital Library
- Irani, M., Anandan, P., and Hsu, S. 1995. Mosaic based representations of video sequences and their applications. In Proc. ICCV. Google ScholarDigital Library
- Lempitsky, V., and Ivanov, D. 2007. Seamless mosaicing of image-based texture maps. In Proc. CVPR, 1--6.Google Scholar
- Li, Y., Sun, J., and Shum, H.-Y. 2005. Video object cut and paste. ACM Trans. Graph. (Proc. of SIGGRAPH) 24, 3, 595--600. Google ScholarDigital Library
- Rav-Acha, A., Kohli, P., Rother, C., and Fitzgibbon, A. 2008. Unwrap mosaics. Tech. rep., Microsoft Research. http://research.microsoft.com/unwrap.Google Scholar
- Sand, P., and Teller, S. J. 2006. Particle video: Longrange motion estimation using point trajectories. In Proc. CVPR, 2195--2202. Google ScholarDigital Library
- Seetzen, H., Heidrich, W., Stuerzlinger, W., Ward, G., Whitehead, L., Trentacoste, M., Ghosh, A., and Vorozcovs, A. 2004. High dynamic range display systems. ACM Trans. Graph. (Proc. of SIGGRAPH) 23, 3, 760--768. Google ScholarDigital Library
- Seitz, S. M., Curless, B., Diebel, J., Scharstein, D., and Szeliski, R. 2006. A comparison and evaluation of multi-view stereo reconstruction algorithms. In Proc. CVPR, vol. 1, 519--526. Google ScholarDigital Library
- Seymour, M. 2006. Art of optical flow. fxguide.com: Feature Stories (Dec.).Google Scholar
- Shade, J. W., Gortler, S. J., He, L.-W., and Szeliski, R. 1998. Layered depth images. In Proc. ACM Siggraph, 231--242. Google ScholarDigital Library
- Shi, J., and Malik, J. 1997. Normalized cuts and image segmentation. In Proc. CVPR, 731--743. Google ScholarDigital Library
- Thormählen, T., and Broszio, H., 2008. Voodoo Camera Tracker: A tool for the integration of virtual and real scenes. http://www.digilab.uni-hannover.de/docs/manual.html.Google Scholar
- Toklu, C., Erdem, A. T., and Tekalp, A. M. 2000. Two-dimensional mesh-based mosaic representation for manipulation of video objects with occlusion. IEEE Trans. Image Proc. 9, 9, 1617--1630. Google ScholarDigital Library
- Torresani, L., Hertzmann, A., and Bregler, C. 2008. Non-rigid structure-from-motion: Estimating shape and motion with hierarchical priors. IEEE Trans. PAMI, (to appear). Google ScholarDigital Library
- Turk, G., and Levoy, M. 1994. Zippered polygon meshes from range images. In Proc. ACM Siggraph, 311--318. Google ScholarDigital Library
- van den Hengel, A., Dick, A., Thormählen, T., Ward, B., and Torr, P. H. S. 2007. VideoTrace: Rapid interactive scene modelling from video. ACM Trans. Graph. (Proc. of SIGGRAPH). Google ScholarDigital Library
- Wang, J. Y. A., and Adelson, E. H. 1994. Representing moving images with layers. IEEE Trans. Image Proc. 3, 5, 625--638.Google ScholarDigital Library
- Woodford, O. J., Reid, I. D., and Fitzgibbon, A. W. 2007. Efficient new-view synthesis using pairwise dictionary priors. In Proc. CVPR.Google Scholar
- Zhou, K., Wang, X., Tong, Y., Desbrun, M., Guo, B., and Shum, H.-Y. 2005. Texture-Montage: Seamless texturing of surfaces from multiple images. ACM Trans. Graph. (Proc. of SIGGRAPH), 1148--1155. Google ScholarDigital Library
- Zigelman, G., Kimmel, R., and Kiryati, N. 2002. Texture mapping using surface flattening via multi-dimensional scaling. IEEE Trans. on Visualization and Computer Graphics 8, 2, 198--207. Google ScholarDigital Library
Index Terms
- Unwrap mosaics: a new representation for video editing
Recommendations
Generalized Parallel-Perspective Stereo Mosaics from Airborne Video
Abstract--In this paper, we present a new method for automatically and efficiently generating stereoscopic mosaics by seamless registration of images collected by a video camera mounted on an airborne platform. Using a parallel-perspective ...
Panoramic Depth Imaging: Single Standard Camera Approach
In this paper we present a panoramic depth imaging system. The system is mosaic-based which means that we use a single rotating camera and assemble the captured images in a mosaic. Due to a setoff of the camera's optical center from the rotational ...
Model-based 2D&3D dominant motion estimation for mosaicing and video representation
ICCV '95: Proceedings of the Fifth International Conference on Computer VisionIt is fairly common in video sequences that a mostly fixed background (scene) is imaged with or without objects. The dominant background changes in the image plane mostly due to camera operations and motion (zoom, pan, tilt, track etc.). We address the ...
Comments