Abstract
The explosive growth of video data demands video presentation techniques that support fast browsing of video content. In this paper, we present an automatic procedure for constructing a compact synthesized collage from a video sequence. The synthesized image, called “Video Collage”, is a kind of static video summary: the procedure selects the most representative images from the video, extracts salient regions of interest (ROIs) from these images, and seamlessly arranges the ROIs on a given canvas while preserving the temporal structure of the video content. We formulate the generation of a Video Collage as a unified energy minimization problem in which each of these desired properties is represented by an energy term. We start from the basic setting, in which both the ROIs and the collage are fixed to be rectangular, and then show how the approach supports arbitrarily shaped ROIs as well as a variety of collage templates and ROI arrangement layouts (i.e., book, diagonal, and spiral). Experiments show that Video Collage presents a video in a very compact and visually appealing form while preserving the information necessary to understand the video.
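To make the idea of a unified energy with one term per desired property concrete, the following is a minimal sketch and not the formulation used in the paper. It assumes a hypothetical two-term energy (a saliency term and a temporal coverage term), hypothetical weights `w_sal` and `w_cov`, and an exhaustive search in place of whatever optimizer the authors use. The names `collage_energy` and `select_frames` are introduced here for illustration only.

```python
from itertools import combinations


def collage_energy(selection, saliency, w_sal=1.0, w_cov=1.0):
    """Toy energy of a frame selection; lower is better.

    Assumption: two illustrative terms only, not the paper's energy terms.
    """
    n = len(saliency)
    # Saliency term: negated sum, so selecting salient frames lowers the energy.
    e_sal = -sum(saliency[i] for i in selection)
    # Coverage term: mean distance from each frame to its nearest selected
    # frame, so selections spread over the whole sequence score lower.
    e_cov = sum(min(abs(i - j) for j in selection) for i in range(n)) / n
    return w_sal * e_sal + w_cov * e_cov


def select_frames(saliency, k, w_sal=1.0, w_cov=1.0):
    """Pick k frame indices minimizing the toy energy by exhaustive search.

    combinations() yields index tuples in increasing order, so the returned
    selection keeps the temporal order of the frames.
    """
    candidates = combinations(range(len(saliency)), k)
    best = min(candidates, key=lambda s: collage_energy(s, saliency, w_sal, w_cov))
    return list(best)


if __name__ == "__main__":
    # Hypothetical per-frame saliency scores for a six-frame sequence.
    scores = [0.9, 0.1, 0.1, 0.1, 0.8, 0.1]
    print(select_frames(scores, k=2))
```

Exhaustive search is only practical for a handful of frames; it is used here because it keeps the sketch short and makes the minimum unambiguous.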
Cite this article
Mei, T., Yang, B., Yang, SQ. et al. Video collage: presenting a video sequence using a single image. TVC 25, 39–51 (2009). https://doi.org/10.1007/s00371-008-0282-4
DOI: https://doi.org/10.1007/s00371-008-0282-4