Interactive multi-scale structures for summarizing video content

Wang, HongAn; Ma, CuiXia

doi:10.1007/s11432-013-4833-6

Research Paper
Special Focus
Published: 24 May 2013

Science China Information Sciences, Volume 56, pages 1–12 (2013)

HongAn Wang¹ & CuiXia Ma²

Abstract

Efficient video summarization lets users explore video content that matches their intentions with low cognitive demand. In this paper, we present a novel approach that summarizes videos as multi-scale structures, which exhibit different video features at different scale levels and allow the video content to be explored through multi-scale interaction. The approach addresses the semantic relationships between structures and integrates user intention into both summarization and interaction. We first introduce the concept of multi-scale structures for summarizing video content and describe three types of structures that present important features at different scale levels. We then provide a continuous zooming interaction for browsing the multi-scale structures. Finally, a detailed user study shows that user performance in understanding and browsing videos is improved.

Author information

Authors and Affiliations

State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing, 100190, China
HongAn Wang
Beijing Key Laboratory of Human-Computer Interaction, Institute of Software, Chinese Academy of Sciences, Beijing, 100190, China
CuiXia Ma

Corresponding author

Correspondence to CuiXia Ma.

About this article

Cite this article

Wang, H., Ma, C. Interactive multi-scale structures for summarizing video content. Sci. China Inf. Sci. 56, 1–12 (2013). https://doi.org/10.1007/s11432-013-4833-6

Received: 10 September 2012
Accepted: 18 February 2013
Published: 24 May 2013
Issue Date: May 2013
DOI: https://doi.org/10.1007/s11432-013-4833-6
