Paper
Extracting movie scenes based on multimodal information
Ying Li and C.-C. Jay Kuo
19 December 2001
Proceedings Volume 4676, Storage and Retrieval for Media Databases 2002; https://doi.org/10.1117/12.451109
Event: Electronic Imaging 2002, San Jose, California, United States
Abstract
This research addresses the problem of automatically extracting semantic video scenes from everyday movies using multimodal information. A three-stage scene detection scheme is proposed. In the first stage, purely visual information is used to extract a coarse-level scene structure based on generated shot sinks. In the second stage, the audio cue is integrated to further refine the scene detection results by considering various kinds of audio scenarios. Finally, in the third stage, users can interact directly with the system to fine-tune the detection results to their satisfaction. The generated scene structure provides a compact yet meaningful abstraction of the video data, which facilitates content access. Preliminary experiments on integrating multiple media cues for movie scene extraction have yielded encouraging results.
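To make the three-stage scheme above concrete, the sketch below outlines one possible realization of the pipeline in Python. Every name, data structure, and threshold here (Shot, Scene, visual_similarity, the 0.7 grouping threshold, the audio labels, the boundary-merge interface) is an illustrative assumption for this sketch, not the authors' implementation; in particular, the paper's shot-sink construction and audio-scenario analysis are more elaborate than the simple adjacency grouping shown.

```python
# Illustrative sketch of a 3-stage visual/audio/user scene-detection pipeline.
# All names and thresholds are assumptions, not the authors' actual method.

from dataclasses import dataclass, field


@dataclass
class Shot:
    start_frame: int
    end_frame: int
    visual_feature: list[float]      # e.g. a normalized color histogram (assumed)
    audio_label: str = "unknown"     # e.g. "speech", "music", "silence" (assumed)


@dataclass
class Scene:
    shots: list[Shot] = field(default_factory=list)


def visual_similarity(a: Shot, b: Shot) -> float:
    """Histogram-intersection-style similarity in [0, 1] (illustrative)."""
    return sum(min(x, y) for x, y in zip(a.visual_feature, b.visual_feature))


def stage1_visual_grouping(shots: list[Shot], threshold: float = 0.7) -> list[Scene]:
    """Stage 1: group visually similar, temporally adjacent shots into coarse
    scenes -- a simplified stand-in for the paper's shot-sink construction."""
    scenes: list[Scene] = []
    for shot in shots:
        if scenes and visual_similarity(scenes[-1].shots[-1], shot) >= threshold:
            scenes[-1].shots.append(shot)
        else:
            scenes.append(Scene(shots=[shot]))
    return scenes


def stage2_audio_refinement(scenes: list[Scene]) -> list[Scene]:
    """Stage 2: merge neighboring coarse scenes whose audio scenario is
    continuous across the boundary (e.g. uninterrupted speech or music)."""
    refined: list[Scene] = []
    for scene in scenes:
        if (refined
                and scene.shots[0].audio_label != "unknown"
                and refined[-1].shots[-1].audio_label == scene.shots[0].audio_label):
            refined[-1].shots.extend(scene.shots)
        else:
            refined.append(scene)
    return refined


def stage3_user_adjustment(scenes: list[Scene], merges: list[int]) -> list[Scene]:
    """Stage 3: apply user feedback; here modeled as a list of scene-boundary
    indices the user asks to merge (a simplification of interactive tuning)."""
    result: list[Scene] = []
    for i, scene in enumerate(scenes):
        if result and (i - 1) in merges:
            result[-1].shots.extend(scene.shots)
        else:
            result.append(scene)
    return result
```

Under these assumptions, a caller would run stage1_visual_grouping over the detected shots, pass the coarse scenes to stage2_audio_refinement, and finally apply stage3_user_adjustment with whatever boundary merges the user requests interactively.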
© (2001) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Ying Li and C.-C. Jay Kuo "Extracting movie scenes based on multimodal information", Proc. SPIE 4676, Storage and Retrieval for Media Databases 2002, (19 December 2001); https://doi.org/10.1117/12.451109
CITATIONS
Cited by 2 scholarly publications.
KEYWORDS
Visualization, Video, Information visualization, Semantic video, Feature extraction, Signal detection, Sensors