Video Scenes Segmentation Based on Multimodal Genre Prediction

https://doi.org/10.1016/j.procs.2020.08.002
Under a Creative Commons license
open access

Abstract

Current technologies' understanding of video content remains limited because of the complexity and length of videos. Segmenting videos into small coherent units, however, facilitates indexing and search. Subjectivity remains the essential constraint in segmenting videos, whereas the genre (drama, action, ...) does not raise any such conflict. In this paper, we present a new approach to segmenting video into scenes based on genre prediction. The video is first divided into shots of equal duration. We use an architecture based on audio-visual deep features extracted from trained neural networks for genre prediction, and we introduce a transition detection method based on computing the similarity between the genres of shots. The originality of this method lies in using the high-level semantic relationship between successive shots for transition detection. We achieve good performance on videos of varied genres. We evaluate our method on the RAI and BBC datasets through a comparison with other state-of-the-art approaches.
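The pipeline described above (equal-duration shots, a genre prediction per shot, then a similarity test between successive shots) can be illustrated with a minimal sketch. The abstract does not specify the similarity measure or the decision rule, so the cosine similarity, the threshold value, and the function names `split_into_shots`, `cosine_similarity`, and `detect_transitions` are assumptions made for illustration only. The per-shot genre probabilities are taken as given here; in the paper they come from the audio-visual genre prediction architecture.

```python
import math


def split_into_shots(duration_s, shot_len_s):
    """Return (start, end) times of equal-duration shots covering the video.

    The last shot is shortened if the duration is not a multiple of shot_len_s.
    """
    n_shots = math.ceil(duration_s / shot_len_s)
    return [
        (i * shot_len_s, min((i + 1) * shot_len_s, duration_s))
        for i in range(n_shots)
    ]


def cosine_similarity(p, q):
    """Cosine similarity between two genre score vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def detect_transitions(genre_probs, threshold=0.5):
    """Return indices i where a scene boundary is placed before shot i.

    A boundary is declared when the genre vectors of shots i-1 and i are
    less similar than `threshold` (a hypothetical value, not from the paper).
    """
    return [
        i
        for i in range(1, len(genre_probs))
        if cosine_similarity(genre_probs[i - 1], genre_probs[i]) < threshold
    ]


if __name__ == "__main__":
    # Hypothetical genre scores (e.g. [drama, action]) for four shots.
    probs = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
    print(split_into_shots(10, 4))
    print(detect_transitions(probs))
```

With these toy scores, the first two shots and the last two shots have nearly identical genre vectors, so the only boundary falls between the second and third shots.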

Keywords

Segmentation
Transition Detection
Multimodal
Deep Features
Genre
