A fuzzy logic approach for detection of video shot boundaries
Introduction
Multimedia applications have expanded enormously over the past decade, and a hierarchical structure is needed to manage video content. With this demand for a hierarchical model of videos, many researchers are moving into the area and developing algorithms for visual content interpretation, analysis, and management, of which content-based video retrieval and content-based video coding [1] are the most representative examples.
To build a system that manages the structure of videos, it is widely recognised that the first step is the automatic detection of shot cuts, or shot boundaries, to divide the video sequence into manageable sections in which the visual content remains consistent in terms of camera operations and other visual events. On the media production side, meaningful stories and scenes are generated through sequences of video editing, which results in a range of different shot boundaries, including abrupt shot boundaries, dissolves, fade ins, and fade outs. These shot boundaries can be grouped into two categories according to the duration of the change. The first is the abrupt shot cut, in which an instantaneous change happens from one shot to another. This type of shot boundary is primarily used by editors to cut scenes into consistent sections. The second is the gradual shot boundary, which includes fade in, fade out and dissolve, because the shot changes gradually. A fade is defined as a shot appearing or disappearing over several frames. A dissolve is made when one shot fades in whilst another shot fades out. Samples of these shot boundaries are illustrated in Fig. 1.
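As a rough illustration of these definitions, a gradual transition can be modelled as a linear blend between two shots. The sketch below is an assumption made for illustration only, not an editing model taken from this paper: frames are flat lists of grey levels, and `cross_fade` is a hypothetical helper. A fade out is the special case where the incoming shot is black, and a fade in the case where the outgoing shot is black.

```python
def cross_fade(frame_a, frame_b, num_frames):
    """Return num_frames intermediate frames blending frame_a into frame_b.

    Frames are flat lists of grey levels (an illustrative simplification).
    """
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    frames = []
    for k in range(1, num_frames + 1):
        # Weight of the incoming shot, strictly between 0 and 1.
        alpha = k / (num_frames + 1)
        frames.append([(1 - alpha) * a + alpha * b
                       for a, b in zip(frame_a, frame_b)])
    return frames


shot_a = [200.0, 200.0, 200.0, 200.0]  # last frame of the outgoing shot
shot_b = [40.0, 80.0, 120.0, 160.0]    # first frame of the incoming shot
black = [0.0] * 4

dissolve = cross_fade(shot_a, shot_b, 3)  # shot_a fades out while shot_b fades in
fade_out = cross_fade(shot_a, black, 3)   # shot_a disappears over several frames
fade_in = cross_fade(black, shot_b, 3)    # shot_b appears over several frames
```

Under this model the frame-to-frame difference inside a gradual transition is small and spread over several frames, which is why gradual boundaries are harder to detect than abrupt cuts.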
Corresponding to the shot boundaries illustrated in Fig. 1, many algorithms have been proposed and reported in the published literature [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12]. Pixel-based difference analysis remains one of the most widely adopted approaches for measuring the dissimilarity across shot boundaries. Exploitation of temporal information, such as techniques based on motion estimation and compensation [13], represents another major direction for shot cut detection, in which the large difference value caused by global motion can be easily detected and extracted by a simple block matching procedure. In Ref. [2], shot cut boundaries are detected by calculating the normalised correlation among blocks and locating the maximum correlation coefficient in the frequency domain. Alongside motion information, spatial information such as edges is often regarded as another important feature for shot cut detection. The edge tracking algorithm reported in [8] stands out as a representative example of this direction of research, where the proposed techniques are mainly based on the principle that a shot cut can be declared when most of the edge information is lost between consecutive frames. In Ref. [4], the edge pixel count, obtained with a Sobel edge detector, is proposed as a feature for shot cut detection. The most widely used feature, however, is the colour histogram, examples of which include histogram intersection, "twin-comparison", and local histograms [3], [4], [5], [6]. These histogram features have a drawback: when the global motion is fast or the contrast of the frames is low, colour histogram based techniques miss many shot boundaries, and thus their recall becomes poor. In Ref. [7], a visual rhythm-based technique is reported, which produced good experimental results in shot cut detection. Due to the complexity of the algorithm design, however, this technique can only be used for off-line shot cut detection.
In contrast to most existing approaches, we propose a fuzzy logic approach in this paper to exploit the feature-based techniques and detect shot boundaries. The advantages of our contribution can be highlighted as follows: (i) a range of features can be integrated by fuzzy logic operations to exploit their individual strengths collectively; and (ii) while directly thresholding features remains sensitive to noise, selecting the threshold in the fuzzy domain provides a buffered operation and thus makes the detection more reliable. The rest of this paper is organised as follows. Section 2 describes the detailed design of the proposed algorithm. Section 3 reports experimental results in comparison with three existing algorithms, and Section 4 provides conclusions.
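The idea of integrating features in the fuzzy domain and thresholding there can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm designed in Section 2: only two features are used (histogram intersection and mean pixel difference), the ramp membership function, its breakpoints, the fuzzy AND (minimum) rule, and all function names are hypothetical choices made for this example.

```python
def histogram(frame, bins=8, max_level=256):
    """Normalised grey-level histogram of a frame given as a flat list of ints."""
    counts = [0] * bins
    for p in frame:
        counts[min(p * bins // max_level, bins - 1)] += 1
    return [c / len(frame) for c in counts]


def histogram_difference(frame_a, frame_b):
    """1 - histogram intersection: 0 for identical histograms, 1 for disjoint ones."""
    ha, hb = histogram(frame_a), histogram(frame_b)
    return 1.0 - sum(min(x, y) for x, y in zip(ha, hb))


def pixel_difference(frame_a, frame_b, max_level=255):
    """Mean absolute pixel difference, scaled to [0, 1]."""
    total = sum(abs(a - b) for a, b in zip(frame_a, frame_b))
    return total / (len(frame_a) * max_level)


def ramp(x, low, high):
    """Membership in 'large change': 0 below low, 1 above high, linear between."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def cut_membership(frame_a, frame_b):
    """Fuzzy AND (minimum) of the two 'large change' memberships."""
    # Breakpoints are illustrative guesses, not values from the paper.
    mu_hist = ramp(histogram_difference(frame_a, frame_b), 0.2, 0.6)
    mu_pixel = ramp(pixel_difference(frame_a, frame_b), 0.05, 0.3)
    return min(mu_hist, mu_pixel)


def detect_cuts(frames, threshold=0.5):
    """Indices i where a cut is declared between frames[i - 1] and frames[i]."""
    return [i for i in range(1, len(frames))
            if cut_membership(frames[i - 1], frames[i]) >= threshold]


dark = [20, 25, 30, 35] * 4
dark_noisy = [22, 24, 31, 33] * 4   # same shot with small noise
bright = [220, 225, 230, 235] * 4   # a different shot
video = [dark, dark_noisy, bright, bright]
cuts = detect_cuts(video)
```

Because each feature is mapped to a membership value before the decision is made, small noisy differences fall in the flat part of the ramp and contribute nothing, which is one way to read the "buffered operation" mentioned above. Replacing `min` with `max` would give a fuzzy OR, which declares a cut when either feature is confident.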
Section snippets
Feature extraction and mode selector design
To detect relatively abrupt scene changes at the boundaries between shots, we propose to combine a range of representative features to construct a hybrid feature for shot cut detection. By representative, we mean those features that: (i) are widely used in existing shot cut detection algorithms; (ii) are relatively mature, in that substantial evaluations have been reported and supportive results obtained in the literature; and (iii) whose implementation does not incur intensive computing cost and their
Experimental result
To evaluate the proposed temporal segmentation algorithms, we need to address two important issues in preparing the experiment design. The first issue is the test data set, and the second is the selection of a benchmark out of existing work in relevant areas. To ensure that the proposed algorithm can be evaluated by any researcher in comparison with any other algorithm developed in the future, we select two open test data sets, one of which is publicly available via Internet download
Conclusions
In this paper, we proposed a fuzzy logic approach for temporal segmentation of videos, in which a number of features are integrated to improve the performance of shot cut detection. These features include colour histogram intersection, motion compensation, texture change, and edge variances. Experimental results, benchmarked against three existing algorithms and measured by precision and recall rates, support the effectiveness of the proposed algorithm in video segmentation.
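For reference, precision is conventionally the fraction of detected boundaries that are correct, and recall the fraction of true boundaries that are detected. The sketch below shows one way to compute both for shot boundary detection; the frame tolerance, the greedy matching, the function name, and the sample numbers are assumptions for illustration and are not taken from the experiments in this paper.

```python
def precision_recall(detected, ground_truth, tolerance=0):
    """Precision and recall of detected boundary frame indices.

    A detection counts as correct if it lies within `tolerance` frames of a
    ground-truth boundary that has not already been matched (greedy matching).
    """
    unmatched = sorted(ground_truth)
    correct = 0
    for d in sorted(detected):
        match = next((g for g in unmatched if abs(g - d) <= tolerance), None)
        if match is not None:
            unmatched.remove(match)
            correct += 1
    precision = correct / len(detected) if detected else 0.0
    recall = correct / len(ground_truth) if ground_truth else 0.0
    return precision, recall


truth = [120, 480, 910, 1300]          # hypothetical annotated boundaries
found = [121, 480, 700, 1300, 1500]    # hypothetical detector output
precision, recall = precision_recall(found, truth, tolerance=2)
```

In this made-up example three of the five detections match a true boundary, so precision is 3/5 and recall is 3/4.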
Acknowledgment
The authors wish to acknowledge the financial support from the EU IST Framework-6 programme under the IP Project: Live staging of media events (IST-04-027312).
References (19)
- et al., Video segmentation using a histogram-based fuzzy c-means clustering algorithm, Comput. Standards & Interfaces (2001)
- et al., A fuzzy theoretic approach for video segmentation using syntactic features, Pattern Recognition Lett. (2001)
- et al., Video segmentation based on 2D image analysis, Pattern Recognition Lett. (2003)
- et al., A new cut detection algorithm with constant false-alarm ratio for video segmentation, J. Vis. Commun. Image R. (2004)
- et al., Video extraction for fast content access to MPEG compressed videos, IEEE Trans. Circuits, Systems Video Technol. (2004)
- et al., Temporal video segmentation and classification of edit effects, Image Vision Comput. (2003)
- Shot-boundary detection: unraveled and resolved?, IEEE Trans. Circuits Syst. Video Technol. (2002)
- et al., A robust scene-change detection method for video segmentation, IEEE Trans. Circuits Syst. Video Technol. (2001)
- A. Whitehead, P. Bose, R. Laganiere, Feature based cut detection with automatic threshold selection, Proceedings,...
Cited by (71)
Exploring global diverse attention via pairwise temporal relation for video summarization
2021, Pattern Recognition
Citation Excerpt: Finally, we conclude this paper. Video summarization [15] has been a long-standing problem in multimedia analysis with great practical potential, and lots of relevant works have been explored in recent years. Traditional approaches can be generally divided into two categories, i.e., unsupervised learning and supervised learning.
A vague set approach for identifying shot transition in videos using multiple feature amalgamation
2019, Applied Soft Computing Journal
Citation Excerpt: For precise detection of shot boundaries, fuzzy logic, artificial neural networks, genetic algorithms, support vector machines, rough sets etc. have been applied in several works. Fang et al. [116] adopted an approach based on fuzzy logic to integrate several features extracted from the frames. A mode selector is used to combine several algorithms in order to detect abrupt and gradual transitions.
Video shot-boundary detection: issues, challenges and solutions
2024, Artificial Intelligence Review
WOA-FNN: An innovative hybrid optimization technique for effective detection of shot boundaries
2023, Conference Proceedings - 2023 IEEE Silchar Subsection Conference, SILCON 2023
Adaptive Multiview Graph Difference Analysis for Video Summarization
2022, IEEE Transactions on Circuits and Systems for Video Technology