A fuzzy theoretic approach for video segmentation using syntactic features

doi:10.1016/S0167-8655(01)00041-1

Pattern Recognition Letters

Volume 22, Issue 13, November 2001, Pages 1359-1369

https://doi.org/10.1016/S0167-8655(01)00041-1

Abstract

This paper is concerned with the development of a fuzzy-logic-based framework for segmentation of video sequences. We have proposed a scheme for fuzzification of the frame-to-frame property difference values using the Rayleigh distribution. The difference values have been characterized by fuzzy terms like small, significant, large, etc. These terms have been used to design fuzzy rules for detecting abrupt changes and gradual changes. Fuzzy rules have provided a mechanism for integrating evidence based on different properties. The decompositional inference strategy has been used for fuzzy reasoning over the set of fuzzy rules. Gradual changes have been further classified as fade-in, fade-out and others (including dissolves, wipes, etc.). Experimental results have shown that the proposed scheme can detect changes reliably.

Introduction

Partitioning of the video sequence by detecting scene changes is essential for indexing, parsing, characterization and categorization of video. Basically, there are two types of scene changes: abrupt change and gradual change. Camera breaks or abrupt changes are apparently easy to detect because the difference in image properties between consecutive frames is expected to be large. However, the amount of change can vary from one sequence to another depending upon the content of video and context of the scene change. Detection of gradual transition is more difficult because the change takes place over a period of frames. In this paper, we propose a fuzzy theoretic framework for video segmentation. The fuzzy theoretic scheme provides a mechanism for characterizing scene transitions through soft decisions. Subjectivity involved in the video partitioning process can be adequately captured by this framework.

The problem of video segmentation has received considerable attention in recent times. Zhang et al. (1993) propose segmentation by thresholding differences of color histograms. They also deal with special effects, such as dissolving sequences, by means of a double threshold technique. Jain et al. (1995) present a model-based method, most suited to well-established video production processes. In particular, video segmentation is formulated as a production-model-based classification problem, and segmentation error measures are also defined. Aigrain et al. (1997) propose a rule-based system for video segmentation, which makes use of temporally located information present in video frames. These rules allow for the identification of more macroscopic changes consistent with editing practices. In the work by Ardizzonei and Cascia (1996), a technique employing a neural network is used. Essentially, to detect a scene cut, two successive frames are preprocessed and a pixel-by-pixel difference is computed. This difference is then processed by a multi-layer perceptron that determines whether the frames belong to the same shot. Color information present in input sequences is discarded and only luminance is used for the cut detection. Zabih et al. (1995) suggest a feature-based segmentation technique for classifying scene breaks. This approach can be used for detecting dissolves as a sequence of fade-out and fade-in effects. Xiong and Lee (1998) use a step-variable-based algorithm for scene change detection. MPEG-encoded video data can be segmented directly using the technique suggested by Arman et al. (1993). Here the DCT coefficients are analyzed to find frames where camera breaks take place.

Most of these video segmentation techniques characterize scene changes using static thresholds. Due to the time-varying and subjective nature of video data, these thresholds are not expected to work reliably under all possible conditions. Also, these approaches have used different image-based attributes for measuring the change. However, a particular property may not always be adequate for identifying the same type of transition in different video sequences. In order to overcome these difficulties, we have proposed a fuzzy theoretic framework.

The benefits of a fuzzy framework in characterizing video data are twofold. Firstly, ambiguities and uncertainties involved in the selection of thresholds are taken care of by fuzzy rules. Secondly, because all inputs are represented as fuzzy sets, the complicated process of combination of various feature differences is simplified to straightforward application of standard fuzzy implications and compositions over these fuzzy relations.

In this paper we have presented a general scheme for fuzzifying frame-to-frame property differences. We have also indicated the patterns of fuzzy rules required for classifying scene transitions using multiple features. The actual scheme implemented in this paper is hierarchical in nature. Various syntactic features have been fused to detect shot boundaries reliably. An abrupt change is detected using histogram intersection, a gradual change is detected using a combination of pixel difference and histogram intersection, and a fade is detected using a combination of pixel difference, histogram intersection and edge-pixel-count. Other syntactic features, such as motion (Cherfaoui and Bertin, 1995) and moments difference (Arman et al., 1994), have also been tested. The motion feature has not been found to be very effective for shot detection, as reported by Boreczky and Rowe (1996): camera motion produces a large number of false positives. However, we have found that optic-flow-based features are useful for detecting gradual transitions. Experimental results show that the proposed set of features produces reliable results.
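The three features named above can be illustrated with a minimal Python sketch. It is not the authors' implementation (which, per the experimental section, was written in C using XIL): the function names, the 16-bin grayscale histogram, the normalization to [0, 1] and the gradient threshold used to count edge pixels are all assumptions chosen for illustration.

```python
from typing import List, Sequence

# A grayscale frame: rows of integer intensities in [0, 255] (assumed format).
Frame = Sequence[Sequence[int]]


def histogram(frame: Frame, bins: int = 16) -> List[float]:
    """Normalized intensity histogram with `bins` equal-width bins."""
    counts = [0] * bins
    total = 0
    for row in frame:
        for value in row:
            counts[min(value * bins // 256, bins - 1)] += 1
            total += 1
    return [c / total for c in counts]


def histogram_intersection(a: Frame, b: Frame, bins: int = 16) -> float:
    """Sum of bin-wise minima: 1.0 for identical histograms, 0.0 for disjoint ones."""
    return sum(min(x, y) for x, y in zip(histogram(a, bins), histogram(b, bins)))


def pixel_difference(a: Frame, b: Frame) -> float:
    """Mean absolute intensity difference between two frames, scaled to [0, 1]."""
    total = 0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / (255 * count)


def edge_pixel_count(frame: Frame, threshold: int = 32) -> int:
    """Count pixels whose forward difference (horizontal or vertical) exceeds
    `threshold`. This simple gradient test is a stand-in for a real edge detector."""
    count = 0
    for y in range(len(frame) - 1):
        for x in range(len(frame[y]) - 1):
            gx = abs(frame[y][x + 1] - frame[y][x])
            gy = abs(frame[y + 1][x] - frame[y][x])
            if max(gx, gy) > threshold:
                count += 1
    return count
```

A low histogram intersection between consecutive frames points to a large change in intensity distribution, while the pixel difference also reacts to spatial rearrangement. Tracking how the edge-pixel count rises or falls over a run of frames is one plausible way to tell a fade-in from a fade-out, although the snippet above does not say how the paper uses that feature.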

The rest of the paper is organized as follows. We briefly discuss various syntactic features used for segmentation in Section 2. In Section 3, we have presented the fuzzy theoretic framework for characterization of the scene transitions. Experimental results are reported in Section 4. Finally, Section 5 presents the conclusions.

Section snippets

Feature extraction

Selection of an appropriate feature for segmenting video data is the most critical issue. Several such features have been suggested in the literature (Zhang et al., 1993; Tanaka and Nagasaka, 1992), like histogram difference, pixel difference, optical flow, etc. All these measures have their own advantages and disadvantages, but none is general enough to account for all types of changes in the video data. Thus instead of using a single feature, we have experimented with various combinations of

Fuzzy characterization of scene transitions

In this section we present a fuzzy theoretic framework for segmentation of video sequences. The framework is characterized by appropriate formulation of relevant fuzzy variables, fuzzy rules and inference methods.
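As a rough illustration of how such a framework might be assembled, the Python sketch below fuzzifies a frame-to-frame difference value with a Rayleigh distribution and evaluates two rule patterns. Everything specific here is an assumption: the snippet does not give the paper's membership functions or rule base, so the memberships are derived from the Rayleigh CDF in a hypothetical way, the rules merely mirror the feature pairing described in the introduction, and a plain min conjunction stands in for the decompositional inference strategy the paper actually uses.

```python
import math
from typing import Dict, Sequence


def rayleigh_scale(differences: Sequence[float]) -> float:
    """Maximum-likelihood Rayleigh scale: sigma^2 = sum(x^2) / (2 n)."""
    if not differences:
        raise ValueError("need at least one difference value")
    sigma = math.sqrt(sum(d * d for d in differences) / (2 * len(differences)))
    if sigma == 0.0:
        raise ValueError("all difference values are zero")
    return sigma


def rayleigh_cdf(x: float, sigma: float) -> float:
    """Rayleigh cumulative distribution function for x >= 0."""
    return 1.0 - math.exp(-(x * x) / (2.0 * sigma * sigma))


def fuzzify(x: float, sigma: float) -> Dict[str, float]:
    """Hypothetical memberships for the terms small, significant and large.

    'large' grows with the CDF, 'small' is its complement, and 'significant'
    is a triangle peaking at the median of the distribution.
    """
    c = rayleigh_cdf(x, sigma)
    return {
        "small": 1.0 - c,
        "significant": 1.0 - abs(2.0 * c - 1.0),
        "large": c,
    }


def abrupt_degree(hist_change: Dict[str, float]) -> float:
    """Illustrative rule: IF histogram change is large THEN abrupt change."""
    return hist_change["large"]


def gradual_degree(hist_change: Dict[str, float],
                   pixel_change: Dict[str, float]) -> float:
    """Illustrative rule: IF histogram change is significant AND pixel change
    is significant THEN gradual change (AND taken as min)."""
    return min(hist_change["significant"], pixel_change["significant"])
```

Here the histogram change would be a dissimilarity such as one minus the histogram intersection. Estimating the scale parameter from the difference values of each sequence, instead of fixing a threshold in advance, lets the meaning of "large" adapt to the content, which is the motivation the introduction gives for preferring soft decisions over static thresholds.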

Experimental results

The proposed scheme has been successfully implemented on a Sun SPARC Ultra-1 workstation. The implementation is done in C using XIL. We have tested the proposed scheme on a number of video sequences, which include sports video, news video, documentaries, movies, etc. The corpus used to test the segmentation scheme is heterogeneous in terms of effect types. It includes some of the common sources of segmentation error, such as sequences with fast motion, high-illumination change,

Conclusions

In this paper we have presented a fuzzy-logic-based video segmentation technique. This scheme takes care of subjectivity of video data while detecting shot breaks. Fuzziness involved in categorizing video data is explored for segmentation purposes. Frame-to-frame difference values are fuzzified to obtain linguistic variables. These variables appropriately quantify the amount of change occurring between consecutive frames. These variables are then used for developing fuzzy rules to detect shot

References (16)

Xiong, W., et al., 1998. Efficient scene change detection and camera motion annotation for video classification. Comput. Vision Image Understanding.
Zadeh, L., 1965. Fuzzy sets. Inform. Control.
Aigrain, P., Joly, P., Longueville, V., 1997. Medium knowledge-based macro-segmentation of video into sequences. In:...
Ardizzonei, E., Cascia, M.L., 1996. Automatic video database indexing and retrieval. In: Multimedia Tools and...
Arman, F., Depommier, R., Hsu, A., Chiu, M., 1994. Content-based browsing of video sequences. In: Proc. ACM Multimedia,...
Arman, F., Hsu, A., Chiu, M., 1993. Image processing on compressed data for large video databases. In: Proc. 1st ACM...
Boreczky, J., Rowe, L., 1996. Comparison of video shot boundary detection techniques. In: Proc. IS&T/SPIE Conf. on...
Cherfaoui, M., Bertin, C., 1995. Temporal segmentation of videos: A new approach. In: Proc. IS&T/SPIE Conf. on Digital...

There are more references available in the full text version of this article.

Cited by (43)

A vague set approach for identifying shot transition in videos using multiple feature amalgamation
2019, Applied Soft Computing Journal
Citation Excerpt:
Chakraborty et al. [118] used fuzzy correlation measure in order to detect hard cuts in videos. In a work by Jadon et al. [119], the authors rely on a fuzzification method for frame dissimilarity based on Rayleigh distribution. Fuzzy terms are used to denote the amount of changes and fuzzy rules are built for the purpose.
Shot boundary detection (SBD) is the preliminary and most significant step in Content Based Video Retrieval (CBVR). As such the effectiveness of a CBVR system depends heavily on reliable detection of shot boundaries. In this work, a simple yet effective technique for amalgamating several distance features extracted from video frames has been proposed. The aim here is to develop a technique which is able to produce a better distance feature from the existing ones by hybridizing several distance metrics. In the proposed model, any number of distance features can be incorporated and fused together. The resultant feature is not only more robust but also immune to features which are inefficient. Robustness of the proposed method is tested by combining several low performing features with the more efficient ones. Several statistical amalgamation functions are also tested for determining the most efficient one in terms of F1 score. The power of vague sets has been harnessed to detect the shot boundaries effectively using the resultant distance feature. The proposed method is proved to be effective by means of the results obtained, which show that multiple feature amalgamation can lead to a hybrid distance feature which performs better than the best feature incorporated for SBD. The proposed technique is analyzed using ANOVA. A comparison with the other existing methods portray the efficacy of the proposed approach. This method can also be applied for other research problems where several features are to be fused together for producing superior results than the ones obtained by individual methods.
Fuzzy color histogram-based video segmentation
2010, Computer Vision and Image Understanding
Citation Excerpt:
Fang et al. [16] propose a fuzzy logic approach for temporal segmentation of videos, where color histogram intersection, motion compensation, texture change and edge variances are integrated for cut detection. In [17], histogram differences of consecutive frames are characterized as fuzzy terms, such as small, significant and large, and fuzzy rules for detecting abrupt and gradual transitions are formulated in a fuzzy-logic-based framework for segmentation of video sequences. Das et al. [18] define a unified interval type-2 fuzzy rule based model using fuzzy histogram and fuzzy co-occurrence matrix to detect cuts and various types of gradual transitions.
We present a fuzzy color histogram-based shot-boundary detection algorithm specialized for content-based copy detection applications. The proposed method aims to detect both cuts and gradual transitions (fade, dissolve) effectively in videos where heavy transformations (such as cam-cording, insertions of patterns, strong re-encoding) occur. Along with the color histogram generated with the fuzzy linking method on L*a*b* color space, the system extracts a mask for still regions and the window of picture-in-picture transformation for each detected shot, which will be useful in a content-based copy detection system. Experimental results show that our method effectively detects shot boundaries and reduces false alarms as compared to the state-of-the-art shot-boundary detection algorithms.
Video shot-boundary detection: issues, challenges and solutions
2024, Artificial Intelligence Review
WOA-FNN: An innovative hybrid optimization technique for effective detection of shot boundaries
2023, Conference Proceedings - 2023 IEEE Silchar Subsection Conference, SILCON 2023
Estimating Latent Linear Correlations from Fuzzy Frequency Tables
2022, Communications in Mathematics and Statistics
ALO-SBD: A Hybrid Shot Boundary Detection Technique for Video Surveillance System
2022, Lecture Notes in Electrical Engineering
