A model for dynamic object segmentation with kernel density estimation based on gradient features

https://doi.org/10.1016/j.imavis.2008.08.004

Abstract

Dynamic object segmentation in videos captured by a static camera is a basic technique in many vision surveillance applications. To suppress fake objects caused by dynamic cast shadows and reflection images, this paper presents a novel segmentation model that suppresses both. The model is a kernel density estimation model based on dynamic gradient features. Unlike the conventional kernel density estimation model, which can suppress cast shadows only in color videos, this model also suppresses them in intensity videos, and under diffuse illumination it suppresses reflection images effectively. Although the model may increase the false negative rate, its suppression of fake objects is remarkable, and the false negative rate can be reduced with other convenient methods. Experimental results on real videos are presented to demonstrate the effectiveness of the model.

Introduction

Dynamic object segmentation techniques are necessary in many vision surveillance applications. Because the scenes in real videos are complicated, research into high-quality segmentation methods is challenging. In recent years, related research has produced several effective methods in this field. The mixture-of-Gaussians method [6], [7], [8] is the typical example of parametric non-predictive methods, and it can adapt to continuous background changes. However, its adaptability is limited by the fixed number of empirically chosen background types. The segmentation methods based on kernel density estimation [2], [3] are non-parametric non-predictive methods, and they can adapt to more complex background changes than the mixture-of-Gaussians methods, because the number of background types is dynamic and determined by recent samples. Besides the mixture-of-Gaussians and kernel density estimation models, some other probabilistic background models have also been presented [5].

Dynamic cast shadows and reflection images in videos are considered fake objects in most applications, and they seriously deteriorate the segmentation quality of many existing methods. There are mainly two kinds of approaches to cast shadow suppression. One eliminates cast shadow regions using object shape structure [9], and the other classifies cast shadow pixels into the background using pixel color space [10], [14]. Kernel density estimation methods based on pixel color space can suppress dynamic cast shadows effectively [1]. For intensity videos, however, conventional kernel density estimation methods cannot suppress cast shadows because of their inherent requirement for color information, and they cannot suppress reflection images at all. Furthermore, because reflection images are not obviously present in all videos, most existing methods do not address them. To solve these problems, we provide a different kernel density estimation model based on dynamic gradient features.

Four pixel features used in kernel density estimation models will be described in this paper: intensity, color, gradient magnitude, and gradient direction. The last two are referred to jointly as the gradient or gradient features. Three kernel density estimation models will be described and compared. The first is the intensity model based on pixel intensity [1]. It is a basic model that cannot suppress cast shadows or reflection images, but it provides the basic segmentation theory for the other two models. The second is the rgs model based on pixel color and intensity [1]. It combines color into the intensity model and can suppress cast shadows in color videos. The third is the new model presented in this paper, based on pixel gradient and intensity, which we refer to as the xgs model. This model can suppress cast shadows in intensity videos. Furthermore, under the diffuse illumination common in natural scenes, it can suppress part of the reflection images in intensity videos, whereas reflection images are ignored by both the intensity and rgs models.

In fact, kernel density estimation has already been used in edge extraction [4], but in this paper we combine gradient computation and kernel density estimation for dynamic object segmentation. Furthermore, our research on the xgs model differs from known spatial–temporal background modeling works such as [11], [12]. First, in the spatial–temporal background model, the spatial gradient and the temporal gradient are treated as two elements of one integral vector and are not suitable to be separated for analysis [11]. In the xgs model, by contrast, the research focus is the spatial gradient and its change over time. Secondly, the spatial–temporal background model aims to solve problems such as object segmentation with dropped frames, camera motion, and weak assumptions about object appearance, rather than cast shadow and reflection image suppression, which is the main purpose of the xgs model.

We organize the content as follows. In Section 2, we briefly describe the intensity model and rgs model presented in reference [1]. In Section 3, we describe the gradient properties of cast shadow and reflection image suppression and compare probability estimations of different pixel features in different situations. In Section 4, we introduce the xgs model. In Section 5, we present experimental results based on real videos and draw conclusions from our research work.

Section snippets

Intensity model

Let {x₁, x₂, …, x_N} be an intensity sample set of a pixel, where each sample comes from the corresponding frame of a video. Based on this sample set, we can form an estimate of the pixel intensity pdf (probability density function) [1]. Given the intensity observation x_t of this pixel at time t, the probability of this observation is estimated by the following formula [1]:

$$\Pr(x_t) = \frac{1}{N}\sum_{i=1}^{N}\frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-\frac{1}{2}\frac{(x_t - x_i)^2}{\sigma^2}} \qquad (1)$$

The Gaussian in formula (1) is a kernel function. This formula presents the basic idea of
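Formula (1) can be sketched directly in code. The following is a minimal Python illustration of the per-pixel kernel density estimate, where `sigma` is the kernel bandwidth (the paper does not fix a value here, so the numbers below are only illustrative):

```python
import numpy as np

def kde_probability(x_t, samples, sigma):
    """Estimate Pr(x_t) from N intensity samples as in formula (1):
    the average of N Gaussian kernels centred on the samples."""
    samples = np.asarray(samples, dtype=float)
    kernels = np.exp(-0.5 * (x_t - samples) ** 2 / sigma ** 2)
    kernels /= np.sqrt(2.0 * np.pi * sigma ** 2)
    return kernels.mean()

# A pixel whose recent intensities cluster near 100 yields a high probability
# for an observation of 101 and a low one for 180 (a likely foreground pixel).
history = [98, 100, 101, 99, 102, 100]
print(kde_probability(101, history, sigma=3.0) >
      kde_probability(180, history, sigma=3.0))
```

In the segmentation methods built on this estimate [1], a pixel is declared foreground when its probability falls below a threshold.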

Analysis and comparison of dynamic pixel features

In this section, we compare the probability estimations of four pixel features in different situations: the probability estimations of intensity, color vector, gradient magnitude, and gradient direction. The gradient features here are all intensity gradient features, and we adopt the classic definitions of the gradient magnitude and gradient direction at a pixel. The gradient magnitude is represented as |g|. The gradient direction is a two-dimensional unit vector; we represent it
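The classic per-pixel gradient features named above can be sketched as follows. This is an illustrative sketch only: the paper's snippet does not specify the discrete gradient operator, so central differences via `np.gradient` are an assumption here.

```python
import numpy as np

def gradient_features(frame):
    """Per-pixel gradient magnitude |g| and unit direction vector.

    gx, gy are central-difference approximations (an assumption; the
    paper's exact operator is not given in this excerpt). The direction
    is (gx, gy) / |g|, the classic two-dimensional unit vector."""
    frame = np.asarray(frame, dtype=float)
    gy, gx = np.gradient(frame)        # derivatives along rows, columns
    mag = np.hypot(gx, gy)             # |g| = sqrt(gx^2 + gy^2)
    # In flat regions |g| = 0 and the direction is undefined; divide by 1
    # there so the direction vector is simply (0, 0).
    safe = np.where(mag > 0, mag, 1.0)
    direction = np.stack((gx / safe, gy / safe), axis=-1)
    return mag, direction

# A vertical step edge: magnitude is nonzero only near the edge, and the
# direction there is a horizontal unit vector.
img = np.zeros((5, 5))
img[:, 3:] = 10.0
mag, direction = gradient_features(img)
```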

Kernel density estimation model based on dynamic gradient features

In this section, we present a new kernel density estimation model, the xgs model. The probability estimations of intensity, gradient magnitude, and gradient direction are adopted in the xgs model, and we label these three probability estimations Prs, Prg, and Prx, respectively. The functions of the three pixel features in the xgs model are summarized in Table 1. The pixel displacement probability PN and component displacement probability PC based on the intensity sample set, which are defined
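The xgs model's full decision rule lies outside this snippet. Purely as a hypothetical illustration of how the three estimates Prs, Prg, and Prx might enter a per-pixel classification, the sketch below assumes each estimate is compared against its own fixed threshold (consistent with the thresholding described in the experiments) and classifies a pixel as background only when all three agree; this is not the paper's actual rule.

```python
def is_background(pr_s, pr_g, pr_x, th_s, th_g, th_x):
    """Hypothetical sketch, not the xgs model's actual decision rule:
    a pixel is background only if the intensity (Prs), gradient-magnitude
    (Prg), and gradient-direction (Prx) estimates all exceed their
    respective thresholds; otherwise it is foreground."""
    return pr_s > th_s and pr_g > th_g and pr_x > th_x

# A pixel matching the background model on all three features stays
# background; a low intensity probability flags it as foreground.
print(is_background(0.5, 0.5, 0.5, 0.1, 0.1, 0.1))
print(is_background(0.05, 0.5, 0.5, 0.1, 0.1, 0.1))
```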

Experiment results analysis and conclusion

In Fig. 8, we provide eight segmentation results of different models based on eight different video clips. The segmentation results of the rgs model are based on color videos, while those of the intensity and xgs models are both based on intensity videos. In the experiments, fixed thresholds are applied to all probability estimations in the intensity, rgs, and xgs models. The thresholds of Prs, PN, and PC are the same in the intensity, rgs, and xgs models. And, in the rgs model, the probability estimation

Acknowledgement

We are grateful to the Science Foundation of Sichuan Province (2006J13-092) of the People's Republic of China for financial support.

References (14)

  • A. Elgammal, R. Duraiswami, D. Harwood, L.S. Davis, Background and foreground modeling using nonparametric kernel...
  • A. Mittal, N. Paragios, Motion-based background subtraction using adaptive kernel density estimation, in: CVPR’04, vol....
  • A. Elgammal, R. Duraiswami, L.S. Davis, Efficient non-parametric adaptive color modeling using fast Gauss transform, in:...
  • G. Economou, A. Fotinos, S. Makrogiannis, S. Fotopoulos, Color image edge detection based on nonparametric density...
  • J. Rittscher, J. Kato, S. Joga, A. Blake, A probabilistic background model for tracking, in: ECCV’00, Springer-Verlag,...
  • C. Stauffer, W.E.L. Grimson, Adaptive background mixture models for real-time tracking, in: CVPR’99, vol. 2, IEEE...
  • W.E.L. Grimson, C. Stauffer, R. Romano, L. Lee, Using adaptive tracking to classify and monitor activities in a site,...