Mean Shift tracking with multiple reference color histograms
Introduction
The target’s color histogram is widely used for visual tracking (e.g. [1], [2], [3], [4], [5]) and, as Comaniciu et al. showed [3], [6], tracking with this feature can be performed very quickly via the Mean Shift procedure [7]. This paper extends Comaniciu et al.’s tracker [3], [6], referred to here by its common name, the Mean Shift tracker.
The Mean Shift tracker works by searching in each frame for the location of an image region whose color histogram is closest to the reference color histogram of the target. The distance between two histograms is measured using their Bhattacharyya coefficient, and the search is performed by seeking the target location via Mean Shift iterations beginning from the target location estimated in the previous frame (the tracker is outlined in Section 2).
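The Bhattacharyya-based distance between two normalized histograms can be sketched as follows (a minimal illustration in plain Python; the function names are ours, not the paper’s):

```python
import math

def bhattacharyya_coefficient(p, q):
    """Similarity rho between two normalized histograms p and q of equal length:
    rho(p, q) = sum_u sqrt(p_u * q_u), which is 1 for identical histograms
    and 0 for histograms with non-overlapping supports."""
    return sum(math.sqrt(pu * qu) for pu, qu in zip(p, q))

def histogram_distance(p, q):
    """Distance minimized by the Mean Shift tracker: d = sqrt(1 - rho)."""
    return math.sqrt(max(0.0, 1.0 - bhattacharyya_coefficient(p, q)))

# Identical histograms -> rho = 1, distance 0
p = [0.25, 0.25, 0.5]
print(histogram_distance(p, p))                     # 0.0
# Non-overlapping supports -> rho = 0, maximal distance
print(histogram_distance([1.0, 0.0], [0.0, 1.0]))   # 1.0
```

The second case illustrates the failure mode discussed below: once the supports of the reference and observed histograms stop overlapping, the distance saturates at its maximum and provides no gradient for the search.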
In the Mean Shift tracker, as in the other trackers cited above, the reference color histogram is computed from a single view of the target, typically as it appears in the first frame of the sequence. Although obtaining the reference histogram this way has proved robust in many scenarios, it often yields a poor representation of the target, which may lead to poor tracking. More seriously, the support of a reference histogram obtained this way may cease to overlap the support of the target’s histogram as it appears later in the sequence, usually resulting in target loss. Indeed, for many objects there exist pairs of viewing directions such that none of the colors visible from one direction is visible from the other. An (unscrambled) Rubik’s Cube is an extreme example of such an object: each side is a different color, and at most three sides are visible from any viewing direction. Major changes in the apparent colors of a target may also result from changes in the target’s actual colors, as when a person puts on or removes a piece of clothing, or as with an alternating street advertisement.
Often, several different views of the target are available prior to tracking, either from previously acquired images (e.g. [8], [9], [10], [11]) or when tracking is performed off-line (e.g. [12], [13], [14], [15]). In these contexts, this paper extends the Mean Shift tracker to use multiple reference color histograms. First, we propose a simple method for combining these histograms into a single histogram that is better suited for tracking the target. To enhance the tracking further, we then propose an extension of the Mean Shift tracker in which the convex hull of these histograms serves as the target model. That is, rather than searching for the image region whose color histogram is closest to a single reference histogram, we search for the image region whose color histogram is closest to the convex hull of several reference histograms.
Time-varying histograms of colors (e.g. [5], [16]) or of other features such as filter responses (e.g. [17]) have been used for target modeling before, and many trackers have modeled the target’s 2D appearance as being time-varying within a subspace (e.g. [8], [9], [18], [19], [20]). In the latter group of trackers, the search for the target (and possibly for additional transformation parameters) is performed by minimizing the distance of its appearance in the current frame from that subspace. This approach is applied here by modeling the target’s color histogram as being a time-varying linear combination of several reference histograms, under the restriction that the mixture coefficients are nonnegative and sum to unity (so that the linear combination will be a histogram mixture).
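The two restrictions on the mixture coefficients guarantee that any point of the convex hull is itself a valid normalized histogram. A minimal sketch (our own illustrative names, not the paper’s notation):

```python
def mixture_histogram(histograms, alphas):
    """Convex combination sum_m alpha_m * q_m of M equal-length histograms.
    The coefficients must be nonnegative and sum to unity, so the result
    is again a normalized histogram."""
    assert all(a >= 0.0 for a in alphas)
    assert abs(sum(alphas) - 1.0) < 1e-9
    bins = len(histograms[0])
    return [sum(a * h[u] for a, h in zip(alphas, histograms)) for u in range(bins)]

q1 = [1.0, 0.0, 0.0]   # e.g. histogram of one target view
q2 = [0.0, 0.5, 0.5]   # histogram of a different view
mix = mixture_histogram([q1, q2], [0.4, 0.6])
print(mix)             # [0.4, 0.3, 0.3] -- still sums to 1
```

Note that the mixture’s support is the union of the supports of the reference histograms, which is exactly what lets the model overlap views whose individual histograms do not overlap each other.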
Section 2 outlines the original Mean Shift tracker [3]. A simple method for combining multiple reference histograms into one is proposed in Section 3. Section 4 describes the proposed extension of the Mean Shift tracker to use the convex hull-based target model. Experimental results are described in Section 5, Section 6 includes a discussion, and a paper summary is provided in Section 7.
The Mean Shift tracker
In this section we outline the Mean Shift tracker described in [3]. The notations used here are similar to those in [3], with minor modifications to suit the subsequent sections.
Combining multiple histograms into one
Sometimes no view of the target yields a reasonable approximation of its circumferential color histogram. An extreme example is presented in Fig. 1, which shows the results of the Mean Shift tracker for Sequence I, where a Rubik’s Cube is tracked. The reference color histogram was set in the first frame, where the visible colors of the cube include orange.
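The snippet above motivates combining several single-view histograms into one reference histogram. The paper’s actual combination rule is defined in Section 3; as an assumed, deliberately simple stand-in, a uniform average of the M reference histograms already restores overlap with every view:

```python
def combine_histograms(histograms):
    """Uniform average of M normalized histograms (an assumed simple
    combination rule for illustration; the paper defines its own)."""
    m = len(histograms)
    bins = len(histograms[0])
    return [sum(h[u] for h in histograms) / m for u in range(bins)]

# Two single-view histograms with disjoint supports (as with a Rubik's Cube):
front = [0.5, 0.5, 0.0, 0.0]
back  = [0.0, 0.0, 0.5, 0.5]
combined = combine_histograms([front, back])
print(combined)   # [0.25, 0.25, 0.25, 0.25] -- overlaps both views
```

The averaged histogram shares support with each view, so the Bhattacharyya coefficient against any single view is strictly positive, avoiding the total target loss seen with a single-view reference.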
Convex hull-based target model
As different sides of the target face the camera, the target’s histogram changes. To accommodate a time-varying target histogram, we propose to extend the reference target model used by the Mean Shift tracker to include the convex hull of multiple reference histograms obtained from different target views. That is, the target model is approximated as the mixture of M reference histograms

q(t) = Σ_{m=1}^{M} α_m(t) q_m,   α_m(t) ≥ 0,   Σ_{m=1}^{M} α_m(t) = 1,

where the mixture proportions α_m(t) vary with time.
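Fitting this mixture model requires choosing, at each time step, the mixture proportions that bring the model closest to the observed candidate histogram. The paper optimizes the proportions within the tracking iterations; as an illustration only, for M = 2 the best coefficient can be found by a naive grid search over the single free parameter:

```python
import math

def bhattacharyya(p, q):
    """rho(p, q) = sum_u sqrt(p_u * q_u) for normalized histograms."""
    return sum(math.sqrt(pu * qu) for pu, qu in zip(p, q))

def best_mixture_2(candidate, q1, q2, steps=1000):
    """Naive grid search for alpha in [0, 1] maximizing the Bhattacharyya
    coefficient between `candidate` and alpha*q1 + (1-alpha)*q2.
    (Illustration only; not the paper's optimization scheme.)"""
    best_alpha, best_rho = 0.0, -1.0
    for i in range(steps + 1):
        a = i / steps
        mix = [a * u1 + (1 - a) * u2 for u1, u2 in zip(q1, q2)]
        rho = bhattacharyya(candidate, mix)
        if rho > best_rho:
            best_alpha, best_rho = a, rho
    return best_alpha, best_rho

q1 = [1.0, 0.0]          # reference histogram of one view
q2 = [0.0, 1.0]          # reference histogram of another view
alpha, rho = best_mixture_2([0.5, 0.5], q1, q2)
print(round(alpha, 2))   # 0.5 -- the candidate sits midway in the hull
```

A candidate that matches neither reference view alone can still lie inside (or near) the convex hull, which is precisely the robustness the extended model buys.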
Experimental results
Results of testing the Mean Shift tracker with the convex hull-based target model are presented for seven sequences. All the targets tracked in the experiments were such that their color histogram could not be reasonably modeled from a single view. In all experiments the RGB color space was used. Each color band was equally divided into eight bins, except for Sequence III, where each color band had to be divided into 32 bins because the target’s colors were very similar to the background’s.
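The 8 × 8 × 8 quantization described above maps each RGB pixel into one of 512 bins. A minimal sketch of this binning (our own helper names, assuming 8-bit color channels):

```python
def rgb_bin_index(r, g, b, bins_per_channel=8):
    """Map an 8-bit RGB pixel to an index in a bins_per_channel**3 histogram,
    e.g. the 8x8x8 = 512-bin quantization used in most of the experiments."""
    step = 256 // bins_per_channel            # 32 intensity levels per bin for 8 bins
    rb, gb, bb = r // step, g // step, b // step
    return (rb * bins_per_channel + gb) * bins_per_channel + bb

def color_histogram(pixels, bins_per_channel=8):
    """Normalized color histogram of an iterable of (r, g, b) pixels."""
    hist = [0.0] * bins_per_channel ** 3
    pixels = list(pixels)
    for r, g, b in pixels:
        hist[rgb_bin_index(r, g, b, bins_per_channel)] += 1.0
    return [h / len(pixels) for h in hist]

# Two reddish pixels fall into the same bin; the blue pixel into another.
h = color_histogram([(255, 0, 0), (250, 10, 20), (0, 0, 255)])
print(round(sum(h), 12))   # 1.0 -- normalized
```

Passing `bins_per_channel=32` gives the finer 32 × 32 × 32 quantization used for Sequence III, where the target’s colors were very similar to the background’s.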
Discussion
There appears to be a resemblance between the problem dealt with in this work and those in the papers by Bajramovic et al. [24] and by Maggio and Cavallaro [25], which also enhance the tracking by employing multiple reference histograms. However, the problems are distinct. Here, we are concerned with temporal changes in the target’s features (e.g., due to rotations in space), whereas [24] deals with fusing different types of features in the tracking process.
Conclusion
While the commonly used Mean Shift tracker [3] has proved robust in many tracking scenarios, there are cases where no single view suffices to produce a reference color histogram appropriate for tracking the target.
This paper presented a method for immunizing the Mean Shift tracker against the above problem by using multiple reference color histograms. These histograms are obtained from different target views or for different target states. A simple method for combining these histograms into one was presented, along with an extension of the tracker in which the convex hull of the reference histograms serves as the target model.
References (26)
- et al., Mean shift blob tracking with kernel histogram filtering and hypothesis testing, Pattern Recognition Letters (2005)
- S.J. McKenna et al., Tracking colour objects using adaptive mixture models, Image and Vision Computing (1999)
- S. Birchfield, Elliptical head tracking using intensity gradients and color histograms, in: Proceedings of the 1998...
- R.T. Collins, Mean-shift blob tracking through scale space, in: Proceedings of the 2003 IEEE Computer Society...
- D. Comaniciu et al., Kernel-based object tracking, IEEE Transactions on Pattern Analysis and Machine Intelligence (2003)
- P. Pérez, C. Hue, J. Vermaak, M. Gangnet, Color-based probabilistic tracking, in: Proceedings of the 7th European...
- D. Comaniciu, V. Ramesh, P. Meer, Real-time tracking of non-rigid objects using mean shift, in: Proceedings of the 2000...
- D. Comaniciu et al., Mean shift: a robust approach toward feature space analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence (2002)
- M.J. Black et al., EigenTracking: robust matching and tracking of articulated objects using a view-based representation, International Journal of Computer Vision (1998)
- F. De la Torre, C.J.G. Rubio, E. Martinez, Subspace eyetracking for driver warning, in: Proceedings of the 2003...
- S. Avidan, Support vector tracking, IEEE Transactions on Pattern Analysis and Machine Intelligence