Color enhanced local binary patterns in covariance matrices descriptors (ELBCM)☆
Introduction
In computer vision, many applications require regions or objects to be described so that they can be recognized or matched: tracking, texture classification, human detection and object retrieval are just a few examples. A good descriptor should be distinctive, i.e. it should capture the most representative characteristics of the region’s appearance, while being invariant to most phenomena and artifacts such as geometric transformations, occlusion, viewpoint and lighting changes, blur and noise, especially for embedded applications. Ideally, it should also be compact so as to save as much memory and time as possible.
Regions can be described in various ways [1]: by a set of feature points [2], [3], by their contours, or by color or gradient histograms [4]. However, the choice of a descriptor generally depends on the application, on the acquisition conditions and on the object under consideration: whether it is colorful, textured or structured, and whether its motion is rigid or not. Covariance region descriptors, proposed by Tuzel et al. [5], provide a compact model that embeds various image cues (e.g., intensity, directional gradients, spatial coordinates, texture representations, optical flow) through their correlations, leading to a Symmetric Positive Definite (SPD) matrix. To some extent, these descriptors are invariant to uniform illumination changes and offer good resilience to scale and rotation variations. Their computation can be sped up by using integral images, applying structure transforms or using specific instructions [6]; the computation time then becomes independent of the region size.
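As an illustration of the principle described above, the following minimal sketch computes a covariance descriptor from a per-pixel feature map. The feature set used here (pixel coordinates, intensity and absolute first derivatives) and the function names `basic_features` and `region_covariance` are assumptions chosen for the example, not the feature set of this paper, and no integral-image acceleration is attempted.

```python
import numpy as np

def basic_features(gray):
    """Hypothetical feature map: x, y, intensity, |Ix|, |Iy| for every pixel."""
    gray = np.asarray(gray, dtype=np.float64)
    h, w = gray.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    iy, ix = np.gradient(gray)  # derivatives along rows (y) and columns (x)
    return np.stack([xs, ys, gray, np.abs(ix), np.abs(iy)], axis=-1)

def region_covariance(features):
    """Covariance descriptor of a region.

    features: array of shape (H, W, d), one d-dimensional vector per pixel.
    Returns a symmetric (d, d) matrix.
    """
    h, w, d = features.shape
    samples = features.reshape(h * w, d).astype(np.float64)
    centered = samples - samples.mean(axis=0)
    return centered.T @ centered / (h * w - 1)
```

Because the mean is subtracted, adding a constant offset to the intensity leaves the matrix unchanged, which is one simple instance of the invariance to uniform illumination changes mentioned above. The descriptor size depends only on the number of features d, not on the region size.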
Thanks to these qualities, covariance descriptors have proven effective in various applications: object detection [5], texture classification [7], [8], [9], [10], human detection [11], [12], tracking [13], [14] and re-identification [15], face recognition [16], action recognition using temporal derivatives [9], [17], and fire and flame detection [18], where flames are considered as a dynamic texture of characteristic color.
Several tracks arise from these various works: the number and the nature of the features that feed the covariance descriptor as well as the way they are included; the spatial arrangement of the covariance matrices on the object to be described and possibly the weights given to each of them; finally, the way to handle metrics on the Riemannian manifold to which the covariance matrices belong.
Concerning the latter issue, the manifold can be viewed intuitively as a continuous surface lying in a high dimensional space. The question to address is then how to handle distances in such a complex space and therefore how to apply classification, comparison or machine learning techniques in a rigorous way. Several approaches have been proposed: using specific metrics based on Lie algebra [19], [13], mapping the points on the manifold to the tangent space of the identity matrix [17], using the properties of Grassmann manifolds and their embedding in Reproducing Kernel Hilbert Spaces (RKHS) [9] or using a Log-Euclidean framework [8].
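To make the notion of a distance between SPD matrices concrete, here is a minimal sketch of a Log-Euclidean distance, i.e. the Frobenius norm of the difference between matrix logarithms. The function names and the eigenvalue floor `eps` are assumptions made for this example; the works cited above may differ in their exact formulation.

```python
import numpy as np

def spd_log(c, eps=1e-10):
    """Matrix logarithm of an SPD matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(c)
    w = np.maximum(w, eps)  # assumption: floor tiny eigenvalues caused by round-off
    return (v * np.log(w)) @ v.T

def log_euclidean_distance(c1, c2):
    """Frobenius norm of log(c1) - log(c2)."""
    return np.linalg.norm(spd_log(c1) - spd_log(c2), ord="fro")
```

Once the matrices are mapped by `spd_log`, they live in a vector space of symmetric matrices, so standard Euclidean tools (means, nearest-neighbour search, linear classifiers) can be applied to them.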
Concerning the first track, the nature of the features depends on the application. For region matching, the descriptors embed spatial, color and texture features. Any colorspace can be used, from RGB in most works [5] to invariant colorspaces [20], or the colorspace can be chosen depending on the database. As for texture, let us mention spatial derivatives [5], Gabor filter responses [16] and Local Binary Patterns (LBP). The latter have gained interest due to their good performance and gave rise to LBCM [21] and GLRCD [22].
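For readers unfamiliar with LBP, the sketch below computes the classical 8-neighbour code on a gray image and tests whether a code is "uniform" (at most two 0/1 transitions around the circle). The neighbour ordering, the radius of one pixel without interpolation, and the function names are assumptions made for illustration only.

```python
import numpy as np

# Offsets (dy, dx) of the 8 neighbours, counter-clockwise from the east neighbour.
OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]

def lbp8(gray):
    """8-bit LBP code for every interior pixel of a 2-D gray image."""
    gray = np.asarray(gray, dtype=np.float64)
    h, w = gray.shape
    center = gray[1:h - 1, 1:w - 1]
    code = np.zeros(center.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(OFFSETS):
        neighbour = gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set the bit when the neighbour is at least as bright as the center.
        code |= np.uint8(1 << bit) * (neighbour >= center).astype(np.uint8)
    return code

def is_uniform(code):
    """True if the circular 8-bit pattern has at most two 0/1 transitions."""
    code = int(code)
    rotated = ((code >> 1) | ((code & 1) << 7)) & 0xFF
    return bin(code ^ rotated).count("1") <= 2
```

Since each bit only depends on the sign of a difference, any strictly increasing transformation of the gray levels leaves the codes unchanged, which explains part of the appeal of LBP under lighting changes.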
While many works deal with integrating texture in the region descriptor, fewer are dedicated to color LBP features, and even fewer to their use in covariance descriptors. However, doing so can increase the discriminative power of the descriptor and provide better invariance to illumination changes. To that end, this work focuses on color invariant features and Local Binary Patterns. First, an LBP descriptor called Enhanced LBP (ELBP) is proposed for monochrome images, and its marginal and vectorial color extensions are given. The previous work [23] proposed gray ELBP with a far less thorough evaluation. After an analysis of their sensitivity to color appearance changes, the features are used for region description by covariance matrices. These descriptors are evaluated experimentally on different datasets designed for texture classification, object retrieval and person re-identification. A study is made of the choice of color cues to be included in the descriptor, which can have a large impact on its descriptiveness, compactness and invariance to distortions.
The remainder of this paper is organized as follows. Section 2 recalls the underlying theory of covariance matrix descriptors and discusses the features generally included. The proposed LBP features are presented in Section 3. Then, the experiments are detailed in Section 4. Finally, our conclusions are given in Section 6.
Section snippets
Covariance descriptors
After a short explanation of the key principles of covariance region descriptors in Section 2.1, the existing spatial, color and texture features are introduced in Section 2.2.
Enhanced local binary covariance matrices (ELBCM)
After a brief explanation of LBP features in Section 3.1, the proposed monochrome ELBP feature is introduced in Section 3.2, followed by its color extensions in Section 3.3. Finally, a summary of the descriptors that will be compared in the experiments is provided.
Experiments
After introducing the datasets used in the experiments in Section 4.1, the different LBP descriptors are evaluated under illumination changes in Section 4.2. Texture classification, object matching and person re-identification are then successively addressed in Sections 4.3 (Texture classification, including 4.3.1 Evaluation of gray texture descriptors), 4.4 (Object recognition) and 4.5 (Person re-identification).
Discussions
The computation of the enhanced LBP is very fast because look-up tables can be used to map the uLBP values directly to the trigonometric quantities. However, four texture features are used instead of one, so the resulting feature vector is necessarily larger than for classical LBP descriptors. Consequently, the computation of the covariance matrix can be 50 % more costly than when including uLBP, and around 25 % more costly than with LCVBP [35]. The performances are however
Conclusions
This paper has proposed a new way to embed LBP in covariance region descriptors. Starting from the nriLBP (non-rotation invariant uniform LBP), the angular portion corresponding to the ‘1’ bits is described by its trigonometric values. The resulting LBP feature therefore offers the same separability power as nriLBP, with an increased resilience to small rotations and noise. In addition, the transform is fast because the cosine and sine values can be precomputed once and for all and stored in a look-up table.
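The look-up table idea can be sketched as follows. This is only one plausible reading of the description above, under loudly stated assumptions: 8 neighbours placed at angles k·45°, four features taken as the cosine and sine of the start and end angles of the circular run of ‘1’ bits, and non-uniform or constant patterns encoded as zeros. The exact feature definition is given in the full paper (Section 3.2) and may differ; the names `arc_of_ones` and `build_lut` are hypothetical.

```python
import numpy as np

P = 8  # number of neighbours, assumed at angles k * 2*pi / P

def arc_of_ones(code):
    """Return (start, end) bit indices of the single circular run of 1s.

    Returns None for non-uniform patterns and for the two constant patterns.
    """
    bits = [(code >> k) & 1 for k in range(P)]
    starts = [k for k in range(P) if bits[k] == 1 and bits[k - 1] == 0]
    ends = [k for k in range(P) if bits[k] == 1 and bits[(k + 1) % P] == 0]
    if len(starts) != 1:
        return None
    return starts[0], ends[0]

def build_lut():
    """Table of shape (2**P, 4): cos/sin of the start and end angles of the arc."""
    lut = np.zeros((2 ** P, 4))
    for code in range(2 ** P):
        arc = arc_of_ones(code)
        if arc is None:
            continue  # assumption: such patterns are encoded as zeros
        theta0, theta1 = (2 * np.pi * k / P for k in arc)
        lut[code] = [np.cos(theta0), np.sin(theta0), np.cos(theta1), np.sin(theta1)]
    return lut

LUT = build_lut()  # computed once, then reused for every image
```

With such a table, converting an integer array `codes` of LBP values into four feature planes is a single indexing operation, `LUT[codes]`, whose result has shape `codes.shape + (4,)`. A rotation of the pattern by one neighbour step shifts both angles by 45°, so the trigonometric values change gradually instead of jumping to an unrelated label.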
References (43)
- et al., A performance evaluation of local descriptors, IEEE Trans. Pattern Anal. Mach. Intell. (2005)
- Distinctive image features from scale-invariant keypoints, Int. J. Comput. Vis. (2004)
- et al., SURF: Speeded up robust features
- et al., Histograms of oriented gradients for human detection
- O. Tuzel, F. Porikli, P. Meer, Region covariance: A fast descriptor for detection and classification, Computer...
- et al., Color tracking with contextual switching: real-time implementation on CPU, J. Real-Time Image Process. (2013)
- et al., Gabor filters as feature images for covariance matrix on texture classification problem
- et al., Local log-Euclidean covariance matrix (L2ECM) for image representation and its applications
- et al., Kernel analysis over Riemannian manifolds for visual recognition of actions, pedestrians and textures
- et al., Texture classification using Rao’s distance on the space of covariance matrices
- Pedestrian detection via classification on Riemannian manifolds, IEEE Trans. Pattern Anal. Mach. Intell.
- Fast human detection from joint appearance and foreground feature subset covariances, Comput. Vision Image Understand.
- Covariance tracking using model update based on Lie algebra
- Re-identification by covariance descriptors, Person Re-Identification
- Gabor-based region covariance matrices for face recognition, IEEE Trans. Circ. Syst. Video Technol.
- Action recognition using sparse representation on covariance manifolds of optical flow
- Covariance matrix-based fire and flame detection method in video, Mach. Vis. Appl.
- A metric for covariance matrices, Quo vadis geodesia
- Multi-scale color local binary patterns for visual object classes recognition
- Facial expression recognition using local binary covariance matrices
☆ This paper has been recommended for acceptance by Zicheng Liu.