Non-negative dictionary based sparse representation classification for ear recognition with occlusion
Introduction
The ear modality offers distinct advantages for human identification: it has rich structural features that remain stable from about 8 to 70 years of age, and it is not sensitive to expression variations [1]. Current research on ear recognition has explored the use of either 2D ear images or 3D ear models for identification. 3D ear recognition performs better under illumination and pose variation [2], but requires special equipment and expensive computation. Most recent work therefore focuses on ear recognition using 2D images, since 2D images are more consistent with surveillance deployment in real application scenarios [3]. Results reported in the open literature show that ear biometrics can achieve good recognition performance in constrained environments [4]. Based on the features extracted, 2D ear recognition can be categorized as follows: structural feature based recognition, such as the active shape model of the outer ear contour [5] and geometric features of the inner ear [6]; subspace method based recognition, such as force field transformation [7], eigen-ear [8], and fisher-ear [9]; local model fusion based recognition, such as guided model-based analysis [10] and the multi-matcher model [11]; and frequency feature based recognition, such as Gabor features [12], LBP features [13], or quadrature features [14]. These methods perform well in constrained environments. In unconstrained scenarios, however, partial occlusion of the ear is an unavoidable problem.
There has been some research on ear recognition with occlusion. One solution is to extract local features using NMF [15] or the Gabor transformation [16] instead of holistic features such as eigen-ear or fisher-ear; the test sample is projected onto the basis space, and recognition is done by comparing the feature vector of the test image with those of the training images. The second solution is to segment ear images into blocks, apply a subspace method to identify the most discriminating blocks, and combine these blocks to form an ensemble classifier [17]. The third solution is to match SIFT key points between the gallery samples and the probe samples [10]. Motivated by the successful application of sparse representation based classification (SRC) to robust face recognition under occlusion and corruption [18], the most recent solution is to apply SRC to represent an occluded test image as the sum of a sparse linear combination of training samples and a sparse error incurred by image noise. SRC has also shown promising results for ear recognition by using an identity occlusion dictionary to code the occluded portion [19], [20], [21]. In the SRC model, the dictionary consists of two parts: the feature dictionary and the occlusion dictionary. The feature dictionary is usually constructed from feature vectors extracted from the source images, such as PCA or LDA features [19] or Gabor features [21]. The occlusion dictionary is usually an identity matrix whose dimension equals the feature dimension of the feature dictionary. Experimental results have shown that this method can improve the recognition rate [22]. The problem with this scheme is the huge number of atoms in the dictionary, which imposes a heavy computational load on the SRC solver.
The motivation here is therefore to design a more compact dictionary with fewer atoms and higher discriminative ability.
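The size argument above can be made concrete with a minimal sketch. All sizes below are hypothetical values chosen for illustration and are not taken from the paper; the random matrices are placeholders for real feature vectors and for a learned occlusion dictionary.

```python
import numpy as np

# Hypothetical sizes, for illustration only (not from the paper).
feature_dim = 1024   # length of each feature vector
num_train = 300      # number of training samples, i.e. feature atoms

rng = np.random.default_rng(0)

# Placeholder feature dictionary: one column per training sample.
feature_dict = rng.standard_normal((feature_dim, num_train))

# Identity occlusion dictionary: one atom per feature dimension.
identity_occ = np.eye(feature_dim)
identity_scheme = np.hstack([feature_dict, identity_occ])

# A compact occlusion dictionary with far fewer atoms (placeholder values;
# how such a dictionary is learned is not shown here).
compact_occ_atoms = 100
compact_occ = rng.standard_normal((feature_dim, compact_occ_atoms))
compact_scheme = np.hstack([feature_dict, compact_occ])

print(identity_scheme.shape[1])  # 300 + 1024 = 1324 atoms
print(compact_scheme.shape[1])   # 300 + 100 = 400 atoms
```

With the identity occlusion dictionary, the number of atoms grows with the feature dimension itself, which is why a compact occlusion dictionary reduces the size of the coding problem.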
Yang et al. [23] proposed using compressed Gabor features to construct the occlusion dictionary. However, our experiments show that with such an occlusion dictionary the sparse coding representation of a partially occluded test sample is still very dense. Therefore, in this paper, we propose adding non-negative constraints to the learning process of the Gabor occlusion dictionary in order to improve the sparseness of the sparse coding representation. The Gabor feature dictionary and the non-negative occlusion dictionary are then combined to solve the non-negative sparse representation classification model. The motivation for using a non-negative dictionary is to make the basis atoms in the dictionary more visually and physically meaningful. The motivation for requiring non-negative sparse coding coefficients is to conform to the intuitive notion of combining parts to form a whole, to be more consistent with the biological modeling of visual data, and to obtain a more accurate and sparser representation of the test samples. Fig. 1 shows the system diagram of our Non-negative Dictionary based Non-negative Sparse Representation Classification (ND_NSRC) method. The proposed method is applied to the USTB ear database with random block occlusion. Experimental results show that the proposed method achieves satisfying recognition performance with much sparser coding coefficients. The rest of the paper is organized as follows: Section 2 details the proposed non-negative dictionary based sparse representation classification together with the dictionary learning algorithm, Section 3 presents experimental results on the available ear image dataset to verify the proposed method, and Section 4 concludes the paper.
Section snippets
Sparse representation based classification (SRC)
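SRC [22], [24] has been successfully applied in face recognition. Suppose we have $K$ classes of samples; the $i$-th class is denoted as $A_i = [s_{i,1}, s_{i,2}, \ldots, s_{i,n_i}] \in \mathbb{R}^{m \times n_i}$, where $s_{i,j}$ is the vector stretched by the $j$-th training sample of class $i$. For a test sample $y \in \mathbb{R}^{m}$ from the $i$-th class, $y$ can be represented via the linear combination of the samples within $A_i$, i.e., $y = \sum_{j=1}^{n_i} \alpha_{i,j} s_{i,j} = A_i \alpha_i$, where $\alpha_i = [\alpha_{i,1}, \alpha_{i,2}, \ldots, \alpha_{i,n_i}]^{T}$ are the coefficients. Let $A = [A_1, A_2, \ldots, A_K]$ be the concatenation of all the training samples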
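As an illustration of the SRC idea described here, the following minimal sketch codes a test sample over all training samples with an l1 penalty and assigns the class whose samples give the smallest reconstruction residual. The solver (plain ISTA), the function names, the penalty weight `lam`, and the synthetic data are all assumptions made for this example; the excerpt does not specify which solver the authors use.

```python
import numpy as np

def soft_threshold(v, t):
    """Element-wise soft-thresholding, the proximal operator of t*||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista_l1(A, y, lam=0.01, n_iter=500):
    """Approximately solve min_x 0.5*||y - A x||_2^2 + lam*||x||_1 (ISTA)."""
    step = 1.0 / (np.linalg.norm(A, 2) ** 2)  # 1 / Lipschitz constant
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)
        x = soft_threshold(x - step * grad, lam * step)
    return x

def src_classify(A, labels, y, lam=0.01):
    """Return the class with the smallest class-wise residual, and the code."""
    x = ista_l1(A, y, lam)
    classes = np.unique(labels)
    residuals = []
    for c in classes:
        x_c = np.where(labels == c, x, 0.0)  # keep only class-c coefficients
        residuals.append(np.linalg.norm(y - A @ x_c))
    return classes[int(np.argmin(residuals))], x

# Synthetic toy data (assumption): 3 classes, 4 samples each, dimension 30.
rng = np.random.default_rng(0)
m, n_classes, per_class = 30, 3, 4
centers = rng.standard_normal((m, n_classes))
labels = np.repeat(np.arange(n_classes), per_class)
A = centers[:, labels] + 0.1 * rng.standard_normal((m, labels.size))
A /= np.linalg.norm(A, axis=0)  # unit-norm atoms

# Test sample built from two class-1 training samples (columns 4 and 5).
y = 0.6 * A[:, 4] + 0.4 * A[:, 5]
predicted, x = src_classify(A, labels, y)
print(predicted)
```

In this toy setting most of the coefficient energy falls on the columns of the true class, which is the property the class-wise residual rule relies on.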
Non-negative dictionary based non-negative sparse representation classification
In this paper, we propose a non-negative dictionary based non-negative sparse representation classification algorithm. The non-negative dictionary includes the Gabor feature dictionary and the non-negative occlusion dictionary. Then this non-negative dictionary is applied for solving the non-negative sparse representation classification model.
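A minimal sketch of non-negative sparse coding over a combined dictionary is given below. It is not the authors' algorithm: the solver is a simple projected ISTA, the occlusion atoms are hand-made block indicators standing in for a learned non-negative Gabor occlusion dictionary, the occlusion is modeled as purely additive, and all names, sizes, and data are assumptions for illustration.

```python
import numpy as np

def nn_sparse_code(D, y, lam=1e-3, n_iter=2000):
    """Projected ISTA for min_{x >= 0} 0.5*||y - D x||_2^2 + lam*sum(x)."""
    step = 1.0 / (np.linalg.norm(D, 2) ** 2)
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ x - y)
        # Gradient step, shrink by lam*step, then clip at zero.
        x = np.maximum(x - step * (grad + lam), 0.0)
    return x

def nn_src_classify(A, D_occ, labels, y, lam=1e-3):
    """Code y over [A, D_occ] with x >= 0, then use class-wise residuals."""
    x = nn_sparse_code(np.hstack([A, D_occ]), y, lam)
    x_feat, x_occ = x[:A.shape[1]], x[A.shape[1]:]
    y_clean = y - D_occ @ x_occ  # remove the coded occlusion part
    classes = np.unique(labels)
    residuals = [
        np.linalg.norm(y_clean - A @ np.where(labels == c, x_feat, 0.0))
        for c in classes
    ]
    return classes[int(np.argmin(residuals))], x_feat, x_occ

# Synthetic non-negative toy data (assumption): 3 classes, 3 samples each.
rng = np.random.default_rng(0)
m, n_classes, per_class, block = 36, 3, 3, 6
templates = rng.random((m, n_classes)) ** 4
labels = np.repeat(np.arange(n_classes), per_class)
A = templates[:, labels] + 0.02 * rng.random((m, labels.size))
A /= np.linalg.norm(A, axis=0)

# Hypothetical occlusion atoms: one normalized indicator per block of 6 entries.
D_occ = np.zeros((m, m // block))
for b in range(m // block):
    D_occ[b * block:(b + 1) * block, b] = 1.0
D_occ /= np.linalg.norm(D_occ, axis=0)

# Class-2 training sample (column 7) with an additive block occlusion.
y = A[:, 7] + 0.8 * D_occ[:, 2]
predicted, x_feat, x_occ = nn_src_classify(A, D_occ, labels, y)
print(predicted, int(np.argmax(x_occ)))
```

Because both the dictionary and the coefficients are non-negative, the code can only add parts together; the occluded block is absorbed by the occlusion atoms and the remainder is explained by samples of the true class.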
Experimental results and discussion
In Section 4, we conduct extensive experiments to evaluate the proposed ND_NSRC on different image datasets in different scenarios. These experiments include ear recognition with natural occlusion, face recognition with real face disguise, ear recognition with random occlusion, and multimodal recognition using face and ear with random block occlusion.
Conclusion
In this paper, we propose a non-negative dictionary based NSRC and apply it to ear recognition with partial occlusion and to multimodal recognition. The augmented Gabor features can extract structural information of the ear at different scales and orientations and represent its local properties. We propose learning a non-negative Gabor feature based occlusion dictionary instead of using an identity matrix as the occlusion dictionary. The sparse coding coefficients with these two dictionaries are all
Acknowledgment
This paper is supported by the National Natural Science Foundation of China (Grant no. 61300075).
References (30)
- et al. Robust ear identification using sparse representation of local texture descriptors. Pattern Recog. (2013)
- et al. Efficient recognition of highly similar 3D objects in range images. IEEE Trans. Pattern Anal. Mach. Intell. (2009)
- et al. Symmetrical null space for face and ear recognition. Neurocomputing (2007)
- et al. On guided model-based analysis of ear biometrics. Comput. Vis. Image Underst. (2011)
- et al. Automated human identification using ear imaging. Pattern Recog. (2012)
- et al. Reliable ear identification using 2D quadrature filters. Pattern Recog. Lett. (2012)
- et al. Ear recognition based on local information fusion. Pattern Recog. Lett. (2012)
- et al. A robust face and ear based multimodal biometric system using sparse representation. Pattern Recog. (2013)
- et al. Gabor feature based robust representation and classification for face recognition with Gabor occlusion dictionary. Pattern Recog. (2013)
- et al. Toward unconstrained ear recognition from two-dimensional images. IEEE Trans. Syst., Man, Cybern. A: Syst. Hum. (2010)
- A survey on ear biometrics. ACM Trans. Embed. Comput. Syst.
- Shape and structural feature based ear recognition. Proceedings of Sinobiometrics 2004. Lect. Notes Comput. Sci.
- Towards unconstrained ear recognition from two dimensional images. IEEE Trans. Syst., Man, Cybern., A
- Comparison and combination of ear and face images in appearance-based biometrics. IEEE Trans. Pattern Anal. Mach. Intell.
Li Yuan received the BS and MS degree in control theory and control engineering from the Dalian University of Technology of China and the PhD degree in pattern recognition and intelligent system from University of Science and Technology Beijing in 2006. She is currently an associate professor with the School of Automation and Electrical Engineering at University of Science and Technology Beijing, China. Her research interests cover image processing and pattern recognition. She is a member of the IEEE Computer Society.
Wei Liu is a graduate student with the School of Automation and Electrical Engineering, University of Science and Technology Beijing, China. Her current research interests include computer vision, pattern recognition, and biometrics.
Yang Li received the PhD degree from the Chinese Academy of Social Sciences in July 2006. She is currently an associate professor with the School of International Studies, Communication University of China. Her current research interests include cross-cultural communication, TV news program compilation and production.