Dispersion constraint based non-negative sparse coding algorithm☆
Introduction
A fundamental problem in digital image processing is to find a suitable representation of the image data for tasks such as feature extraction, compression and denoising [1], [2]. Image representations are often based on discrete linear transformations of the observed data. Many important linear representation methods for images have been developed, such as wavelet analysis, Principal Component Analysis (PCA), Non-negative Matrix Factorization (NMF), Independent Component Analysis (ICA), Sparse Coding (SC) and Non-negative Sparse Coding (NNSC) [1]. These methods are all successful in image feature extraction [1], [2], [3]. It is worth noting that SC and NNSC have roots in visual neuroscience and share the same linear generative model. Moreover, in contrast to wavelets, SC and NNSC are adaptive methods, and in contrast to PCA, they are suitable for high-dimensional data. As a result, these two algorithms account for a large share of the published literature on signal and image data processing [1], [2], [3], [4], [5], [6].
The early SC algorithm was proposed by Olshausen and Field in 1996 [2] and is now viewed as the basic SC algorithm. It reveals the underlying relationship between environmental information and the internal representation of the visual cortex area V1 in the mammalian brain: linear SC of natural images yields features qualitatively very similar to the receptive fields of simple cells in V1 [3], [4], [5], [6], and it learns basis functions that capture higher-level features in the data [2], [3], [4]. Moreover, SC produces localized bases when applied to other natural stimuli such as speech and video [2], [4], [5], [6], [7], [8], [9]. Unlike some other unsupervised learning techniques such as PCA, SC can train overcomplete basis sets, in which the number of bases is greater than the input dimension [7], and it can model inhibition between the bases by sparsifying their activations. Because of these advantages, the SC model has been widely used and developed in image processing and pattern recognition [9], [10], [11], [12], [13], [14], [15], [16], [17], [18], [19], [20], [21]. However, Hoyer pointed out that there are at least two obvious ways in which the basic SC algorithm is unrealistic as a model of V1 simple-cell behavior [22], [23]. The most distinct discrepancy is that in the SC model each sparse unit can be either positively or negatively active, in addition to being effectively silent (close to zero) [22], [23]. Another major difference is that the input data of the basic SC algorithm are signed, whereas V1 receives its visual data from the Lateral Geniculate Nucleus (LGN) in the form of separate ON and OFF channels, whose inputs are non-negative [22], [23]. Therefore, the basic SC algorithm is not a suitable model of V1 simple-cell behavior.
To remedy this defect of basic SC, Hoyer proposed a Non-negative Sparse Coding (NNSC) algorithm [22], now viewed as basic NNSC, drawing on the idea of NMF [24], [25]. NMF factorizes a matrix as the product of two matrices in which all elements are non-negative, and has been used widely in many areas, including pattern recognition, clustering and dimension reduction [26], [27], [28]. In the NNSC model, the input data are preprocessed into separate ON and OFF channels, and the sparse coefficient matrix and the feature basis matrix are both non-negative.
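As a reading aid, the basic NNSC scheme described above can be sketched as follows. This is a minimal illustration assuming Hoyer-style updates (a multiplicative rule for the coefficients and a projected gradient step for the bases); the function name, parameter names and values are illustrative, not taken from the paper.

```python
import numpy as np

def nnsc(X, n_components, lam=0.1, mu=0.01, n_iter=200, seed=0):
    """Sketch of a Hoyer-style NNSC: X ~ A @ S with A, S >= 0.

    Minimizes ||X - A S||_F^2 + lam * sum(S) using a multiplicative
    update for S and a projected gradient step for A (assumed setup,
    for illustration only).
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    A = np.abs(rng.standard_normal((m, n_components)))
    S = np.abs(rng.standard_normal((n_components, n)))
    for _ in range(n_iter):
        # Multiplicative update keeps S non-negative if it starts non-negative.
        S *= (A.T @ X) / (A.T @ A @ S + lam + 1e-9)
        # Projected gradient step on A, then clip to >= 0 and renormalize columns.
        A -= mu * (A @ S - X) @ S.T
        A = np.maximum(A, 0)
        A /= np.linalg.norm(A, axis=0, keepdims=True) + 1e-9
    return A, S
```

The multiplicative rule for S and the clipping step for A are what enforce the non-negativity of both factors.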
Like the basic SC model, Hoyer׳s NNSC algorithm has proved able to find an efficient representation for high-dimensional data and is used widely in the image processing field [22], [23], [24], [25], [26]. However, Hoyer׳s algorithm only considers two terms [26]: the image reconstruction error and the sparse prior distribution of the sparse coefficients. Its training [4], [5], [6] combines gradient projection with an auxiliary-function-based multiplicative update, so its performance is strongly influenced by the iterative step size of the gradient projection, the convergence precision cannot be very high, and the sparsity of the coefficients cannot be ensured self-adaptively [29]. Moreover, when applied to feature recognition tasks, Hoyer׳s algorithm cannot achieve high classification precision. To overcome these faults, many researchers have developed modified NNSC models [31], [32], [33]. For example, the model published in [30], [31] is easy to implement and ensures sparseness and a smaller image reconstruction error, but it cannot represent an image׳s structure information well. Combining the advantages of Empirical Mode Decomposition (EMD) and basic NNSC, the work in [32] proposed an image basis learning method using an EMD-based NNSC algorithm and used it to recognize face images. This method can represent an image׳s structure information, but introducing EMD into NNSC makes the computation more complex. Li et al. proposed a stable and efficient algorithm for NNSC (denoted SENSC) in [33]. SENSC converges faster and is much more stable than Hoyer׳s NNSC, and its solution is better [33]; therefore, SENSC has a stronger capability of adjusting sparseness.
Although the NNSC algorithms mentioned above have their respective advantages, they all ignore an important constraint: the prior class information of the input data, or in other words, data separation in high-dimensional space. Image features extracted by these NNSC algorithms have been shown to be efficient in image reconstruction [32], [33], image denoising [34], image compression and so on [35], [36]; however, classification based on these features is not very satisfactory and is seldom discussed.
For feature learning methods, a distinction also exists between generative and discriminative approaches, depending on whether the learned feature subspace supports reconstruction of the data or classification. Based on this understanding of the NNSC algorithms explored above, and considering the constraints of within-class and between-class dispersion, we propose a novel NNSC algorithm, denoted the DCB-NNSC algorithm in this paper. The goal is to improve the spatial separability of the sparse feature coefficients and to strengthen the classification capability in pattern recognition tasks. Differing from the other NNSC models mentioned above, the DCB-NNSC algorithm combines a feature discriminability constraint supervised by the classification task with a maximized-sparseness criterion. Here, sparseness is measured using the Normal Inverse Gaussian (NIG) density model [37], and the ratio of within-class to between-class dispersion of the feature coefficients is used to improve feature separability in the feature extraction task. In the tests, the PolyU palmprint database developed by the Hong Kong Polytechnic University is used. To evaluate the feature bases extracted by DCB-NNSC, image reconstruction is first discussed using different image patches. Experimental results show that the DCB-NNSC algorithm is efficient in image feature extraction. Further, using these DCB-NNSC features, the classification task is also evaluated with Radial Basis Probabilistic Neural Network (RBPNN), Radial Basis Function Neural Network (RBFNN) and Euclidean distance classifiers. The simulation results make clear that the DCB-NNSC algorithm not only captures significant receptive fields with clearer sparsity and image structure, but also favors the image classification task.
Furthermore, under the same experimental conditions, comparison with the basic NNSC algorithm also shows that the DCB-NNSC algorithm is indeed effective and has good potential for feature extraction and classification in pattern recognition.
The rest of this paper is organized as follows. Section 2 reviews the related work on SC and NNSC and gives the dispersion constraint rule of the feature coefficients. Section 3 explains the new DCB-NNSC algorithm proposed in this paper. Simulation results on image reconstruction and classification are discussed in Section 4 to demonstrate the efficiency of the DCB-NNSC algorithm. Finally, conclusions are given in Section 5.
SC model of images
Assume that $x(i,j)$ denotes an image and that $s = (s_1, s_2, \ldots, s_k)^{T}$ denotes the $k$-dimensional sparse coefficient vector. Furthermore, let $A = (a_1, a_2, \ldots, a_k)$ denote the feature basis matrix. Thus, the input image can be modeled as a linear superposition of the features $a_m$, namely
$$x(i,j) = \sum_{m=1}^{k} a_m(i,j)\, s_m + \nu(i,j),$$
where the $s_m$ are mutually independent sparse coefficients of the image features, the $a_m$ are feature basis vectors, $(i,j)$ are the pixel coordinates, and $\nu(i,j)$ is Gaussian noise.
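The generative model above can be illustrated numerically. The patch size, basis count and sparsity level below are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Minimal illustration of the SC generative model: an image patch is a
# linear superposition of feature basis vectors plus Gaussian noise,
#   x = A @ s + noise,
# where most entries of the sparse coefficient vector s are zero.
rng = np.random.default_rng(0)
dim, n_basis = 64, 128                        # 8x8 patch, overcomplete basis
A = rng.standard_normal((dim, n_basis))       # feature basis matrix
s = np.zeros(n_basis)
active = rng.choice(n_basis, size=5, replace=False)
s[active] = rng.standard_normal(5)            # only 5 active coefficients
noise = 0.01 * rng.standard_normal(dim)       # additive Gaussian noise
x = A @ s + noise                             # synthesized image patch
```

Note that with 128 bases for a 64-dimensional patch the basis is overcomplete, as discussed in the introduction.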
The dispersion constraint of sparse coefficients
Assume that $X_i = \{x_1^{i}, x_2^{i}, \ldots, x_{n_i}^{i}\}$ denotes the $i$-th class sparse coefficient set, namely the $i$-th class of input samples, where $x_j^{i}$ ($i = 1, \ldots, c$; $j = 1, \ldots, n_i$) is the $j$-th sample vector of $X_i$, $c$ is the number of classes of all training samples, and $n_i$ denotes the number of samples in the $i$-th class sample set $X_i$. Let $n = \sum_{i=1}^{c} n_i$ be the total number of training samples, $m_i$ denote the mean of the $i$-th class samples, and $m$ denote the global mean; then the within-class dispersion matrix $S_w$ and the between-class dispersion matrix $S_b$ are defined as
$$S_w = \sum_{i=1}^{c} \sum_{j=1}^{n_i} (x_j^{i} - m_i)(x_j^{i} - m_i)^{T}, \qquad S_b = \sum_{i=1}^{c} n_i\, (m_i - m)(m_i - m)^{T}.$$
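The within-class and between-class dispersion of a set of labeled coefficient vectors can be computed directly from the standard scatter-matrix definitions. This sketch assumes coefficient vectors stored as rows and integer class labels.

```python
import numpy as np

def dispersion_matrices(X, y):
    """Within-class (Sw) and between-class (Sb) dispersion matrices.

    X: (n_samples, dim) coefficient vectors; y: class labels.
    A smaller trace(Sw) / trace(Sb) ratio indicates better class
    separation in the feature space.
    """
    m = X.mean(axis=0)                        # global mean
    dim = X.shape[1]
    Sw = np.zeros((dim, dim))
    Sb = np.zeros((dim, dim))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)                  # class mean
        D = Xc - mc
        Sw += D.T @ D                         # within-class scatter
        d = (mc - m)[:, None]
        Sb += len(Xc) * (d @ d.T)             # between-class scatter
    return Sw, Sb
```

For tightly clustered, well-separated classes, the trace of Sb dominates the trace of Sw, which is the situation the dispersion constraint tries to encourage.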
The cost function of DCB-NNSC algorithm
To ensure maximal sparsity of the feature coefficients and to improve feature discrimination for the classification task, while also accounting for the self-adaptive sparsity of the data, and with reference to Hoyer׳s NNSC and other published NNSC algorithms, a new NNSC algorithm based on a dispersion constraint (DCB-NNSC) is proposed in this paper. Its cost function combines three terms: the image reconstruction error, an NIG-density-based sparseness penalty on the coefficients, and the ratio of the within-class to the between-class dispersion of the coefficients.
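A minimal sketch of such a three-term objective is given below. The weights lam and gamma are illustrative, and the NIG-density-based sparseness penalty used in the paper is replaced by a simple sum of the coefficients for brevity; this is an assumed stand-in, not the paper's exact cost function.

```python
import numpy as np

def dcb_nnsc_cost(X, A, S, y, lam=0.1, gamma=0.1):
    """Sketch of a three-term DCB-NNSC-style objective:
    reconstruction error + sparseness penalty + dispersion ratio."""
    recon = np.sum((X - A @ S) ** 2)          # image reconstruction error
    sparsity = lam * np.sum(S)                # stand-in sparseness penalty
    # Dispersion ratio: within-class over between-class scatter (traces).
    m = S.mean(axis=1, keepdims=True)
    sw = sb = 0.0
    for c in np.unique(y):
        Sc = S[:, y == c]                     # coefficients of class c
        mc = Sc.mean(axis=1, keepdims=True)
        sw += np.sum((Sc - mc) ** 2)          # within-class scatter
        sb += Sc.shape[1] * np.sum((mc - m) ** 2)  # between-class scatter
    return recon + sparsity + gamma * sw / (sb + 1e-9)
```

Minimizing the last term drives coefficients of the same class together and coefficients of different classes apart, which is the role of the dispersion constraint.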
Feature basis learning
In the tests, the PolyU palmprint image database provided by the Hong Kong Polytechnic University was used to evaluate the DCB-NNSC algorithm. The selected database includes 600 palmprint images from 100 individuals, with 6 images from each [39], [40]. Here, all 600 images are used to learn the feature bases. Each original palmprint image was preprocessed by a region of interest (ROI) extraction method [41], [42]. Further, to greatly reduce the computation in the learning process, the dimension of each ROI image was reduced.
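Training data for basis learning of this kind is typically assembled by sampling small square patches from the preprocessed images and stacking them as columns. A sketch follows; the patch size and count are illustrative, not the paper's settings.

```python
import numpy as np

def extract_patches(img, patch_size=8, n_patches=1000, seed=0):
    """Sample random square patches from a 2-D image and stack each
    flattened patch as a column of the returned data matrix."""
    rng = np.random.default_rng(seed)
    h, w = img.shape
    cols = []
    for _ in range(n_patches):
        i = rng.integers(0, h - patch_size + 1)   # top-left row index
        j = rng.integers(0, w - patch_size + 1)   # top-left column index
        cols.append(img[i:i + patch_size, j:j + patch_size].ravel())
    return np.array(cols).T   # shape: (patch_size**2, n_patches)
```

The resulting matrix plays the role of X in the linear generative model, with one column per sampled patch.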
Conclusions
On the basis of the early NNSC model, a novel NNSC algorithm with a dispersion constraint, denoted the DCB-NNSC model, is proposed in this paper. In this model, to ensure the sparsity of the feature coefficients, the sparse penalty function is chosen as the NIG density function, which measures the sparseness of images self-adaptively. At the same time, to enhance feature separability in the feature subspace and to make the features suitable for classification, the ratio of within-class to between-class dispersion of the coefficients is introduced into the cost function.
Acknowledgment
This work was supported by two grants from the National Natural Science Foundation of China (Nos. 61373098 and 61370109), a grant from the Natural Science Foundation of Anhui Province (No. 1308085MF85), and the Innovative Team Foundation of Suzhou Vocational University (No. 3100125).
References (51)

- et al., Sparse coding of sensory inputs, Curr. Opin. Neurobiol. (2004)
- et al., Sparse coding with an overcomplete basis set: a strategy employed by V1?, Vis. Res. (1997)
- et al., A multi-layer sparse coding network learns contour coding from natural images, Vis. Res. (2002)
- Modeling receptive fields with non-negative sparse coding, Neurocomputing (2003)
- et al., Sparse coding and decorrelation in primary visual cortex during natural vision, Science (2000)
- et al., Noise removal using a novel non-negative sparse coding shrinkage technique, Neurocomputing (2006)
- An improved image coding scheme for non-negative sparse coding, Comput. Eng. Sci. (2010)
- et al., Image feature extracting and denoising by sparse coding, Pattern Anal. Appl. (1999)
- et al., Emergence of simple-cell receptive field properties by learning a sparse code for natural images, Nature (1996)
- Systematic Theory of Neural Networks for Pattern Recognition (1996)
- et al., Efficient sparse coding algorithms, Adv. Neural Inf. Process. Syst. (2006)
- Learning overcomplete representations, Neural Comput.
- Face recognition using localized features based on non-negative sparse coding, Mach. Vis. Appl.
- Timecourse of neural signatures of object recognition, J. Vis.
- Robust face recognition via sparse representation, IEEE Trans. Pattern Anal. Mach. Intell.
- Sparse coding shrinkage: denoising of non-Gaussian data by maximum likelihood estimation, Neural Comput.
- Image denoising by sparse code shrinkage
- Two-stage image denoising based on sparse representations, J. Electron. Inf. Technol.
- Learning multiscale sparse representations for image and video restoration, SIAM Multiscale Model. Simul.
- Image denoising via sparse and redundant representations over learned dictionaries, IEEE Trans. Image Process.
- Image super-resolution via sparse representation, IEEE Trans. Image Process.
- Combining reconstruction and discrimination with class-specific sparse coding, Neural Comput.
- Non-negative matrix factorization with sparseness constraints, J. Mach. Learn. Res.
Li Shang received the B.Sc. and M.Sc. degrees from Xi׳an Mine University in June 1996 and June 1999, respectively. In June 2006, she received the Doctor׳s degree in Pattern Recognition & Intelligent Systems from the University of Science & Technology of China (USTC), Hefei, China. From July 1999 to July 2006, she worked at USTC, devoting herself to teaching. Now, she works at the Department of Communication Technology, Electronic Information Engineering College, Suzhou Vocational University. At present, her research interests include image processing, artificial neural networks and intelligent computing.
Xin Wang received the B.Sc. degree from Zhejiang University, China in July 2011. From July 2011 to July 2013, he worked as a research assistant at the College of Computer Science and Technology, Zhejiang University. He is currently pursuing the Ph.D. degree at the School of Computing Science of Simon Fraser University in Canada, in the area of social networks and recommender systems. At present, his research interests include machine learning, pattern recognition, recommender systems and social networks.
Zhan-Li Sun received the Ph.D. degree from the University of Science and Technology of China in 2005. Since 2006, he has worked with The Hong Kong Polytechnic University, Nanyang Technological University, and the National University of Singapore. He is currently a Professor with the School of Electrical Engineering and Automation, Anhui University, China. His research interests include machine learning, pattern recognition, and image and signal processing.
- ☆
A preliminary version of this manuscript was selected as one of the best papers at the International Conference on Intelligent Computing (ICIC 2014), 2014 (Paper ID: 135) and recommended for publication in the Neurocomputing journal.