Hessian sparse coding
Introduction
Recently, sparse coding has received considerable attention in machine learning, signal processing, neuroscience and statistics. Given an input data matrix, sparse coding aims to learn a set of basis vectors (i.e., a dictionary) that captures high-level semantics, together with the sparse coordinates of the data with respect to that dictionary.
Sparse coding has several advantages for data representation. First, it yields sparse representations in which each data point is expressed as a linear combination of a small number of basis vectors, making the results easier to interpret. Second, sparse representations naturally lend themselves to an indexing scheme that allows quick retrieval. Finally, there is considerable evidence that biological vision uses sparse representations in early visual areas [1], [2].
Current research on sparse coding mainly focuses on four aspects: pursuit methods for solving the optimization problem, such as basis pursuit [3] and the feature-sign search/Lagrange dual methods [4]; design of the dictionary, such as supervised dictionary learning [5] and group sparse coding [6]; development of theoretical frameworks, such as combining Linear Discriminant Analysis with sparse representation for classification [7], local coordinate coding [8], and locality-constrained linear coding [9]; and applications of sparse representation, such as image processing [10], [11], image classification [12], [13] and image annotation [14].
One of the major drawbacks of existing sparse coding algorithms is that they fail to consider the geometrical structure of the data space, which has proven to be very useful for classification [15] and clustering. Recent studies have shown that human-generated image data is probably sampled from a submanifold of the ambient Euclidean space [16], [17], [18]. In fact, image data cannot possibly fill up the high-dimensional Euclidean space uniformly. Therefore, the intrinsic manifold structure needs to be considered while learning the sparse coordinates.
Recently, Zheng et al. [19] proposed an algorithm named Graph regularized Sparse Coding (GraphSC), which introduces Laplacian regularization into sparse coding to address this problem. However, GraphSC suffers from the fact that the sparse coordinates are biased toward a constant, and Laplacian embedding often fails to preserve local topology as well as expected [20].
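To make the comparison concrete, the GraphSC objective can be written as follows (a reconstruction in our own notation rather than a quotation of [19]; $\alpha$ and $\lambda$ denote the graph regularization and sparsity weights):

```latex
\min_{\mathbf{B},\mathbf{S}} \; \|\mathbf{X}-\mathbf{B}\mathbf{S}\|_F^2
  + \alpha\,\mathrm{Tr}\!\left(\mathbf{S}\mathbf{L}\mathbf{S}^{\top}\right)
  + \lambda \sum_{i} \|\mathbf{s}_i\|_1,
```

where $\mathbf{L}=\mathbf{D}-\mathbf{W}$ is the Laplacian of a nearest-neighbor graph with weight matrix $\mathbf{W}$ and degree matrix $\mathbf{D}$. Since $\mathbf{L}\mathbf{1}=\mathbf{0}$, constant code directions incur zero penalty, which is precisely the constant bias noted above.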
In this paper, we propose a novel sparse coding algorithm called Hessian Sparse Coding (HessianSC) to overcome this drawback. HessianSC is based on the second-order Hessian energy, which favors functions whose values vary linearly with respect to geodesic distance. We use the Hessian energy as a smoothing operator to preserve the local manifold structure. Specifically, the Hessian energy is incorporated into the sparse coding objective function, so that the obtained representations vary smoothly along the geodesics of the data manifold. By preserving locality much better, HessianSC has more discriminating power than traditional sparse coding algorithms and GraphSC.
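Under the same notation, the resulting objective can be sketched by swapping the graph Laplacian for a Hessian energy matrix (again a hedged reconstruction, not the paper's exact formulation):

```latex
\min_{\mathbf{B},\mathbf{S}} \; \|\mathbf{X}-\mathbf{B}\mathbf{S}\|_F^2
  + \alpha\,\mathrm{Tr}\!\left(\mathbf{S}\mathbf{H}\mathbf{S}^{\top}\right)
  + \lambda \sum_{i} \|\mathbf{s}_i\|_1,
```

where $\mathbf{H}$ accumulates local second-order fits computed in estimated tangent spaces. Unlike the Laplacian, the null space of $\mathbf{H}$ contains functions that are linear along geodesics, so such functions are not penalized and the constant bias is avoided.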
The rest of this paper is organized as follows. In Section 2, we provide a brief description of the sparse coding problem and the GraphSC algorithm. We then introduce the HessianSC algorithm, as well as the optimization scheme, in Section 3. The experimental results on two image data sets are presented in Section 4. Finally, we conclude the paper in Section 5.
Related works
In this section, we provide a brief description of the sparse coding problem and the recently proposed graph regularized sparse coding algorithm which explicitly takes into account the local manifold structure of the data while learning the sparse representations.
Hessian sparse coding
In this section, we present our Hessian Sparse Coding algorithm which introduces the Hessian energy as a regularizer to take into account the local manifold structure of the data space.
Experiments
In this section, we compare our HessianSC algorithm with GraphSC, sparse coding (SC) and some state-of-the-art clustering algorithms on the image clustering task. Three data sets are used in the experiment to show the effectiveness of our proposed algorithm.
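The evaluation protocol can be sketched as follows: cluster the learned sparse codes with k-means and score the result against ground-truth labels with normalized mutual information (NMI), a standard clustering metric. This is an illustrative pipeline on synthetic codes, not the paper's data or implementation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "sparse codes": two well-separated groups of 10-dim code vectors,
# standing in for the coordinates learned by SC/GraphSC/HessianSC.
S = np.vstack([rng.normal(0, 0.3, (30, 10)), rng.normal(3, 0.3, (30, 10))])
labels = np.repeat([0, 1], 30)

def kmeans(X, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns a cluster index per row of X."""
    r = np.random.default_rng(seed)
    C = X[r.choice(len(X), k, replace=False)]
    for _ in range(iters):
        d = ((X[:, None, :] - C[None]) ** 2).sum(-1)   # squared distances
        a = d.argmin(1)
        C = np.array([X[a == j].mean(0) if (a == j).any() else C[j]
                      for j in range(k)])
    return a

def nmi(a, b):
    """Normalized mutual information between two labelings."""
    eps = 1e-12
    ca, cb = np.unique(a), np.unique(b)
    P = np.array([[np.mean((a == i) & (b == j)) for j in cb] for i in ca])
    pa, pb = P.sum(1), P.sum(0)
    mi = (P * np.log((P + eps) / (np.outer(pa, pb) + eps))).sum()
    ha = -(pa * np.log(pa + eps)).sum()
    hb = -(pb * np.log(pb + eps)).sum()
    return mi / np.sqrt(ha * hb)

pred = kmeans(S, 2)
score = nmi(pred, labels)
```

With clearly separated clusters as above, the score approaches 1; on real image codes the absolute values are lower and only the relative ranking of methods matters.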
- The first data set is the Yale database, which contains 165 gray-scale images of 15 individuals. All images demonstrate variations in lighting condition (left-light, center-light, right-light), facial expression (normal, happy, sad, sleepy,
Conclusions
In this paper, we have presented a novel sparse coding method, called Hessian Sparse Coding (HessianSC), that takes the local manifold structure into account. Compared with the Laplacian regularizer, the major advantage of the Hessian energy is that it prefers functions that vary linearly with respect to geodesics on the data manifold. As a result, HessianSC has more discriminating power than traditional sparse coding and Laplacian sparse coding. Experimental results on image clustering show that
Acknowledgement
This work was supported in part by the National Science Foundation of China under grants 61173185 and 61170142, and by the National High Technology Research and Development Program of China (863 Program) under grant 2013AA040601.
Miao Zheng received the BS degree in Computer Science from Zhejiang University, China, in 2008. He is currently a candidate for a PhD degree in Computer Science at Zhejiang University. His research interests include machine learning, information retrieval, and data mining.
References (35)
- et al., Sparse coding with an overcomplete basis set: a strategy employed by V1?, Vision Research, 1997.
- et al., Emergence of simple-cell receptive field properties by learning a sparse code for natural images, Nature, 1996.
- et al., Atomic decomposition by basis pursuit, SIAM Journal on Scientific Computing, 1999.
- H. Lee, A. Battle, R. Raina, A.Y. Ng, Efficient sparse coding algorithms, in: Advances in Neural Information Processing...
- J. Mairal, F. Bach, J. Ponce, G. Sapiro, A. Zisserman, Supervised dictionary learning, in: Advances in Neural...
- S. Bengio, F. Pereira, Y. Singer, D. Strelow, Group sparse coding, in: Advances in Neural Information Processing...
- K. Huang, S. Aviyente, Sparse representation for signal classification, in: Advances in Neural Information Processing...
- K. Yu, T. Zhang, Y. Gong, Nonlinear learning using local coordinate coding, in: Advances in Neural Information...
- J. Wang, J. Yang, K. Yu, F. Lv, T. Huang, Y. Gong, Locality-constrained linear coding for image classification, in:...
- et al., Stable recovery of sparse overcomplete representations in the presence of noise, IEEE Transactions on Information Theory, 2006.
- Sparse representation for color image restoration, IEEE Transactions on Image Processing.
- Manifold regularization: a geometric framework for learning from labeled and unlabeled examples, The Journal of Machine Learning Research.
- Nonlinear dimensionality reduction by locally linear embedding, Science.
Jiajun Bu received the BS and PhD degrees in Computer Science from Zhejiang University, China, in 1995 and 2000, respectively. He is a Professor in College of Computer Science, Zhejiang University. His research interests include embedded system, data mining, information retrieval and mobile database.
Chun Chen received the BS degree in Mathematics from Xiamen University, China, in 1981, and his MS and PhD degrees in Computer Science from Zhejiang University, China, in 1984 and 1990, respectively. He is a Professor in College of Computer Science, Zhejiang University. His research interests include information retrieval, data mining, computer vision, computer graphics and embedded technology.