Elsevier

Neurocomputing

Volume 382, 21 March 2020, Pages 162-173

Local manifold sparse model for image classification

https://doi.org/10.1016/j.neucom.2019.11.084

Abstract

How to discriminate the types of images is very important for image understanding, and a large number of classifiers have been designed for automatic image classification. In recent years, sparse representation has been widely used in the field of image classification. However, most sparse classification methods are based on sparse reconstruction, which ignores the intrinsic structure of data. Therefore, in this paper, we propose a local manifold sparse classifier (LMSC) based on the sparse manifold assumption and the local structure of data. The proposed method uses the sparse manifold assumption and the local neighbors of data to construct a sparse representation model, from which the sparse coefficients are obtained. Finally, we calculate the probability of an unknown image belonging to each class and assign the label of the class with the maximum probability to this image. LMSC can reveal the intrinsic structure of data and improve the accuracy of image classification. Experiments on two handwritten digit image data sets (MNIST and Semeion) and a hyperspectral remote sensing image data set (Pavia University) show that the proposed method achieves better representation and classification accuracy than some state-of-the-art classification methods.

Introduction

In recent years, with the rapid development of artificial intelligence, images have become one of the most common carriers of information in daily life [1], [2]. With different imaging devices, various kinds of images can be captured to carry different information, such as face images [3], CT images [4], scene images [5], and hyperspectral images [6], [7]. In computer vision, images have been applied in many fields, such as face recognition [8], CT reconstruction [9], scene recognition [10], and land cover classification [11]. In these applications, it is very important to effectively discriminate the categories of images from the image information [12].

Image classification is an effective technique to automatically discriminate the classes of images with some known information [13], [14]. Classification aims to construct a model from some prior data; this model is then used to predict the categories of unknown images. Traditional machine learning classification algorithms include logistic regression (LR) [15], the nearest neighbor (NN) classifier [16], and the Bayesian classifier [17]. Logistic regression is a statistical model that treats classification as the prediction of a binary dependent variable; the LR model uses a sigmoid function to represent the probability of an event [18], and the predicted result is obtained from the sigmoid value of a linear combination of predictor variables. The NN classifier determines the type of an unknown sample through its nearest neighbor among the known data [19]; the predicted label is simply the label of that nearest neighbor. The Bayesian classifier is based on the principle of maximum probability: for a sample to be classified, the posterior probability of the sample in each category is computed from the prior probabilities via the Bayes criterion, and the sample is assigned to the category with the maximum posterior probability [20]. Although these traditional classifiers are very convenient to use, their results are easily affected by the distribution of samples, and they usually have low classification accuracy.
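As a concrete illustration of the simplest of these rules, a nearest-neighbor classifier can be written in a few lines. This is a generic sketch (function name and toy data are ours, not from any implementation cited here):

```python
import numpy as np

def nn_classify(X_train, y_train, x_test):
    """Assign x_test the label of its nearest training sample (Euclidean)."""
    dists = np.linalg.norm(X_train - x_test, axis=1)
    return y_train[np.argmin(dists)]

# toy data: two well-separated 2-D classes
X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])
y = np.array([0, 0, 1, 1])
print(nn_classify(X, y, np.array([4.8, 5.1])))  # nearest sample belongs to class 1
```

As the paragraph above notes, the prediction depends entirely on the local distribution of the training samples, which is why such classifiers are sensitive to how the data are spread.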

To solve these problems, sparse representation, inspired by the field of signal processing, has been introduced for image classification [21], [22]. Sparse representation represents a signal with a few atoms from an over-complete dictionary [23], [24], and it has been applied to image analysis tasks such as image classification [25], image reconstruction [26], image denoising [27], image compression [28], [29], and image super-resolution [30]. For image classification, sparse representation classification (SRC) [31] and orthogonal matching pursuit (OMP) [32] were proposed to encode a test sample as a sparse linear combination of all training samples; the minimum representation error over each class then determines the class of the test sample. For face recognition, SRC not only greatly improves classification accuracy but also possesses strong robustness to occlusion. However, it does not consider the prior information of the data, which limits its application. Therefore, a series of extensions of SRC have been proposed to further improve classification effectiveness. Deng et al. [33] proposed an extended SRC (ESRC) with an intra-class variant dictionary for face recognition, which obtains good classification accuracy even with few, or single, training samples per person. ESRC represents possible changes between the training and test images by constructing an auxiliary intra-class variation dictionary composed of intra-class variation bases. One year later, Deng et al. [34] proposed the superposed SRC (SSRC) method. SSRC is a simple variant of SRC that derives from a "prototype plus variation" representation model for sparsity-based face recognition; its dictionary consists of the class centroids and the sample-to-centroid differences.
SSRC can perform well with dictionaries collected under uncontrolled training conditions. Lai et al. [35] proposed a class-wise sparse representation (CSR) method for face recognition, which seeks an optimal representation of a test sample by minimizing the class-wise sparsity of the training samples. Huang et al. [36] proposed a class-specific sparse representation (CSSR) classifier that uses class information in representation learning: CSSR divides the samples into groups by class label and uses the groups to represent test samples for classification. In addition, some researchers hold that sparse representation methods improve classification accuracy because the ℓ1 norm guarantees sparseness. However, Zhang et al. [37] argue instead that it is the collaborative representation, which plays an important role in the classification result, that is neglected in SRC. They therefore proposed a collaborative representation classification (CRC) algorithm based on regularized least squares [38] and demonstrated that the effectiveness of sparse representation classification is not due to the ℓ1 norm. Although these sparse-representation classifiers mostly improve classification accuracy and reduce the dependence on the completeness of training samples, they do not consider the intrinsic structure of the samples, which reflects an essential property of the data.
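The two decision rules discussed above can be contrasted in a short sketch. This is a minimal illustration, not the authors' implementation: `omp` is a textbook greedy pursuit standing in for an ℓ1 solver, `crc_label` uses the ridge-style closed form that regularized least squares admits, and both classifiers share the class-wise residual rule. Names and toy data are our own.

```python
import numpy as np

def omp(D, y, k):
    """Greedy orthogonal matching pursuit: select at most k atoms of D
    to represent y (keep k small; this sketch does not deduplicate atoms)."""
    residual, idx = y.astype(float), []
    for _ in range(k):
        idx.append(int(np.argmax(np.abs(D.T @ residual))))
        coef, *_ = np.linalg.lstsq(D[:, idx], y, rcond=None)
        residual = y - D[:, idx] @ coef
    x = np.zeros(D.shape[1])
    x[idx] = coef
    return x

def classwise_residual_label(D, labels, x, y):
    """Shared decision rule: the class whose atoms best reconstruct y wins."""
    errs = {c: np.linalg.norm(y - D[:, labels == c] @ x[labels == c])
            for c in np.unique(labels)}
    return min(errs, key=errs.get)

def src_label(D, labels, y, k=1):
    """SRC-style rule with a greedy pursuit in place of an l1 solver."""
    return classwise_residual_label(D, labels, omp(D, y, k), y)

def crc_label(D, labels, y, lam=0.01):
    """CRC: ridge-regularized least squares over ALL training samples."""
    x = np.linalg.solve(D.T @ D + lam * np.eye(D.shape[1]), D.T @ y)
    return classwise_residual_label(D, labels, x, y)

# columns are training samples; y coincides with the third atom (class 1)
D = np.array([[1.0, 0.9, 0.0, 0.0],
              [0.0, 0.1, 1.0, 0.9],
              [0.0, 0.0, 0.0, 0.1]])
labels = np.array([0, 0, 1, 1])
y = np.array([0.0, 1.0, 0.0])
print(src_label(D, labels, y), crc_label(D, labels, y))  # 1 1
```

The structural difference is visible in the coefficient vectors: the pursuit returns a strictly sparse code, while the CRC solution is dense but concentrates its energy on the correct class, which is the point of Zhang et al.'s argument.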

To reveal the intrinsic structure of samples, local information has been widely used in previous works [39]. Based on the idea of local information, many manifold methods have been developed to reveal the intrinsic structure of data, such as isometric mapping (ISOMAP) [40], locally linear embedding (LLE) [41], and Laplacian eigenmaps (LE) [42]. These methods utilize local information to discover the low-dimensional manifold structure of high-dimensional data, which reflects the intrinsic properties of the data and improves classification results [43]. In addition, Wang et al. [44] proposed a locality-constrained linear coding (LLC) method to represent the local structure of data based on collaborative representation. LLC holds that locality is more essential than sparsity: locality must lead to sparsity, but not necessarily vice versa. However, LLC uses only local information and neglects the homogeneity of the data. To discover homogeneous data, Elhamifar et al. [45] proposed the sparse manifold clustering and embedding (SMCE) method, which assumes that each data point possesses a small neighborhood in which only the points from the same manifold approximately lie in a low-dimensional affine subspace [46]. SMCE uses the non-zero elements of the sparse representation solution to identify the data points on the same manifold and to achieve clustering and embedding. However, SMCE selects same-manifold points from the global data set, which is very time-consuming and is affected by data noise.
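The locality idea underlying these methods is easy to state concretely. The following sketch (function name is ours) selects the k nearest training points and attaches relative distance weights, the kind of proximity penalty that locality-constrained models such as LLC and SMCE build into their objectives:

```python
import numpy as np

def local_neighbors(X, x, k):
    """Indices of the k training samples nearest to x, plus relative
    distance weights (farther neighbors receive larger penalties)."""
    d = np.linalg.norm(X - x, axis=1)
    idx = np.argsort(d)[:k]
    return idx, d[idx] / d[idx].sum()

# a far-away point never enters the neighborhood
X = np.array([[0.0, 0.0], [0.2, 0.1], [9.0, 9.0]])
idx, w = local_neighbors(X, np.array([0.1, 0.1]), k=2)
```

Restricting subsequent computation to `idx` is what avoids the global search that makes SMCE time-consuming, at the cost of trusting the Euclidean neighborhood to contain same-manifold points.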

Therefore, in this paper we introduce local information into SMCE and propose a local manifold sparse classifier (LMSC). First, we construct a local constraint matrix from the neighbors of an unknown sample among the prior samples. Then, we use the local constraint matrix to construct a sparse representation model on the basis of SMCE. Finally, we calculate the similarity probability in each class from the sparse coefficients and assign the unknown sample to the class with the maximum similarity probability. LMSC combines the local manifold structure and sparsity properties to discriminate the classes of unknown samples. Experiments on two handwritten digit image data sets (Semeion and MNIST) and a hyperspectral remote sensing image data set (Pavia University) demonstrate that the proposed method achieves better representation and classification accuracy than some state-of-the-art methods.
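The three steps above can be sketched end-to-end. This is only an illustrative outline of the pipeline, not the authors' model: the exact objective is defined later in the paper, and here a distance-weighted ridge problem stands in for the local manifold sparse representation model. All names are ours.

```python
import numpy as np

def lmsc_sketch(X, labels, x, k=10, lam=0.1):
    """Illustrative LMSC pipeline: (1) local constraint from the k nearest
    prior samples, (2) coefficients of the locally constrained model,
    (3) class 'probabilities' from the coefficient magnitudes."""
    # 1) local constraint: keep only the k nearest training samples
    d = np.linalg.norm(X - x, axis=1)
    idx = np.argsort(d)[:k]
    D = X[idx].T                                # local dictionary (columns)
    penalty = np.diag(d[idx] / d[idx].sum())    # far neighbors cost more
    # 2) coefficients of the locally constrained representation (ridge stand-in)
    c = np.linalg.solve(D.T @ D + lam * penalty, D.T @ x)
    # 3) normalized per-class coefficient mass; the maximum decides the label
    p = {cl: np.abs(c[labels[idx] == cl]).sum() for cl in np.unique(labels[idx])}
    total = sum(p.values())
    return max(p, key=p.get), {cl: v / total for cl, v in p.items()}
```

Because the dictionary is restricted to the neighborhood, samples from distant (and likely heterogeneous) regions cannot contribute to the representation, which is the intuition behind combining locality with the sparse manifold assumption.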

The remainder of this paper is organized as follows. The second section introduces related work. The third section presents the proposed classification method. The fourth section reports experiments on three data sets and analyzes the results. Finally, we summarize our work and outline directions for future research.

Section snippets

Sparse representation

Sparse representation is an extension of traditional signal processing methods such as the Fourier and wavelet transforms. It represents a signal with an over-complete dictionary, in which the number of atoms is larger than the signal dimensionality. Most of the representation coefficients are zero and only a few are nonzero; the atoms corresponding to the nonzero coefficients reveal the intrinsic properties of the signal [47], [48]. Sparse representation has been successfully applied in

Local manifold sparse classifier

In this section, we propose a new classification method, named local manifold sparse classifier (LMSC).

Experiments and discussion

In this section, we selected two handwritten digit image collections (the Semeion and MNIST data sets) and a hyperspectral remote sensing image (the Pavia University data set) to demonstrate the effectiveness of the proposed method, comparing it with some state-of-the-art methods: NN, the least squares regression algorithm (LSR) [55], CRC, LLC, OMP, SRC, SSRC, CSR, and CSSR. We repeated the experiments 5 times under each condition and report the average overall classification accuracy (OA)
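The evaluation metric is straightforward to state in code; the sketch below (our own toy labels, not experimental data) shows OA for one run and the averaging over repeated runs used in the protocol above:

```python
import numpy as np

def overall_accuracy(y_true, y_pred):
    """Overall accuracy (OA): fraction of correctly labelled test samples."""
    return np.mean(np.asarray(y_true) == np.asarray(y_pred))

# average OA over repeated random train/test splits, as in the experiments
runs = [overall_accuracy([0, 1, 1, 0], [0, 1, 0, 0]),   # 3/4 correct
        overall_accuracy([0, 1, 1, 0], [0, 1, 1, 0])]   # 4/4 correct
print(np.mean(runs))  # 0.875
```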

Conclusion

In this paper, we propose a local manifold sparse classification (LMSC) method. The proposed method constructs a local manifold sparse representation model based on the sparse manifold assumption and the local information of data. With this model, we can obtain the sparse representation coefficients of an unknown sample. Using the coefficients, we can calculate the probability distribution of this unknown sample over the classes, and the class label is obtained from the maximum probability.

Declaration of Competing Interest

We introduce the local information of data into a sparse manifold assumption model and construct a classification method based on the sparse coefficients from this model.

Acknowledgments

The authors would like to thank the anonymous reviewers for their comments on this paper. This work is supported in part by the National Natural Science Foundation of China under Grant 61801336, by the National Key Technology Research and Development Program under Grant 2018YFA0605500, by the Open Research Fund of State Key Laboratory of Integrated Services Networks under Grant ISN20-12, by the Open Research Fund of Key Laboratory of Digital Earth Science, Institute of Remote Sensing and

Fulin Luo (S’16-M’18) received the Ph.D. and M.S. degrees in Instrument Science and Technology from Chongqing University, Chongqing, China, in 2016 and 2013, respectively. He is currently an Associate Researcher with the State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing (LIESMARS), Wuhan University. His research interests are hyperspectral image classification, image processing, sparse representation, and manifold learning in general.

References (62)

  • Q. Gao et al., Discriminative sparsity preserving projections for image recognition, Pattern Recognition (2015)
  • T. Guo et al., Data induced masking representation learning for face data analysis, Knowl. Based Syst. (2019)
  • S. Zhan et al., Unsupervised feature extraction by low-rank and sparsity preserving embedding, Neural Netw. (2019)
  • Z. Wang, B. Du, W. Tu, L. Zhang, D. Tao, Incorporating distribution matching into uncertainty for multiple kernel...
  • J. Shi et al., Hallucinating face image by regularization models in high-resolution feature space, IEEE Trans. Image Process. (2018)
  • J. Shao et al., Tracking objects from satellite videos: a velocity feature based correlation filter, IEEE Trans. Geosci. Remote Sens. (2019)
  • F. Luo et al., Local geometric structure feature for dimensionality reduction of hyperspectral imagery, Remote Sensing (2017)
  • L. Tong et al., An improved multiobjective discrete particle swarm optimization for hyperspectral endmember extraction, IEEE Trans. Geosci. Remote Sens. (2019)
  • W. Wu et al., Non-local low-rank cube-based tensor factorization for spectral CT reconstruction, IEEE Trans. Med. Imaging (2019)
  • B. Du et al., Robust and discriminative labeling for multi-label active learning based on maximum correntropy criterion, IEEE Trans. Image Process. (2017)
  • F. Luo et al., Semisupervised sparse manifold discriminative analysis for feature extraction of hyperspectral images, IEEE Trans. Geosci. Remote Sens. (2016)
  • F. Luo et al., Sparse-adaptive hypergraph discriminant analysis for hyperspectral image classification, IEEE Geosci. Remote Sens. Lett. (2019)
  • B. Du et al., Robust graph-based semisupervised learning for noisy labeled data via maximum correntropy criterion, IEEE Trans. Cybern. (2019)
  • K. Song et al., Rank-k 2-D multinomial logistic regression for matrix data classification, IEEE Trans. Neural Netw. Learn. Syst. (2018)
  • F. Liu et al., Indefinite kernel logistic regression with concave-inexact-convex procedure, IEEE Trans. Neural Netw. Learn. Syst. (2019)
  • X. Huang et al., Kernel k-nearest neighbor classifier based on decision tree ensemble for SAR modeling analysis, Anal. Methods (2014)
  • D. Qian et al., Drowsiness detection by Bayesian-copula discriminant classifier based on EEG signals during daytime short nap, IEEE Trans. Biomed. Eng. (2016)
  • B. Du et al., Beyond the sparsity-based target detector: a hybrid sparsity and statistics-based detector for hyperspectral images, IEEE Trans. Image Process. (2016)
  • Z. Zhang et al., A survey of sparse representation: algorithms and applications, IEEE Access (2017)
  • Z. Zhang, J. Ren, W. Jiang, Z. Zhang, R. Hong, S. Yan, M. Wang, Joint subspace recovery and enhanced locality driven...
  • Z. Zhao et al., Discriminative sparse flexible manifold embedding with novel graph for robust visual representation and label propagation, Pattern Recogn. (2017)

    Yajuan Huang received the Bachelor's degree in Engineering from Anhui University of Finance and Economics in 2017. She is currently a master's student with the School of Computer Science, Wuhan University. Her research interests are sparse representation and manifold learning.

    Weiping Tu received the Ph.D. and M.S. degrees in Communication and Information System from Wuhan University, Wuhan, China, in 2011 and 2002, respectively. She is currently an Associate Professor with the National Engineering Research Center for Multimedia Software, Wuhan University. Her research interests are audio signal classification, speech separation, and image processing.

    Jiamin Liu received the M.S. and Ph.D. degrees in Instrument Science and Technology from Chongqing University, China, in 1998 and 2001, respectively. He is currently an Associate Professor at Chongqing University. His research interests are biometrics, image processing, and pattern recognition in general.
