Abstract
The dictionary employed in sparse representation (or sparse coding) based image reconstruction and classification plays an important role, and learning dictionaries from training data has led to state-of-the-art results in image classification tasks. However, many dictionary learning models exploit the discriminative information in only the representation coefficients or only the representation residual, which limits their performance. In this paper we present a novel dictionary learning method based on the Fisher discrimination criterion. A structured dictionary, whose atoms correspond to the subject class labels, is learned so that not only can the representation residual be used to distinguish different classes, but the representation coefficients also have small within-class scatter and large between-class scatter. A classification scheme associated with the proposed Fisher discrimination dictionary learning (FDDL) model is then presented, which exploits the discriminative information in both the representation residual and the representation coefficients. The proposed FDDL model is extensively evaluated on various image datasets and shows superior performance to many state-of-the-art dictionary learning methods in a variety of classification tasks.
Notes
Illuminations {0,1,3,4,6,7,8,11,13,14,16,17,18,19}.
Illuminations {0,2,4,6,8,10,12,14,16,18}.
References
Aharon, M., Elad, M., & Bruckstein, A. (2006). K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation. IEEE Transactions on Signal Processing, 54(11), 4311–4322.
Beck, A., & Teboulle, M. (2009). A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM Journal on Imaging Sciences, 2(1), 183–202.
Bengio, S., Pereira, F., Singer, Y., & Strelow, D. (2009). Group sparse coding. In Proceedings of the Neural Information Processing Systems
Bobin, J., Starck, J., Fadili, J., Moudden, Y., & Donoho, D. (2007). Morphological component analysis: An adaptive thresholding strategy. IEEE Transactions on Image Processing, 16(11), 2675–2681.
Boyd, S., & Vandenberghe, L. (2004). Convex optimization. Cambridge: Cambridge University Press.
Bryt, O., & Elad, M. (2008). Compression of facial images using the K-SVD algorithm. Journal of Visual Communication and Image Representation, 19(4), 270–282.
Candes, E. (2006). Compressive sampling. International Congress of Mathematicians, 3, 1433–1452.
Castrodad, A., & Sapiro, G. (2012). Sparse modeling of human actions from motion imagery. International Journal of Computer Vision, 100, 1–15.
Cooley, J. W., & Tukey, J. W. (1965). An algorithm for the machine calculation of complex Fourier series. Mathematics of Computation, 19, 297–301.
Deng, W. H., Hu, J. N., & Guo, J. (2012). Extended SRC: Undersampled face recognition via intraclass variation dictionary. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(9), 1864–1870.
Duda, R., Hart, P., & Stork, D. (2000). Pattern classification (2nd ed.). New York: Wiley-Interscience.
Elad, M., & Aharon, M. (2006). Image denoising via sparse and redundant representations over learned dictionaries. IEEE Transactions on Image Processing, 15(12), 3736–3745.
Engan, K., Aase, S. O., & Husoy, J. H. (1999). Method of optimal directions for frame design. In: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing
Fernando, B., Fromont, E., & Tuytelaars, T. (2012). Effective use of frequent itemset mining for image classification. In: Proceedings of the European Conference on Computer Vision
Gehler, P., & Nowozin, S. (2009). On feature combination for multiclass object classification. In: Proceedings of the International Conference on Computer Vision
Georghiades, A., Belhumeur, P., & Kriegman, D. (2001). From few to many: Illumination cone models for face recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(6), 643–660.
Gross, R., Matthews, I., Cohn, J., Kanade, T., & Baker, S. (2010). Multi-PIE. Image and Vision Computing, 28, 807–813.
Guha, T., & Ward, R. K. (2012). Learning sparse representations for human action recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(8), 1576–1588.
Guo, Y., Li, S., Yang, J., Shu, T., & Wu, L. (2003). A generalized Foley–Sammon transform based on generalized Fisher discrimination criterion and its application to face recognition. Pattern Recognition Letters, 24(1), 147–158.
Hoyer, P. O. (2002). Non-negative sparse coding. In: Proceedings of the IEEE Workshop Neural Networks for Signal Processing
Huang, K., & Aviyente, S. (2006). Sparse representation for signal classification. In: Proceedings of the Neural Information Processing Systems
Hull, J. J. (1994). A database for handwritten text recognition research. IEEE Transactions on Pattern Analysis and Machine Intelligence, 16(5), 550–554.
Jenatton, R., Mairal, J., Obozinski, G., & Bach, F. (2011). Proximal methods for hierarchical sparse coding. Journal of Machine Learning Research, 12, 2297–2334.
Jia, Y. Q., Nie, F. P., & Zhang, C. S. (2009). Trace ratio problem revisited. IEEE Transactions on Neural Networks, 20(4), 729–735.
Jiang, Z. L., Lin, Z., & Davis, L. S. (2013). Label consistent K-SVD: Learning a discriminative dictionary for recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(11), 2651–2664.
Jiang, Z. L., Zhang, G. X., & Davis, L. S. (2012). Submodular dictionary learning for sparse coding. In: Proceedings of the IEEE Conference Computer Vision and Pattern Recognition
Kim, S. J., Koh, K., Lustig, M., Boyd, S., & Gorinevsky, D. (2007). An interior-point method for large-scale \(l_{1}\)-regularized least squares. IEEE Journal on Selected Topics in Signal Processing, 1, 606–617.
Kong, S., & Wang, D. H. (2012). A dictionary learning approach for classification: Separating the particularity and the commonality. In: Proceedings of the European Conference on Computer Vision.
Li, H., Jiang, T., & Zhang, K. (2006). Efficient and robust feature extraction by maximum margin criterion. IEEE Transactions on Neural Networks, 17(1), 157–165.
Lian, X. C., Li, Z. W., Lu, B. L., & Zhang, L. (2010). Max-margin dictionary learning for multi-class image categorization. In: Proceedings of the European Conference on Computer Vision
Mairal, J., Bach, F., & Ponce, J. (2012). Task-driven dictionary learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(4), 791–804.
Mairal, J., Bach, F., Ponce, J., Sapiro, G., & Zisserman, A. (2008b). Learning discriminative dictionaries for local image analysis. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
Mairal, J., Bach, F., Ponce, J., Sapiro, G., & Zisserman, A. (2009). Supervised dictionary learning. In: Proceedings of the Neural Information Processing Systems
Mairal, J., Elad, M., & Sapiro, G. (2008a). Sparse representation for color image restoration. IEEE Transactions on Image Processing, 17(1), 53–69.
Mairal, J., Leordeanu, M., Bach, F., Hebert, M., & Ponce, J. (2008c). Discriminative sparse image models for class-specific edge detection and image interpretation. In: Proceedings of the European Conference on Computer Vision
Mallat, S. (1999). A wavelet tour of signal processing (2nd ed.). San Diego: Academic Press.
Martinez, A., & Benavente, R. (1998). The AR face database. CVC Technical Report #24.
Nesterov, Y., & Nemirovskii, A. (1994). Interior-point polynomial algorithms in convex programming. Philadelphia: SIAM.
Nilsback, M., & Zisserman, A. (2006). A visual vocabulary for flower classification. In: Proceedings of the IEEE Conference Computer Vision and Pattern Recognition
Okatani, T., & Deguchi, K. (2007). On the Wiberg algorithm for matrix factorization in the presence of missing components. International Journal of Computer Vision, 72(3), 329–337.
Oliva, A., & Torralba, A. (2001). Modeling the shape of the scene: A holistic representation of the spatial envelope. International Journal of Computer Vision, 42, 145–174.
Olshausen, B. A., & Field, D. J. (1996). Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 381, 607–609.
Olshausen, B. A., & Field, D. J. (1997). Sparse coding with an overcomplete basis set: A strategy employed by V1? Vision Research, 37(23), 3311–3325.
Pham, D., & Venkatesh, S. (2008). Joint learning and dictionary construction for pattern recognition. In: Proceedings of the IEEE Conference Computer Vision and Pattern Recognition
Phillips, P. J., Flynn, P. J., Scruggs, W. T., Bowyer, K. W., Chang, J., Hoffman, K., et al. (2005). Overview of the face recognition grand challenge. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
Qiu, Q., Jiang, Z. L., & Chellappa, R. (2011). Sparse dictionary-based representation and recognition of action attributes. In: Proceedings of the International Conference on Computer Vision
Ramirez, I., Sprechmann, P., & Sapiro, G. (2010). Classification and clustering via dictionary learning with structured incoherence and shared features. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
Rodriguez, F., & Sapiro, G. (2007). Sparse representation for image classification: Learning discriminative and reconstructive non-parametric dictionaries. IMA Preprint 2213.
Rodriguez, M., Ahmed, J., & Shah, M. (2008). A spatio-temporal maximum average correlation height filter for action recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
Rosasco, L., Verri, A., Santoro, M., Mosci, S., & Villa, S. (2009). Iterative Projection Methods for Structured Sparsity Regularization. MIT Technical Reports, MIT-CSAIL-TR-2009-050, CBCL-282.
Rubinstein, R., Bruckstein, A. M., & Elad, M. (2010). Dictionaries for sparse representation modeling. Proceedings of the IEEE, 98(6), 1045–1057.
Sadanand, S., & Corso, J. J. (2012). Action bank: A high-level representation of activity in video. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
Shen, L., Wang, S. H., Sun, G., Jiang, S. Q., & Huang, Q. M. (2013). Multi-level discriminative dictionary learning towards hierarchical visual categorization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
Song, F. X., Zhang, D., Mei, D. Y., & Guo, Z. W. (2007). A multiple maximum scatter difference discriminant criterion for facial feature extraction. IEEE Transactions on Systems, Man, and Cybernetics Part B, 37(6), 1599–1606.
Sprechmann, P., & Sapiro, G. (2010). Dictionary learning and sparse coding for unsupervised clustering. In: Proceedings of the International Conference on Acoustics Speech and Signal Processing
Szabo, Z., Poczos, B., & Lorincz, A. (2011). Online group-structured dictionary learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
Tropp, J. A., & Wright, S. J. (2010). Computational methods for sparse solution of linear inverse problems. Proceedings of the IEEE, 98(6), 948–958.
Turk, M., & Pentland, A. (1991). Eigenfaces for recognition. Journal of Cognitive Neuroscience, 3(1), 71–86.
Viola, P., & Jones, M. J. (2004). Robust real-time face detection. International Journal of Computer Vision, 57, 137–154.
Wagner, A., Wright, J., Ganesh, A., Zhou, Z. H., Mobahi, H., & Ma, Y. (2012). Toward a practical face recognition system: Robust alignment and illumination by sparse representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(2), 373–386.
Wang, H., Ullah, M., Klaser, A., Laptev, I., & Schmid, C. (2009). Evaluation of local spatio-temporal features for action recognition. In: Proceedings of the British Machine Vision Conference.
Wang, H., Yan, S.C., Xu, D., Tang, X.O., & Huang, T. (2007). Trace ratio versus ratio trace for dimensionality reduction. In: Proceedings of the IEEE Conference Computer Vision and Pattern Recognition.
Wang, H. R., Yuan, C. F., Hu, W. M., & Sun, C. Y. (2012). Supervised class-specific dictionary learning for sparse modeling in action recognition. Pattern Recognition, 45(11), 3902–3911.
Wright, J., Yang, A. Y., Ganesh, A., Sastry, S. S., & Ma, Y. (2009b). Robust face recognition via sparse representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(2), 210–227.
Wright, S. J., Nowak, R. D., & Figueiredo, M. A. T. (2009a). Sparse reconstruction by separable approximation. IEEE Transactions on Signal Processing, 57(7), 2479–2493.
Wu, Y. N., Si, Z. Z., Gong, H. F., & Zhu, S. C. (2010). Learning active basis model for object detection and recognition. International Journal of Computer Vision, 90, 198–235.
Xie, N., Ling, H., Hu, W., & Zhang, X. (2010). Use bin-ratio information for category and scene classification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
Yang, A.Y., Ganesh, A., Zhou, Z. H., Sastry, S. S., & Ma, Y. (2010a). A review of fast \(l_{1}\)-minimization algorithms for robust face recognition. arXiv:1007.3753v2.
Yang, J. C., Wright, J., Ma, Y., & Huang, T. (2008). Image super-resolution as sparse representation of raw image patches. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
Yang, J. C., Yu, K., Gong, Y., & Huang, T. (2009). Linear spatial pyramid matching using sparse coding for image classification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
Yang, J. C., Yu, K., & Huang, T. (2010b). Supervised translation-invariant sparse coding. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
Yang, M., & Zhang, L. (2010). Gabor feature based sparse representation for face recognition with Gabor occlusion dictionary. In: Proceedings of the European Conference on Computer Vision
Yang, M., Zhang, L., Feng, X. C., & Zhang, D. (2011b). Fisher discrimination dictionary learning for sparse representation. In: Proceedings of the International Conference on Computer Vision
Yang, M., Zhang, L., Yang, J., & Zhang, D. (2010c). Metaface learning for sparse representation based face recognition. In: Proceedings of the IEEE Conference on Image Processing
Yang, M., Zhang, L., Yang, J., & Zhang, D. (2011a). Robust sparse coding for face recognition. In: Proceedings of the IEEE Conference Computer Vision and Pattern Recognition
Yang, M., Zhang, L., & Zhang, D. (2012). Efficient misalignment robust representation for real-time face recognition. In: Proceedings of the European Conference on Computer Vision
Yao, A., Gall, J., & Gool, L. V. (2010). A hough transform-based voting framework for action recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
Ye, G. N., Liu, D., Jhuo, I.-H., & Chang, S.-F. (2012). Robust late fusion with rank minimization. In: Proceedings IEEE Conference on Computer Vision and Pattern Recognition
Yu, K., Xu, W., & Gong, Y. (2009). Deep learning with kernel regularization for visual recognition. In: Advances in Neural Information Processing Systems 21.
Yuan, X. T., & Yan, S. C. (2010). Visual classification with multitask joint sparse representation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
Zhang, L., Yang, M., & Feng, X. C. (2011). Sparse representation or collaborative representation: which helps face recognition?. In: Proceedings of the International Conference on Computer Vision
Zhang, Q., & Li, B. X. (2010). Discriminative K-SVD for dictionary learning in face recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
Zhang, Z. D., Ganesh, A., Liang, X., & Ma, Y. (2012). TILT: Transformation invariant low-rank textures. International Journal of Computer Vision, 99, 1–24.
Zhou, M. Y., Chen, H. J., Paisley, J., Ren, L., Li, L. B., Xing, Z. M., et al. (2012). Nonparametric Bayesian dictionary learning for analysis of noisy and incomplete images. IEEE Transactions on Image Processing, 21(1), 130–144.
Zhou, N., & Fan, J. P. (2012). Learning inter-related visual dictionary for object recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society B, 67(2), 301–320.
Communicated by K. Ikeuchi.
Appendices
Appendix 1: \({{\varvec{tr}}}({{\varvec{S}}}_{B}({{\varvec{X}}}))\) when \(\varvec{X}_{i}^j =0,\;j\ne i\)
Denote by \({{\varvec{m}}}_{i}^{i}, {{\varvec{m}}}_{i}\) and \({{\varvec{m}}}\) the mean vectors of \(\varvec{X}_{i}^i , {{\varvec{X}}}_{i}\) and \({{\varvec{X}}}\), respectively. Because \(\varvec{X}_{i}^j =0\) for \(j\ne i\), we can rewrite \(\varvec{m}_{i} =\left[ {\mathbf{0};\ldots ;\varvec{m}_{i}^i ;\ldots ;\mathbf{0}} \right] \) and \(\varvec{m}={\left[ {n_1 \varvec{m}_1^1 ;\ldots ;n_{i} \varvec{m}_{i}^i ;\ldots ;n_K \varvec{m}_K^K } \right] }\Big /n\). Therefore, the between-class scatter, i.e. \(tr\left( {\varvec{S}_B \left( \varvec{X} \right) } \right) =\sum _{i=1}^K {n_{i} \left\| {\varvec{m}_{i} -\varvec{m}} \right\| _2^2 } ,\) becomes
$$tr\left( \varvec{S}_B \left( \varvec{X} \right) \right) =\sum \nolimits _{i=1}^K n_{i} \left( \left( 1-n_{i}/n\right) ^{2}\left\| \varvec{m}_{i}^i \right\| _2^2 +\sum \nolimits _{j\ne i} \left( n_j /n\right) ^{2}\left\| \varvec{m}_{j}^j \right\| _2^2 \right) .$$
Denote \(\kappa _{i}=1-n_{i}/n\). After some derivation, the trace of \({{\varvec{S}}}_{B}({{\varvec{X}}})\) becomes
$$tr\left( \varvec{S}_B \left( \varvec{X} \right) \right) =\sum \nolimits _{i=1}^K \kappa _{i} n_{i} \left\| \varvec{m}_{i}^i \right\| _2^2 .$$
Because \(\varvec{m}_{i}^i \) is the mean representation vector of the samples from class \(i\), it will generally have non-negligible entries, so the trace of the between-class scatter will in general carry large energy.
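As an aside, this closed form is easy to verify numerically. The following NumPy sketch (not part of the original derivation; the class sizes, block dimensions, and variable names are illustrative) builds a coefficient matrix with the block support \(\varvec{X}_{i}^j =0\) for \(j\ne i\) and compares the directly computed \(tr(\varvec{S}_B(\varvec{X}))\) against \(\sum \nolimits _i \kappa _i n_i \left\| \varvec{m}_i^i \right\| _2^2\):

```python
import numpy as np

rng = np.random.default_rng(0)
sizes = [4, 6, 10]          # n_i: number of samples per class
dims  = [3, 5, 2]           # rows of each coefficient block X_i^i
n, K  = sum(sizes), len(sizes)

# Build X with block support X_i^j = 0 for j != i.
X = np.zeros((sum(dims), n))
row_ofs = np.cumsum([0] + dims)
col = 0
for i, (ni, di) in enumerate(zip(sizes, dims)):
    X[row_ofs[i]:row_ofs[i] + di, col:col + ni] = rng.normal(size=(di, ni))
    col += ni

# Direct between-class scatter trace: sum_i n_i ||m_i - m||^2.
m = X.mean(axis=1)
tr_SB, means_block, col = 0.0, [], 0
for i, ni in enumerate(sizes):
    mi = X[:, col:col + ni].mean(axis=1)
    tr_SB += ni * np.sum((mi - m) ** 2)
    means_block.append(mi[row_ofs[i]:row_ofs[i] + dims[i]])  # m_i^i
    col += ni

# Closed form from the appendix: sum_i kappa_i n_i ||m_i^i||^2.
closed = sum((1 - ni / n) * ni * np.sum(mb ** 2)
             for ni, mb in zip(sizes, means_block))
assert np.isclose(tr_SB, closed)
```

The agreement holds exactly (up to floating point) for any block sizes, since collecting the cross terms in the expansion yields the coefficient \(\kappa _i n_i\) for each \(\left\| \varvec{m}_i^i \right\| _2^2\).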
Appendix 2: Derivation of the Simplified FDDL Model
Denote by \(\varvec{m}_{i}^i \) and \({{\varvec{m}}}_{i}\) the mean vectors of \(\varvec{X}_{i}^i \) and \({{\varvec{X}}}_{i}\), respectively. Because \(\varvec{X}_{i}^j =0\) for \(j\ne i\), we can rewrite \(\varvec{m}_{i} =\left[ {\mathbf{0};\ldots ;\varvec{m}_{i}^i ;\ldots ;\mathbf{0}} \right] \). So the within-class scatter becomes
$$\varvec{S}_W \left( \varvec{X} \right) =\sum \nolimits _{i=1}^K \left( \varvec{X}_{i} -\varvec{M}_{i} \right) \left( \varvec{X}_{i} -\varvec{M}_{i} \right) ^{T},$$
where \(\varvec{M}_{i}\) is the matrix whose \(n_{i}\) columns all equal \(\varvec{m}_{i}\), so that only the \(i\)th diagonal block of each summand is nonzero.
The trace of the within-class scatter is
$$tr\left( \varvec{S}_W \left( \varvec{X} \right) \right) =\sum \nolimits _{i=1}^K \left\| \varvec{X}_{i}^i -\varvec{M}_{i}^i \right\| _F^2 ,$$
where \(\varvec{M}_{i}^i =\left[ {\varvec{m}_{i}^i ,\ldots ,\varvec{m}_{i}^i } \right] \).
Based on Appendix 1, the trace of the between-class scatter is \(tr\left( {\varvec{S}_B \left( \varvec{X} \right) } \right) =\sum \nolimits _{i=1}^K {\kappa _{i} n_{i} \left\| {\varvec{m}_{i}^i } \right\| _2^2 } \), where \(\kappa _{i}=1-n_{i}/n\). Therefore the discriminative coefficient term, i.e. \(f\left( \varvec{X} \right) =tr\left( \varvec{S}_W \left( \varvec{X} \right) \right) -tr\left( \varvec{S}_B \left( \varvec{X} \right) \right) +\eta \left\| \varvec{X} \right\| _F^2 \), can be simplified to
$$f\left( \varvec{X} \right) =\sum \nolimits _{i=1}^K \left( \left\| \varvec{X}_{i}^i -\varvec{M}_{i}^i \right\| _F^2 -\kappa _{i} n_{i} \left\| \varvec{m}_{i}^i \right\| _2^2 +\eta \left\| \varvec{X}_{i}^i \right\| _F^2 \right) .$$
Denote by \(\varvec{E}_{i}^j =\left[ 1 \right] _{n_{i} \times n_j } \) the matrix of size \(n_{i}\times n_{j}\) with all entries being 1; then \(\varvec{M}_{i}^i =\left[ {\varvec{m}_{i}^i } \right] _{1\times n_{i} } ={\varvec{X}_{i}^i \varvec{E}_{i}^i }/{n_{i} }\). Because \({{\varvec{I}}}-({\varvec{E}_{i}^i }/{n_{i} })\left( {{\varvec{E}_{i}^i }/{n_{i} }} \right) ^{T}= ({{\varvec{I}}}-{\varvec{E}_{i}^i }/{n_{i} })({{\varvec{I}}}-{\varvec{E}_{i}^i }/{n_{i} })^{T}\), we have
$$\left\| \varvec{X}_{i}^i -\varvec{M}_{i}^i \right\| _F^2 =tr\left( \varvec{X}_{i}^i \left( \varvec{I}-{\varvec{E}_{i}^i }/{n_{i} }\right) \left( \varvec{X}_{i}^i \right) ^{T}\right) \quad \hbox {and}\quad n_{i} \left\| \varvec{m}_{i}^i \right\| _2^2 =tr\left( \varvec{X}_{i}^i \left( {\varvec{E}_{i}^i }/{n_{i} }\right) \left( \varvec{X}_{i}^i \right) ^{T}\right) .$$
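The idempotence identity \(\varvec{I}-(\varvec{E}_{i}^i /n_{i})(\varvec{E}_{i}^i /n_{i})^{T}=(\varvec{I}-\varvec{E}_{i}^i /n_{i})(\varvec{I}-\varvec{E}_{i}^i /n_{i})^{T}\) and the resulting centering trace form can both be checked numerically. The NumPy sketch below (illustrative sizes and names, not from the paper) does so:

```python
import numpy as np

rng = np.random.default_rng(1)
ni, di = 7, 4                    # illustrative: n_i samples, d_i atoms
E = np.ones((ni, ni))            # E_i^i: all-ones matrix
A = E / ni                       # the averaging matrix E_i^i / n_i
I = np.eye(ni)

# Idempotence identity used in the appendix (both sides equal I - E/n_i,
# since A is symmetric with A @ A = A).
assert np.allclose(I - A @ A.T, (I - A) @ (I - A).T)
assert np.allclose(I - A @ A.T, I - A)

# Consequence: with M_i^i = X_i^i E_i^i / n_i (each column the mean m_i^i),
#   ||X_i^i - M_i^i||_F^2 = tr( X_i^i (I - E/n_i) (X_i^i)^T ).
Xii = rng.normal(size=(di, ni))
Mii = Xii @ A
lhs = np.linalg.norm(Xii - Mii, 'fro') ** 2
rhs = np.trace(Xii @ (I - A) @ Xii.T)
assert np.isclose(lhs, rhs)
```

Because \(\varvec{E}_{i}^i /n_{i}\) is a symmetric idempotent (projection) matrix, both identities hold for any \(n_{i}\).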
Then the discriminative coefficient term can be written as
$$f\left( \varvec{X} \right) =\sum \nolimits _{i=1}^K \left( \left( 1+\kappa _{i} \right) \left\| \varvec{X}_{i}^i -\varvec{M}_{i}^i \right\| _F^2 +\left( \eta -\kappa _{i} \right) \left\| \varvec{X}_{i}^i \right\| _F^2 \right) .$$
With the constraint that \(\varvec{X}_{i}^j =0\) for \(j\ne i\) in Eq. (10), we have
With Eqs. (21) and (22), the simplified FDDL model [i.e., Eq. (10)] can be written as
where \({\lambda }'_1 ={\lambda _1 }/2,{\lambda }'_2 ={\lambda _2 \left( {1+\kappa _{i} } \right) }/2\), and \({\lambda }'_3 ={\lambda _2 \left( {\eta -\kappa _{i} } \right) }/2\).
Appendix 3: The convexity of \({{\varvec{f}}}_{i}({{\varvec{X}}})\)
Let \(\varvec{E}_{i}^j =\left[ 1 \right] _{n_{i} \times n_j } \) be a matrix of size \(n_{i} \times n_{j}\) with all entries being 1, and let \(\varvec{N}_{i} =\varvec{I}_{n_{i} \times n_{i} } -{\varvec{E}_{i}^i }/{n_{i} },\,\varvec{P}_{i} ={\varvec{E}_{i}^i }/{n_{i} }-{\varvec{E}_{i}^i }/n,\,\varvec{C}_{i}^j ={\varvec{E}_{i}^j }/n\), where \(\varvec{I}_{n_{i} \times n_{i} } \) is an identity matrix of size \(n_{i}\times n_{i}\).
From \(f_{i} \left( {\varvec{X}_{i} } \right) =\left\| {\varvec{X}_{i} -\varvec{M}_{i} } \right\| _F^2 -\sum \nolimits _{k=1}^K {\left\| {\varvec{M}_k -\varvec{M}} \right\| _F^2 } +\eta \left\| {\varvec{X}_{i} } \right\| _F^2 \), we can derive that
where \(\varvec{G}=\sum \nolimits _{k=1,k\ne i}^K {\varvec{X}_k \varvec{C}_k^i } ,\,\varvec{Z}_k ={\varvec{X}_k \varvec{E}_k^k }/{n_k }-\sum \nolimits _{j=1,j\ne i}^K \varvec{X}_j \varvec{C}_j^k\).
Rewrite \({{\varvec{X}}}_{i}\) as a column vector, \(\varvec{\chi }_{i} =\left[ {\varvec{r}_{i,1} ,\varvec{r}_{i,2} ,\ldots ,\varvec{r}_{i,d} } \right] ^{T}\), where \({{\varvec{r}}}_{i,j}\) is the \(j\)th row vector of \({{\varvec{X}}}_{i}\) and \(d\) is the total number of rows of \({{\varvec{X}}}_{i}\). Then \(f_{i}({{\varvec{X}}}_{i})\) equals
where \(\hbox {diag}({{\varvec{T}}})\) denotes the block diagonal matrix whose diagonal blocks all equal \({{\varvec{T}}}\), and \(\hbox {vec}({{\varvec{T}}})\) denotes the column vector obtained by concatenating all the column vectors of \({{\varvec{T}}}\).
The convexity of \(f_{i}(\varvec{\chi }_{i})\) depends on whether its Hessian matrix \(\nabla ^{2}f_{i}(\varvec{\chi }_{i})\) is positive definite (Boyd and Vandenberghe 2004). We can write the Hessian matrix of \(f_{i}(\varvec{\chi }_{i})\) as
\(\nabla ^{2}f_{i}(\varvec{\chi }_{i})\) will be positive definite if the following matrix \({{\varvec{S}}}\) is positive definite:
After some derivations, we have
In order to make \({{\varvec{S}}}\) positive definite, each eigenvalue of \({{\varvec{S}}}\) should be greater than 0. Because the maximal eigenvalue of \({{\varvec{E}}}_{i}^{i}\) is \(n_{i}\), we should ensure
Since \(n=n_{1}+n_{2}+\cdots +n_{K}\), this condition reduces to \(\eta >\kappa _{i}\), where \(\kappa _{i}=1-n_{i}/n\), which guarantees that \(f_{i}({{\varvec{X}}}_{i})\) is convex with respect to \({{\varvec{X}}}_{i}\).
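As a sanity check on this convexity condition, the following NumPy sketch (the sizes, seed, and helper \(f_{i}\) are illustrative, with the other classes' coefficients held fixed) tests midpoint convexity of \(f_{i}\) for \(\eta \) slightly above \(\kappa _{i}\), and exhibits a violation below \(\kappa _{i}\) along the constant-column direction associated with the maximal eigenvalue \(n_{i}\) of \({{\varvec{E}}}_{i}^{i}\):

```python
import numpy as np

rng = np.random.default_rng(2)
sizes = [5, 8, 7]                 # n_k samples per class
d, K = 6, len(sizes)
n, i = sum(sizes), 0              # vary class i = 0, others fixed
kappa_i = 1 - sizes[i] / n

X_fixed = [rng.normal(size=(d, nk)) for nk in sizes]

def f_i(Xi, eta):
    """f_i(X_i) = ||X_i - M_i||_F^2 - sum_k ||M_k - M||_F^2 + eta ||X_i||_F^2."""
    Xs = [Xi if k == i else X_fixed[k] for k in range(K)]
    m = np.hstack(Xs).mean(axis=1)                      # global mean
    val = np.linalg.norm(Xi - Xi.mean(axis=1, keepdims=True), 'fro') ** 2
    for k, Xk in enumerate(Xs):
        mk = Xk.mean(axis=1)
        val -= sizes[k] * np.sum((mk - m) ** 2)         # ||M_k - M||_F^2
    return val + eta * np.linalg.norm(Xi, 'fro') ** 2

# Midpoint convexity holds for eta slightly above kappa_i ...
eta = kappa_i + 0.1
for _ in range(200):
    A = rng.normal(size=(d, sizes[i]))
    B = rng.normal(size=(d, sizes[i]))
    assert f_i((A + B) / 2, eta) <= (f_i(A, eta) + f_i(B, eta)) / 2 + 1e-9

# ... and fails below kappa_i along the constant-column direction
# (all columns equal, i.e. the eigendirection of E_i^i with eigenvalue n_i).
V = np.outer(rng.normal(size=d), np.ones(sizes[i]))
eta = kappa_i - 0.1
assert f_i(np.zeros((d, sizes[i])), eta) > (f_i(V, eta) + f_i(-V, eta)) / 2
```

Along the constant-column direction the within-class term vanishes, so the quadratic part of \(f_{i}\) reduces to \(n_{i}\left( \eta -\kappa _{i}\right) \left\| \varvec{v} \right\| _2^2\), which is exactly where the bound \(\eta >\kappa _{i}\) becomes tight.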
Yang, M., Zhang, L., Feng, X. et al. Sparse Representation Based Fisher Discrimination Dictionary Learning for Image Classification. Int J Comput Vis 109, 209–232 (2014). https://doi.org/10.1007/s11263-014-0722-8