Subspace Learning by \(\ell ^{0}\)-Induced Sparsity

Abstract

Subspace clustering methods partition data that lie in or close to a union of subspaces in accordance with the subspace structure. Methods with a sparsity prior, such as sparse subspace clustering (SSC) (Elhamifar and Vidal in IEEE Trans Pattern Anal Mach Intell 35(11):2765–2781, 2013), whose sparsity is induced by the \(\ell ^{1}\)-norm, have proven effective in subspace clustering. Most of these methods require certain assumptions on the subspaces, e.g. independence or disjointness. However, these assumptions are not guaranteed to hold in practice, and they limit the application of existing sparse subspace clustering methods. In this paper, we propose \(\ell ^{0}\)-induced sparse subspace clustering (\(\ell ^{0}\)-SSC). In contrast to most existing sparse subspace clustering methods, which require assumptions such as independence or disjointness on the subspaces, we prove that \(\ell ^{0}\)-SSC guarantees the subspace-sparse representation, a key element in subspace clustering, for arbitrary distinct underlying subspaces almost surely under a mild i.i.d. assumption on the data generation. We also present a "no free lunch" theorem showing that obtaining the subspace-sparse representation under our general assumptions cannot be computationally much cheaper than solving the corresponding \(\ell ^{0}\) sparse representation problem of \(\ell ^{0}\)-SSC. A novel approximate algorithm named Approximate \(\ell ^{0}\)-SSC (A\(\ell ^{0}\)-SSC) is developed, which employs proximal gradient descent to obtain a sub-optimal solution to the optimization problem of \(\ell ^{0}\)-SSC with a theoretical guarantee. The sub-optimal solution is used to build a sparse similarity matrix upon which spectral clustering is performed to obtain the final clustering results. Extensive experimental results on various data sets demonstrate the superiority of A\(\ell ^{0}\)-SSC over competing clustering methods. Furthermore, we extend \(\ell ^{0}\)-SSC to semi-supervised learning by performing label propagation on the sparse similarity matrix learned by A\(\ell ^{0}\)-SSC, and demonstrate the effectiveness of the resulting semi-supervised method, termed \(\ell ^{0}\)-sparse subspace label propagation (\(\ell ^{0}\)-SSLP).
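
The pipeline the abstract describes can be summarized in a short sketch: proximal gradient descent with hard thresholding (the proximal map of the \(\ell ^{0}\) penalty) yields a sub-optimal sparse self-representation, from which a symmetric similarity matrix is built for spectral clustering (\(\ell ^{0}\)-SSC) or label propagation (\(\ell ^{0}\)-SSLP). The code below is a minimal illustration under the standard formulation \(\min _{C} \Vert X - XC\Vert _{F}^{2} + \lambda \Vert C\Vert _{0}\) with \(\mathrm {diag}(C) = 0\), not the authors' reference implementation; the function names and parameter defaults are hypothetical, and the propagation step follows the Zhou et al. (2003) scheme cited in the references, which may differ in detail from the paper's rule.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def approx_l0_ssc(X, lam=0.1, n_iter=200):
    """Proximal gradient descent for min_C ||X - X C||_F^2 + lam * ||C||_0
    with diag(C) = 0. X is d x n, one data point per column. The proximal
    map of the l0 penalty is hard thresholding; zeroing the diagonal each
    iteration is a simple way to enforce the no-self-representation rule."""
    n = X.shape[1]
    step = 1.0 / (2.0 * np.linalg.norm(X, 2) ** 2)  # 1 / Lipschitz constant
    thresh = np.sqrt(2.0 * lam * step)              # hard-threshold level
    G = X.T @ X                                     # Gram matrix, reused
    C = np.zeros((n, n))
    for _ in range(n_iter):
        C = C - step * 2.0 * (G @ C - G)            # gradient step on the quadratic
        C[np.abs(C) < thresh] = 0.0                 # prox of lam * ||.||_0
        np.fill_diagonal(C, 0.0)                    # forbid self-representation
    return C

def l0_ssc_cluster(X, n_clusters, lam=0.1):
    """Spectral clustering on the sparse similarity matrix built from C."""
    C = approx_l0_ssc(X, lam=lam)
    W = 0.5 * (np.abs(C) + np.abs(C).T)             # symmetrize
    return SpectralClustering(n_clusters=n_clusters,
                              affinity='precomputed').fit_predict(W)

def l0_sslp(W, Y, alpha=0.99):
    """Label propagation on W in the style of Zhou et al. (2003):
    F = (I - alpha * S)^{-1} Y with S = D^{-1/2} W D^{-1/2}.
    Y is n x k one-hot labels, with zero rows for unlabeled points."""
    deg = W.sum(axis=1)
    deg[deg == 0] = 1e-12                           # guard isolated nodes
    S = W / np.sqrt(np.outer(deg, deg))
    F = np.linalg.solve(np.eye(len(W)) - alpha * S, Y)
    return F.argmax(axis=1)
```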

Notes

  1. Here, a continuous distribution means that the data distribution is non-degenerate, in the sense that the probability measure of any hyperplane of dimension less than that of the subspace is 0.

  2. Even if one sticks to the very original formulation without noise tolerance, (4) is still equivalent to (7) for some Lagrange multiplier \(\lambda \); a sketch of the presumed forms follows below.
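
For reference, here is a minimal sketch of what these two formulations presumably look like. The exact equations (4) and (7) appear only in the full text; the forms below are the standard per-point \(\ell ^{0}\)-SSC formulations and are an assumption here.

```latex
% Presumed constrained, noiseless formulation (cf. (4)):
\min_{c_i} \|c_i\|_0 \quad \text{s.t.} \quad x_i = X c_i, \; c_{ii} = 0.
% Presumed Lagrangian formulation with noise tolerance (cf. (7)):
\min_{c_i} \|x_i - X c_i\|_2^2 + \lambda \|c_i\|_0 \quad \text{s.t.} \quad c_{ii} = 0.
```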

References

  • Asuncion, A., & Newman, D. J. (2007). UCI machine learning repository. http://www.ics.uci.edu/~mlearn/MLRepository.html.

  • Bhatia, K., Jain, P., & Kar, P. (2015). Robust regression via hard thresholding. In Advances in neural information processing systems 28: Annual conference on neural information processing systems 2015 (pp 721–729). Montreal, December 7–12, 2015.

  • Bolte, J., Sabach, S., & Teboulle, M. (2014). Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Mathematical Programming, 146(1–2), 459–494.

  • Candès, E. J., & Tao, T. (2005). Decoding by linear programming. IEEE Transactions on Information Theory, 51(12), 4203–4215.

  • Candès, E. J. (2008). The restricted isometry property and its implications for compressed sensing. Comptes Rendus Mathematique, 346(9–10), 589–592.

  • Cheng, H., Liu, Z., Yang, L., & Chen, X. (2013). Sparse representation and learning in visual recognition: Theory and applications. Signal Processing, 93(6), 1408–1425.

  • Cheng, B., Yang, J., Yan, S., Fu, Y., & Huang, T. S. (2010). Learning with l1-graph for image analysis. IEEE Transactions on Image Processing, 19(4), 858–866.

  • Dyer, E. L., Sankaranarayanan, A. C., & Baraniuk, R. G. (2013). Greedy feature selection for subspace clustering. Journal of Machine Learning Research, 14, 2487–2517.

  • Elhamifar, E., & Vidal, R. (2011). Sparse manifold clustering and embedding. In NIPS (pp. 55–63).

  • Elhamifar, E., & Vidal, R. (2013). Sparse subspace clustering: Algorithm, theory, and applications. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(11), 2765–2781.

  • Hyder, M., & Mahata, K. (2009). An approximate \(\ell ^{0}\) norm minimization algorithm for compressed sensing. In IEEE international conference on acoustics, speech and signal processing (ICASSP 2009) (pp. 3365–3368).

  • Jenatton, R., Mairal, J., Bach, F. R., & Obozinski, G. R. (2010). Proximal methods for sparse hierarchical dictionary learning. In Proceedings of the 27th international conference on machine learning (ICML-10) (pp. 487–494).

  • Karasuyama, M., & Mamitsuka, H. (2013). Manifold-based similarity adaptation for label propagation. In Advances in neural information processing systems 26: 27th annual conference on neural information processing systems 2013 (pp. 1547–1555). Proceedings of a meeting held December 5–8, 2013, Lake Tahoe.

  • Li, J., Kong, Y., & Fu, Y. (2017). Sparse subspace clustering by learning approximation \(\ell ^{0}\) codes. In Proceedings of the 31st AAAI conference on artificial intelligence (pp. 2189–2195). San Francisco, February 4–9, 2017.

  • Liu, G., Lin, Z., & Yu, Y. (2010). Robust subspace segmentation by low-rank representation. In Proceedings of the 27th international conference on machine learning (ICML-10) (pp. 663–670). Haifa, June 21–24, 2010.

  • Liu, G., Lin, Z., Yan, S., Sun, J., Yu, Y., & Ma, Y. (2013). Robust recovery of subspace structures by low-rank representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(1), 171–184.

  • Mairal, J., Bach, F. R., Ponce, J., Sapiro, G., & Zisserman, A. (2008). Supervised dictionary learning. In Advances in neural information processing systems 21, Proceedings of the 22nd annual conference on neural information processing systems (pp. 1033–1040). Vancouver, December 8–11, 2008.

  • Mairal, J., Bach, F., Ponce, J., & Sapiro, G. (2010). Online learning for matrix factorization and sparse coding. Journal of Machine Learning Research, 11, 19–60.

  • Mancera, L., & Portilla, J. (2006). L0-norm-based sparse representation through alternate projections. In 2006 IEEE international conference on image processing (pp. 2089–2092).

  • Ng, A. Y., Jordan, M. I., & Weiss, Y. (2001). On spectral clustering: Analysis and an algorithm. In NIPS (pp. 849–856).

  • Park, D., Caramanis, C., & Sanghavi, S. (2014). Greedy subspace clustering. In Advances in neural information processing systems 27: Annual conference on neural information processing systems 2014 (pp. 2753–2761). Montreal, December 8–13, 2014.

  • Peng, X., Yi, Z., & Tang, H. (2015). Robust subspace clustering via thresholding ridge regression. In AAAI conference on artificial intelligence (AAAI) (pp. 3827–3833).

  • Plummer, M. D., & Lovász, L. (1986). Matching theory. Amsterdam: North-Holland Mathematics Studies, Elsevier.

  • Soltanolkotabi, M., & Candès, E. J. (2012). A geometric analysis of subspace clustering with outliers. The Annals of Statistics, 40(4), 2195–2238.

  • Soltanolkotabi, M., Elhamifar, E., & Candès, E. J. (2014). Robust subspace clustering. The Annals of Statistics, 42(2), 669–699.

  • Tropp, J. A. (2004). Greed is good: Algorithmic results for sparse approximation. IEEE Transactions on Information Theory, 50(10), 2231–2242.

  • Vidal, R. (2011). Subspace clustering. IEEE Signal Processing Magazine, 28(2), 52–68.

  • Wang, Y., & Xu, H. (2013). Noisy sparse subspace clustering. In Proceedings of the 30th international conference on machine learning, ICML 2013 (pp. 89–97). Atlanta, 16–21 June 2013.

  • Wang, Z., Gu, Q., Ning, Y., & Liu, H. (2015). High dimensional EM algorithm: Statistical optimization and asymptotic normality. In Advances in neural information processing systems 28: Annual conference on neural information processing systems 2015 (pp. 2521–2529). Montreal, December 7–12, 2015.

  • Wang, Y., Wang, Y. X., & Singh, A. (2016). Graph connectivity in noisy sparse subspace clustering. CoRR abs/1504.01046.

  • Wang, Y. X., Xu, H., & Leng, C. (2013). Provable subspace clustering: When LRR meets SSC. In C. Burges, L. Bottou, M. Welling, Z. Ghahramani, & K. Weinberger (Eds.), Advances in Neural Information Processing Systems 26 (pp. 64–72). New York: Curran Associates, Inc.

  • Yan, S., & Wang, H. (2009). Semi-supervised learning by sparse representation. In SDM (pp. 792–801).

  • Yang, Y., Feng, J., Jojic, N., Yang, J., & Huang, T. S. (2016). \(\ell ^{0}\)-sparse subspace clustering. In Computer vision – ECCV 2016, 14th European conference, Amsterdam, October 11–14, 2016, Proceedings, Part II (pp. 731–747). https://doi.org/10.1007/978-3-319-46475-6_45.

  • Yang, Y., Wang, Z., Yang, J., Han, J., & Huang, T. (2014a). Regularized l1-graph for data clustering. In Proceedings of the British machine vision conference. BMVA Press.

  • Yang, Y., Wang, Z., Yang, J., Wang, J., Chang, S., & Huang, T. S. (2014b). Data clustering by Laplacian regularized l1-graph. In Proceedings of the 28th AAAI conference on artificial intelligence (pp. 3148–3149). Québec City, July 27–31, 2014.

  • Yang, J., Yu, K., Gong, Y., & Huang, T. S. (2009). Linear spatial pyramid matching using sparse coding for image classification. In CVPR (pp. 1794–1801).

  • You, C., Robinson, D., & Vidal, R. (2016). Scalable sparse subspace clustering by orthogonal matching pursuit. In IEEE Conference on computer vision and pattern recognition (CVPR) (pp. 3918–3927). Las Vegas, NV, June 27–30, 2016. https://doi.org/10.1109/CVPR.2016.425.

  • Zhang, T., Ghanem, B., Liu, S., Xu, C., & Ahuja, N. (2013). Low-rank sparse coding for image classification. In IEEE international conference on computer vision, ICCV 2013 (pp. 281–288). Sydney, December 1–8, 2013.

  • Zhang, C. H., & Zhang, T. (2012). A general theory of concave regularization for high-dimensional sparse estimation problems. Statistical Science, 27(4), 576–593.

  • Zheng, X., Cai, D., He, X., Ma, W. Y., & Lin, X. (2004). Locality preserving clustering for image database. In Proceedings of the 12th annual ACM international conference on multimedia, MULTIMEDIA'04 (pp. 885–891). New York: ACM.

  • Zheng, M., Bu, J., Chen, C., Wang, C., Zhang, L., Qiu, G., et al. (2011). Graph regularized sparse coding for image representation. IEEE Transactions on Image Processing, 20(5), 1327–1336.

  • Zhou, D., Bousquet, O., Lal, T. N., Weston, J., & Schölkopf, B. (2003). Learning with local and global consistency. In Advances in neural information processing systems 16 (NIPS 2003). Vancouver, December 8–13, 2003.

  • Zhu, X., Ghahramani, Z., & Lafferty, J. D. (2003). Semi-supervised learning using Gaussian fields and harmonic functions. In Proceedings of the 20th international conference on machine learning (ICML 2003) (pp. 912–919). Washington, August 21–24, 2003.

Author information

Correspondence to Yingzhen Yang.

Additional information

Communicated by Edwin Hancock, Richard Wilson, Will Smith, Adrian Bors, Nick Pears.

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. The work of Yingzhen Yang was supported in part by an IBM gift grant to Beckman Institute, UIUC. The work of Jiashi Feng was partially supported by National University of Singapore startup Grant R-263-000-C08-133 and Ministry of Education of Singapore AcRF Tier One Grant R-263-000-C21-112.

Cite this article

Yang, Y., Feng, J., Jojic, N. et al. Subspace Learning by \(\ell ^{0}\)-Induced Sparsity. Int J Comput Vis 126, 1138–1156 (2018). https://doi.org/10.1007/s11263-018-1092-4
