Abstract
Subspace clustering methods partition data that lie in or close to a union of subspaces in accordance with the subspace structure. Methods with a sparsity prior, such as sparse subspace clustering (SSC) (Elhamifar and Vidal in IEEE Trans Pattern Anal Mach Intell 35(11):2765–2781, 2013), in which sparsity is induced by the \(\ell ^{1}\)-norm, have been demonstrated to be effective for subspace clustering. Most of these methods require certain assumptions on the subspaces, e.g., independence or disjointness. However, such assumptions are not guaranteed to hold in practice, and they limit the applicability of existing sparse subspace clustering methods. In this paper, we propose \(\ell ^{0}\)-induced sparse subspace clustering (\(\ell ^{0}\)-SSC). In contrast to the independence or disjointness assumptions required by most existing sparse subspace clustering methods, we prove that \(\ell ^{0}\)-SSC guarantees the subspace-sparse representation, a key element in subspace clustering, for arbitrary distinct underlying subspaces almost surely under a mild i.i.d. assumption on the data generation. We also present a “no free lunch” theorem showing that obtaining the subspace-sparse representation under our general assumptions cannot be much computationally cheaper than solving the corresponding \(\ell ^{0}\) sparse representation problem of \(\ell ^{0}\)-SSC. We develop a novel approximate algorithm, Approximate \(\ell ^{0}\)-SSC (A\(\ell ^{0}\)-SSC), which employs proximal gradient descent to obtain a sub-optimal solution to the optimization problem of \(\ell ^{0}\)-SSC with a theoretical guarantee. The sub-optimal solution is used to build a sparse similarity matrix on which spectral clustering is performed to produce the final clustering results. Extensive experiments on various data sets demonstrate the superiority of A\(\ell ^{0}\)-SSC over competing clustering methods. Furthermore, we extend \(\ell ^{0}\)-SSC to semi-supervised learning by performing label propagation on the sparse similarity matrix learnt by A\(\ell ^{0}\)-SSC, and we demonstrate the effectiveness of the resulting semi-supervised method, termed \(\ell ^{0}\)-sparse subspace label propagation (\(\ell ^{0}\)-SSLP).
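To make the A\(\ell ^{0}\)-SSC pipeline concrete, the sketch below shows one plausible implementation: for each data point, proximal gradient descent on the \(\ell ^{0}\)-regularized self-representation objective \(\min _{c}\Vert x_{i} - Xc\Vert _{2}^{2} + \lambda \Vert c\Vert _{0}\) subject to \(c_{i}=0\) (the proximal map of the \(\ell ^{0}\) penalty is hard thresholding), followed by spectral clustering on the induced similarity matrix. This is a minimal sketch under our own assumptions: the step size, threshold, and names such as lam and n_iters are illustrative choices, not the authors' reference implementation.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def approx_l0_ssc(X, lam=0.1, n_iters=200):
    """Sketch of A l0-SSC: for each column x_i of X (features x samples),
    run proximal gradient descent on min_c ||x_i - X c||_2^2 + lam*||c||_0
    subject to c_i = 0 (no trivial self-representation)."""
    d, n = X.shape
    # Step size 1/L, where L = 2*||X||_2^2 bounds the Lipschitz constant
    # of the gradient of the smooth least-squares term.
    step = 1.0 / (2.0 * np.linalg.norm(X, 2) ** 2)
    # The proximal map of step*lam*||.||_0 is hard thresholding at sqrt(2*lam*step).
    thresh = np.sqrt(2.0 * lam * step)
    C = np.zeros((n, n))
    for i in range(n):
        x, c = X[:, i], np.zeros(n)
        for _ in range(n_iters):
            c = c - step * 2.0 * (X.T @ (X @ c - x))  # gradient step
            c[np.abs(c) < thresh] = 0.0               # hard thresholding (l0 prox)
            c[i] = 0.0                                # enforce c_i = 0
        C[:, i] = c
    return C

# Usage: build a sparse similarity matrix from the representation codes,
# then apply spectral clustering (Ng et al. 2001) with k clusters.
# C = approx_l0_ssc(X, lam=0.1)
# W = 0.5 * (np.abs(C) + np.abs(C).T)
# labels = SpectralClustering(n_clusters=k, affinity="precomputed").fit_predict(W)
```

For the \(\ell ^{0}\)-SSLP extension, label propagation on the learned similarity matrix W could follow, for instance, the scheme of Zhou et al. (2003) from the reference list; again a hedged sketch rather than the authors' exact procedure:

```python
def label_propagation(W, y, alpha=0.99):
    """Zhou et al. (2003)-style propagation on the learned similarity W.
    y: length-n integer array; class index for labeled points, -1 if unlabeled."""
    n = W.shape[0]
    classes = np.unique(y[y >= 0])
    Y = np.zeros((n, classes.size))
    labeled = np.where(y >= 0)[0]
    Y[labeled, np.searchsorted(classes, y[labeled])] = 1.0  # one-hot seed labels
    d = np.maximum(W.sum(axis=1), 1e-12)
    S = W / np.sqrt(np.outer(d, d))                # D^{-1/2} W D^{-1/2}
    F = np.linalg.solve(np.eye(n) - alpha * S, Y)  # closed-form fixed point
    return classes[F.argmax(axis=1)]
```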





Notes
Continuous distribution here indicates that the data distribution is non-degenerate in the sense that the probability measure of any hyperplane of dimension less than that of the subspace is 0.
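For concreteness, one hedged formalization of this footnote (the symbols \(\mathcal{S}\), \(d\), and \(\mu\) for the subspace, its dimension, and the distribution of the data on it are our notation, not necessarily the paper's): if the data in a subspace \(\mathcal{S}\) of dimension \(d\) are drawn i.i.d. from a distribution \(\mu\) supported on \(\mathcal{S}\), non-degeneracy requires
\[ \mu (\mathcal{H}) = 0 \quad \text{for every hyperplane } \mathcal{H} \subset \mathcal{S} \text{ with } \dim (\mathcal{H}) < d, \]
so that almost surely no sample lies on a fixed lower-dimensional hyperplane of \(\mathcal{S}\).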
References
Asuncion, A., & Newman, D. J. (2007). UCI machine learning repository. http://www.ics.uci.edu/~mlearn/MLRepository.html.
Bhatia, K., Jain, P., & Kar, P. (2015). Robust regression via hard thresholding. In Advances in neural information processing systems 28: Annual conference on neural information processing systems 2015 (pp. 721–729). Montreal, December 7–12, 2015.
Bolte, J., Sabach, S., & Teboulle, M. (2014). Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Mathematical Programming, 146(1–2), 459–494.
Candès, E., & Tao, T. (2005). Decoding by linear programming. IEEE Transactions on Information Theory, 51(12), 4203–4215.
Candès, E. J. (2008). The restricted isometry property and its implications for compressed sensing. Comptes Rendus Mathematique, 346(9–10), 589–592.
Cheng, H., Liu, Z., Yang, L., & Chen, X. (2013). Sparse representation and learning in visual recognition: Theory and applications. Signal Processing, 93(6), 1408–1425.
Cheng, B., Yang, J., Yan, S., Fu, Y., & Huang, T. S. (2010). Learning with l1-graph for image analysis. IEEE Transactions on Image Processing, 19(4), 858–866.
Dyer, E. L., Sankaranarayanan, A. C., & Baraniuk, R. G. (2013). Greedy feature selection for subspace clustering. Journal of Machine Learning Research, 14, 2487–2517.
Elhamifar, E., & Vidal, R. (2011). Sparse manifold clustering and embedding. In NIPS (pp. 55–63).
Elhamifar, E., & Vidal, R. (2013). Sparse subspace clustering: Algorithm, theory, and applications. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(11), 2765–2781.
Hyder, M., & Mahata, K. (2009). An approximate \(\ell ^{0}\) norm minimization algorithm for compressed sensing. In IEEE international conference on acoustics, speech and signal processing, 2009. ICASSP 2009 (pp. 3365–3368).
Jenatton, R., Mairal, J., Bach, F. R., & Obozinski, G. R. (2010). Proximal methods for sparse hierarchical dictionary learning. In Proceedings of the 27th international conference on machine learning (ICML-10) (pp. 487–494).
Karasuyama, M., & Mamitsuka, H. (2013). Manifold-based similarity adaptation for label propagation. In Advances in neural information processing systems 26: 27th annual conference on neural information processing systems 2013 (pp. 1547–1555). Proceedings of a meeting held December 5–8, 2013, Lake Tahoe.
Li, J., Kong, Y., & Fu, Y. (2017). Sparse subspace clustering by learning approximation \(\ell ^{0}\) codes. In Proceedings of the 31st AAAI conference on artificial intelligence (pp. 2189–2195). San Francisco, February 4–9, 2017.
Liu, G., Lin, Z., & Yu, Y. (2010). Robust subspace segmentation by low-rank representation. In Proceedings of the 27th international conference on machine learning (ICML-10) (pp. 663–670). Haifa, June 21–24, 2010.
Liu, G., Lin, Z., Yan, S., Sun, J., Yu, Y., & Ma, Y. (2013). Robust recovery of subspace structures by low-rank representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(1), 171–184.
Mairal, J., Bach, F. R., Ponce, J., Sapiro, G., & Zisserman, A. (2008). Supervised dictionary learning. In Advances in neural information processing systems 21: Proceedings of the 22nd annual conference on neural information processing systems (pp. 1033–1040). Vancouver, December 8–11, 2008.
Mairal, J., Bach, F., Ponce, J., & Sapiro, G. (2010). Online learning for matrix factorization and sparse coding. Journal of Machine Learning Research, 11, 19–60.
Mancera, L., & Portilla, J. (2006). L0-norm-based sparse representation through alternate projections. In 2006 IEEE international conference on image processing (pp. 2089–2092).
Ng, A. Y., Jordan, M. I., & Weiss, Y. (2001). On spectral clustering: Analysis and an algorithm. In NIPS (pp. 849–856).
Park, D., Caramanis, C., & Sanghavi, S. (2014). Greedy subspace clustering. In Advances in neural information processing systems 27: Annual conference on neural information processing systems 2014 (pp. 2753–2761). Montreal, December 8–13, 2014.
Peng, X., Yi, Z., & Tang, H. (2015). Robust subspace clustering via thresholding ridge regression. In AAAI conference on artificial intelligence (AAAI) (pp. 3827–3833).
Plummer, M. D., & Lovász, L. (1986). Matching theory. Amsterdam: North-Holland Mathematics Studies, Elsevier.
Soltanolkotabi, M., & Candès, E. J. (2012). A geometric analysis of subspace clustering with outliers. The Annals of Statistics, 40(4), 2195–2238.
Soltanolkotabi, M., Elhamifar, E., & Candès, E. J. (2014). Robust subspace clustering. The Annals of Statistics, 42(2), 669–699.
Tropp, J. A. (2004). Greed is good: Algorithmic results for sparse approximation. IEEE Transactions on Information Theory, 50(10), 2231–2242.
Vidal, R. (2011). Subspace clustering. IEEE Signal Processing Magazine, 28(2), 52–68.
Wang, Y., & Xu, H. (2013). Noisy sparse subspace clustering. In Proceedings of the 30th international conference on machine learning, ICML 2013 (pp. 89–97). Atlanta, June 16–21, 2013.
Wang, Z., Gu, Q., Ning, Y., & Liu, H. (2015). High dimensional EM algorithm: Statistical optimization and asymptotic normality. In Advances in neural information processing systems 28: Annual conference on neural information processing systems 2015 (pp. 2521–2529). Montreal, December 7–12, 2015.
Wang, Y., Wang, Y. X., & Singh, A. (2016). Graph connectivity in noisy sparse subspace clustering. CoRR, abs/1504.01046.
Wang, Y. X., Xu, H., & Leng, C. (2013). Provable subspace clustering: When LRR meets SSC. In C. Burges, L. Bottou, M. Welling, Z. Ghahramani, & K. Weinberger (Eds.), Advances in Neural Information Processing Systems 26 (pp. 64–72). New York: Curran Associates, Inc.
Yan, S., & Wang, H. (2009). Semi-supervised learning by sparse representation. In SDM (pp. 792–801).
Yang, Y., Feng, J., Jojic, N., Yang, J., & Huang, T. S. (2016). \(\ell ^{0}\)-sparse subspace clustering. In Computer vision—ECCV 2016—14th European conference, Amsterdam, October 11–14, 2016, Proceedings, Part II (pp. 731–747). https://doi.org/10.1007/978-3-319-46475-6_45.
Yang, Y., Wang, Z., Yang, J., Han, J., & Huang, T. (2014a). Regularized l1-graph for data clustering. In Proceedings of the British machine vision conference. BMVA Press.
Yang, Y., Wang, Z., Yang, J., Wang, J., Chang, S., & Huang, T. S. (2014b). Data clustering by Laplacian regularized l1-graph. In Proceedings of the 28th AAAI conference on artificial intelligence (pp. 3148–3149). Québec City, July 27–31, 2014.
Yang, J., Yu, K., Gong, Y., & Huang, T. S. (2009). Linear spatial pyramid matching using sparse coding for image classification. In CVPR (pp. 1794–1801).
You, C., Robinson, D., & Vidal, R. (2016). Scalable sparse subspace clustering by orthogonal matching pursuit. In IEEE Conference on computer vision and pattern recognition (CVPR) (pp. 3918–3927). Las Vegas, NV, June 27–30, 2016. https://doi.org/10.1109/CVPR.2016.425.
Zhang, T., Ghanem, B., Liu, S., Xu, C., & Ahuja, N. (2013). Low-rank sparse coding for image classification. In IEEE international conference on computer vision, ICCV 2013 (pp. 281–288). Sydney, December 1–8, 2013.
Zhang, C. H., & Zhang, T. (2012). A general theory of concave regularization for high-dimensional sparse estimation problems. Statistical Science, 27(4), 576–593.
Zheng, X., Cai, D., He, X., Ma, W. Y., & Lin, X. (2004). Locality preserving clustering for image database. In Proceedings of the 12th annual ACM international conference on multimedia, MULTIMEDIA’04 (pp. 885–891). New York: ACM.
Zheng, M., Bu, J., Chen, C., Wang, C., Zhang, L., Qiu, G., et al. (2011). Graph regularized sparse coding for image representation. IEEE Transactions on Image Processing, 20(5), 1327–1336.
Zhou, D., Bousquet, O., Lal, T. N., Weston, J., & Schölkopf, B. (2003). Learning with local and global consistency. In Advances in neural information processing systems 16: Neural information processing systems, NIPS 2003. Vancouver, December 8–13, 2003.
Zhu, X., Ghahramani, Z., & Lafferty, J. D. (2003). Semi-supervised learning using Gaussian fields and harmonic functions. In Proceedings of the 20th international conference on machine learning (ICML 2003) (pp. 912–919). Washington, August 21–24, 2003.
Additional information
Communicated by Edwin Hancock, Richard Wilson, Will Smith, Adrian Bors, Nick Pears.
This material is based in part upon work supported by the National Science Foundation. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. The work of Yingzhen Yang was supported in part by an IBM gift grant to Beckman Institute, UIUC. The work of Jiashi Feng was partially supported by National University of Singapore startup Grant R-263-000-C08-133 and Ministry of Education of Singapore AcRF Tier One Grant R-263-000-C21-112.
About this article
Cite this article
Yang, Y., Feng, J., Jojic, N. et al. Subspace Learning by \(\ell ^{0}\)-Induced Sparsity. Int J Comput Vis 126, 1138–1156 (2018). https://doi.org/10.1007/s11263-018-1092-4