Robust discriminative feature learning with calibrated data reconstruction and sparse low-rank model

Applied Intelligence

Abstract

Because large amounts of labeled high-dimensional data need to be processed, supervised feature learning has become an important and challenging problem in machine learning. Conventional supervised methods often adopt an ℓ2-norm loss function, which is sensitive to outliers; yet real-world data frequently contain many outliers, so traditional supervised methods fail to achieve optimal performance. In addition, these methods cannot reconstruct the original complex structured data well, since the dimensions of their learned projection matrices are limited to the number of classes and are therefore sub-optimal. To address these challenges, we propose a novel robust discriminative feature learning (RDFL) method based on calibrated data reconstruction and a sparse low-rank model. Specifically, RDFL preserves discriminant information and simultaneously reconstructs the complex low-rank structure by jointly minimizing the ℓ2,1-norm reconstruction error and the within-class distance. To solve the resulting non-smooth problem, we derive an efficient optimization algorithm that softens the contributions of outliers. We also adopt the general power iteration method (GPIM) to accelerate the algorithm, making it scalable to large-scale problems, and we theoretically analyze its convergence and computational complexity. Extensive experiments show that RDFL outperforms the compared methods in most cases and significantly improves robustness to noise and outliers.
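
To make the objective concrete, the following is a plausible form of the RDFL criterion written in our own notation rather than the paper's: the projection $P$, reconstruction basis $Q$, class means $\mu_c$, and trade-off parameter $\lambda$ are assumptions for illustration.

$$\min_{P,\,Q}\ \sum_{i=1}^{n}\bigl\|x_i - Q P^{\top} x_i\bigr\|_2 \;+\; \lambda\sum_{c=1}^{C}\,\sum_{x_i\in\mathcal{C}_c}\bigl\|P^{\top} x_i - \mu_c\bigr\|_2^2$$

The first term is the ℓ2,1-norm reconstruction error (a sum of per-sample ℓ2 norms) and the second is the within-class distance in the projected space. Because each sample contributes its residual norm rather than its squared norm, an outlier's influence grows only linearly in its magnitude, which is the sense in which the optimization "softens" outlier contributions. The snippet below illustrates that effect numerically; it is a minimal sketch of the ℓ2,1 loss itself on synthetic data, not an implementation of the authors' algorithm.

```python
import numpy as np

def l21_norm(R):
    """l2,1-norm of a residual matrix R (n samples x d features):
    the sum over samples of each row's l2 norm."""
    return np.sum(np.linalg.norm(R, axis=1))

# Synthetic residuals with one gross outlier (hypothetical data).
rng = np.random.default_rng(0)
R = rng.normal(scale=0.1, size=(100, 5))
R[0] += 50.0  # corrupt a single sample

# The squared (Frobenius) loss is dominated by the outlier, while the
# l2,1 loss charges it only its norm, so a subspace fitted under the
# l2,1 criterion is pulled far less toward the corrupted sample.
print(f"squared loss: {np.sum(R**2):.1f}")  # roughly 12500, outlier-dominated
print(f"l2,1 loss:    {l21_norm(R):.1f}")   # roughly 133; outlier adds ~112
```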

Notes

  1. http://www.cad.zju.edu.cn/home/dengcai/Data/FaceData.html

  2. http://images.ee.umist.ac.uk/danny/database.html

  3. http://www.cs.columbia.edu/CAVE/research/coil-20.html

  4. http://www.cad.zju.edu.cn/home/dengcai/Data/USPS

  5. http://crcv.ucf.edu/data/UCF_Sports_Action.php

  6. http://www.cad.zju.edu.cn/home/dengcai/Data/TextData.html

Acknowledgements

This work was supported by the National Science Foundation of China (No. 61473302 and No. 61503396) and NSF III-1421057.

Author information

Correspondence to Yang Yang.

About this article

Cite this article

Luo, T., Yang, Y., Yi, D. et al. Robust discriminative feature learning with calibrated data reconstruction and sparse low-rank model. Appl Intell 54, 2867–2880 (2024). https://doi.org/10.1007/s10489-017-1060-7
