Robust unsupervised feature selection based on matrix factorization with adaptive loss via bi-stochastic graph regularization

Song, Xiangfa

doi:10.1007/s10489-024-05876-2

Robust unsupervised feature selection based on matrix factorization with adaptive loss via bi-stochastic graph regularization

Published: 29 November 2024

Volume 55, article number 55, (2025)
Cite this article

Applied Intelligence Aims and scope Submit manuscript

Xiangfa Song ORCID: orcid.org/0000-0002-4947-2282¹

164 Accesses
Explore all metrics

Abstract

Unsupervised feature selection (UFS) has gained increasing attention and research interest in various domains, such as machine learning and data mining. Recently, numerous matrix factorization-based methods have been widely adopted for UFS. However, the following issues still exist. First, most methods based on matrix factorization use the squared Frobenius-norm to measure the loss term, making them sensitive to outliers. Although using $ \varvec{l}_{\varvec{2,1}} $-norm-based loss improves the robustness, it is sensitive to small loss. Second, most existing matrix factorization-based methods utilize a fixed graph to preserve the local structure of data, making them fail to capture the intrinsic local structure accurately. To address the above drawbacks, we present a novel robust UFS model named matrix factorization with adaptive loss via bi-stochastic graph regularization (MFALBS). Specifically, MFALBS decomposes the projected data points into two orthogonal matrices to facilitate discriminative feature selection. Meanwhile, MFALBS utilizes an adaptive loss function for measuring the loss term in order to enhance model robustness. Moreover, MFALBS adopts an adaptive learning strategy for optimal bi-stochastic graph learning in order to improve local structure preservation ability. Finally, an optimization algorithm is proposed to solve the proposed model. Experimental results on real-world datasets show the effectiveness and superiority of MFALBS.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Adaptive unsupervised feature selection with robust graph regularization

Article 07 July 2023

Self-representation with adaptive loss minimization via doubly stochastic graph regularization for robust unsupervised feature selection

Article 06 July 2024

Double regularized matrix factorization for image classification and clustering

Article Open access 22 June 2018

Discover the latest articles, news and stories from top researchers in related subjects.

Artificial Intelligence

Data Availability and Access

Data will be made available on request.

Notes

https://jundongl.github.io/scikit-feature/datasets.html.
The code and datasets will be made available at: https://github.com/xiangfasong/MFALBS.

References

Georgiou T, Liu Y, Chen W, Lew MS (2020) A survey of traditional and deep learning-based feature descriptors for high dimensional data in computer vision. Int J Multim Inf Retr 9(3):135–170. https://doi.org/10.1007/S13735-019-00183-W
Article MATH Google Scholar
Bommert A, Welchowski T, Schmid M, Rahnenführer J (2022) Benchmark of filter methods for feature selection in high-dimensional gene expression survival data. Briefings Bioinform 23(1). https://doi.org/10.1093/BIB/BBAB354
Mostafa RR, Khedr AM, Aghbari ZA, Afyouni I, Kamel I, Ahmed N (2024) An adaptive hybrid mutated differential evolution feature selection method for low and high-dimensional medical datasets. Knowl Based Syst 283:111218. https://doi.org/10.1016/J.KNOSYS.2023.111218
Omuya EO, Okeyo GO, Kimwele MW (2021) Feature selection for classification using principal component analysis and information gain. Expert Syst Appl 174:114765. https://doi.org/10.1016/J.ESWA.2021.114765
Liu T, Lu Y, Zhu B, Zhao H (2023) Clustering high-dimensional data via feature selection. Biometrics 79(2):940–950. https://doi.org/10.1111/biom.13665
Article MathSciNet MATH Google Scholar
Lu X, Long J, Wen J, Fei L, Zhang B, Xu Y (2022) Locality preserving projection with symmetric graph embedding for unsupervised dimensionality reduction. Pattern Recognit 131:108844. https://doi.org/10.1016/J.PATCOG.2022.108844
Zhou J, Zhang Q, Zeng S, Zhang B, Fang L (2024) Latent linear discriminant analysis for feature extraction via isometric structural learning. Pattern Recognit 149:110218. https://doi.org/10.1016/J.PATCOG.2023.110218
He X, Cai D, Niyogi P (2005) Laplacian score for feature selection. In: International conference on neural information processing systems, pp 507–514
Miao J, Zhao J, Yang T, Fan C, Tian Y, Shi Y, Xu M (2024) Explicit unsupervised feature selection based on structured graph and locally linear embedding. Expert Syst Appl 255:124568. https://doi.org/10.1016/J.ESWA.2024.124568
Xie X, Cao Z, Sun F (2023) Joint learning of graph and latent representation for unsupervised feature selection. Appl Intell 53(21):25282–25295. https://doi.org/10.1007/S10489-023-04893-X
Article MATH Google Scholar
Mozafari M, Seyedi SA, Mohammadiani RP, Tab FA (2024) Unsupervised feature selection using orthogonal encoder-decoder factorization. Inf Sci 663:120277. https://doi.org/10.1016/J.INS.2024.120277
Lin X, Guan J, Chen B, Zeng Y (2022) Unsupervised feature selection via orthogonal basis clustering and local structure preserving. IEEE Trans Neural Networks Learn Syst 33(11):6881–6892. https://doi.org/10.1109/TNNLS.2021.3083763
Article MathSciNet MATH Google Scholar
Moslemi A, Ahmadian A (2023) Dual regularized subspace learning using adaptive graph learning and rank constraint: Unsupervised feature selection on gene expression microarray datasets. Comput Biol Medicine 167:107659. https://doi.org/10.1016/J.COMPBIOMED.2023.107659
Wang X, Wu P, Xu Q, Zeng Z, Xie Y (2021) Joint image clustering and feature selection with auto-adjoined learning for high-dimensional data. Knowl Based Syst 232:107443. https://doi.org/10.1016/J.KNOSYS.2021.107443
Dhal P, Azad C (2024) A fine-tuning deep learning with multi-objective-based feature selection approach for the classification of text. Neural Comput Appl 36(7):3525–3553. https://doi.org/10.1007/S00521-023-09225-1
Article MATH Google Scholar
Wu X, Xu X, Liu J, Wang H, Hu B, Nie F (2021) Supervised feature selection with orthogonal regression and feature weighting. IEEE Trans Neural Networks Learn Syst 32(5):1831–1838. https://doi.org/10.1109/TNNLS.2020.2991336
Article MathSciNet MATH Google Scholar
Li Z, Nie F, Wu D, Wang Z, Li X (2024) Sparse trace ratio LDA for supervised feature selection. IEEE Trans Cybern 54(4):2420–2433. https://doi.org/10.1109/TCYB.2023.3264907
Article MATH Google Scholar
Lai J, Chen H, Li T, Yang X (2022) Adaptive graph learning for semi-supervised feature selection with redundancy minimization. Inf Sci 609:465–488. https://doi.org/10.1016/J.INS.2022.07.102
Article MATH Google Scholar
Sheikhpour R, Berahmand K, Forouzandeh S (2023) Hessian-based semi-supervised feature selection using generalized uncorrelated constraint. Knowl Based Syst 269:110521. https://doi.org/10.1016/J.KNOSYS.2023.110521
Li Z, Yang Y, Liu J, Zhou X, Lu H (2012) Unsupervised feature selection using nonnegative spectral analysis. In: International conference on artificial intelligence, pp 1026–1032. https://doi.org/10.1609/AAAI.V26I1.8289
Wang C, Wang J, Gu Z, Wei J, Liu J (2024) Unsupervised feature selection by learning exponential weights. Pattern Recognit 148:110183. https://doi.org/10.1016/J.PATCOG.2023.110183
Li J, Cheng K, Wang S, Morstatter F, Trevino RP, Tang J, Liu H (2018) Feature selection: A data perspective. ACM Comput Surv 50(6):94–19445. https://doi.org/10.1145/3136625
Article MATH Google Scholar
Solorio-Fernández S, Carrasco-Ochoa JA, Martínez-Trinidad JF (2020) A review of unsupervised feature selection methods. Artif Intell Rev 53(2):907–948. https://doi.org/10.1007/S10462-019-09682-Y
Article MATH Google Scholar
Hou C, Nie F, Li X, Yi D, Wu Y (2014) Joint embedding learning and sparse regression: A framework for unsupervised feature selection. IEEE Trans Cybern 44(6):793–804. https://doi.org/10.1109/TCYB.2013.2272642
Article MATH Google Scholar
Zhu P, Zuo W, Zhang L, Hu Q, Shiu SCK (2015) Unsupervised feature selection by regularized self-representation. Pattern Recognit 48(2):438–446. https://doi.org/10.1016/J.PATCOG.2014.08.006
Article MATH Google Scholar
Liu Y, Ye D, Li W, Wang H, Gao Y (2020) Robust neighborhood embedding for unsupervised feature selection. Knowl Based Syst 193:105462. https://doi.org/10.1016/J.KNOSYS.2019.105462
Du L, Shen Y (2015) Unsupervised feature selection with adaptive structure learning. In: International conference on knowledge discovery and data mining, pp 209–218. https://doi.org/10.1145/2783258.2783345
Nie F, Zhu W, Li X (2016) Unsupervised feature selection with structured graph optimization. In: International conference on artificial intelligence, pp 1302–1308. https://doi.org/10.1609/AAAI.V30I1.10168
Huang Y, Shen Z, Cai F, Li T, Lv F (2021) Adaptive graph-based generalized regression model for unsupervised feature selection. Knowl Based Syst 227:107156. https://doi.org/10.1016/J.KNOSYS.2021.107156
Zhao H, Li Q, Wang Z, Nie F (2022) Joint adaptive graph learning and discriminative analysis for unsupervised feature selection. Cogn Comput 14(3):1211–1221. https://doi.org/10.1007/S12559-021-09875-0
Article MATH Google Scholar
Tang C, Zheng X, Zhang W, Liu X, Zhu X, Zhu E (2023) Unsupervised feature selection via multiple graph fusion and feature weight learning. Sci China Inf Sci 66(5). https://doi.org/10.1007/S11432-022-3579-1
Zhou Q, Wang Q, Gao Q, Yang M, Gao X (2024) Unsupervised discriminative feature selection via contrastive graph learning. IEEE Trans Image Process 33:972–986. https://doi.org/10.1109/TIP.2024.3353572
Article MATH Google Scholar
Wang S, Pedrycz W, Zhu Q, Zhu W (2015) Subspace learning for unsupervised feature selection via matrix factorization. Pattern Recognit 48(1):10–19. https://doi.org/10.1016/J.PATCOG.2014.08.004
Article MATH Google Scholar
Shang R, Wang W, Stolkin R, Jiao L (2016) Subspace learning-based graph regularized feature selection. Knowl Based Syst 112:152–165. https://doi.org/10.1016/J.KNOSYS.2016.09.006
Article MATH Google Scholar
Shang R, Xu K, Shang F, Jiao L (2020) Sparse and low-redundant subspace learning-based dual-graph regularized robust feature selection. Knowl Based Syst 187. https://doi.org/10.1016/J.KNOSYS.2019.07.001
Han D, Kim J (2015) Unsupervised simultaneous orthogonal basis clustering feature selection. In: International conference on computer vision and pattern recognition, pp 5016–5023. https://doi.org/10.1109/CVPR.2015.7299136
Qian M, Zhai C (2013) Robust unsupervised feature selection. In: International joint conference on artificial intelligence, pp 1621–1627
Du S, Ma Y, Li S, Ma Y (2017) Robust unsupervised feature selection via matrix factorization. Neurocomputing 241:115–127. https://doi.org/10.1016/J.NEUCOM.2017.02.034
Article MATH Google Scholar
Luo C, Zheng J, Li T, Chen H, Huang Y, Peng X (2022) Orthogonally constrained matrix factorization for robust unsupervised feature selection with local preserving. Inf Sci 586:662–675. https://doi.org/10.1016/J.INS.2021.11.068
Article MATH Google Scholar
Nie F, Wang H, Huang H, Ding CHQ (2013) Adaptive loss minimization for semi-supervised elastic embedding. In: International joint conference on artificial intelligence, pp 1565–1571
Nie F, Huang H, Cai X, Ding CHQ (2010) Efficient and robust feature selection via joint 2, 1-norms minimization. In: International conference on neural information processing systems, pp 1813–1821
Sun Z, Xie H, Liu J, Gou J, Yu Y (2023) Dual-graph with non-convex sparse regularization for multi-label feature selection. Appl Intell 53(18):21227–21247. https://doi.org/10.1007/S10489-023-04515-6
Article Google Scholar
Wang Q, He X, Jiang X, Li X (2022) Robust bi-stochastic graph regularized matrix factorization for data clustering. IEEE Trans Pattern Anal Mach Intell 44(1):390–403. https://doi.org/10.1109/TPAMI.2020.3007673
Boyd SP, Parikh N, Chu E, Peleato B, Eckstein J (2011) Distributed optimization and statistical learning via the alternating direction method of multipliers. Found Trends Mach Learn 3(1):1–122. https://doi.org/10.1561/2200000016
Li S, Tang C, Liu X, Liu Y, Chen J (2019) Dual graph regularized compact feature representation for unsupervised feature selection. Neurocomputing 331:77–96. https://doi.org/10.1016/J.NEUCOM.2018.11.060
Article MATH Google Scholar
Shi Z, Liu J (2023) Noise-tolerant clustering via joint doubly stochastic matrix regularization and dual sparse coding. Expert Syst Appl 222:119814. https://doi.org/10.1016/J.ESWA.2023.119814
Neumann JV (1950) Functional Operators (AM-22), Volume 2: The Geometry of Orthogonal Spaces.(AM-22). Princeton University Press, Princeton
Zhang X (2020) A Matrix Algebra Approach to Artificial Intelligence. Springer, Singapore. https://doi.org/10.1007/978-981-15-2770-8
Article MATH Google Scholar

Download references

Author information

Authors and Affiliations

School of Computer and Information Engineering, Henan University, Kaifeng, 475001, Henan, China
Xiangfa Song

Authors

Xiangfa Song
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

Xiangfa Song: Conceptualization, Formal analysis, Methodology, Software, Validation, Writing-review & editing.

Corresponding author

Correspondence to Xiangfa Song.

Ethics declarations

Competing Interests

The author declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Ethical and Informed Consent for Data Used

The dataset used is public and available, and there are no ethical implications.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A

Lemma 3

The closed-form solution to the optimization problem (15) is $ \textbf{S}=\textbf{R}+\frac{n+\textbf{1}_{n}^{T} \textbf{R} \textbf{1}_{n}}{n^{2}} \textbf{1}_{n} \textbf{1}_{n}^{T}-\frac{1}{n} \textbf{R} \textbf{1}_{n} \textbf{1}_{n}^{T}-\frac{1}{n} \textbf{1}_{n} \textbf{1}_{n}^{T}\textbf{R} $, where $ \textbf{R}=\frac{\textbf{A}+\textbf{A}^T}{2} $.

Proof

The Lagrangian function for problem (15) can be formulated as follows:

$$\begin{aligned} \min _{\textbf{S}} \frac{1}{2}\Vert \textbf{S}-\textbf{A}\Vert _{F}^{2}-\left( \eta ^{T}(\textbf{S 1}_n -\textbf{1}_n)\right) -T r\left( \Xi ^{T}\left( \textbf{S}^{T}-\textbf{S}\right) \right) , \end{aligned}$$

(A1)

where $\eta \in \mathbb {R}^{n}$ and $\Xi \in \mathbb {R}^{n \times n}$ are Lagrangian multipliers.

By differentiating w.r.t. $ \textbf{S} $ and setting it to 0, we obatain

$$\begin{aligned} \textbf{S}-\textbf{A}-\left( \Xi ^{T}-\Xi \right) - \eta \textbf{1}_n^T=0. \end{aligned}$$

(A2)

By calculating the transpose on both sides of (A2), we get:

$$\begin{aligned} \textbf{S}-\textbf{A}^{T}+\left( \Xi ^{T}-\Xi \right) -\textbf{1}_n \eta ^T=0. \end{aligned}$$

(A3)

By combining (A2) with (A3), we obtain:

$$\begin{aligned} -\left( \Xi ^{T}-\Xi \right) =\frac{1}{2}\left( \textbf{A}-\textbf{A}^{T}+\eta \textbf{1}_n^{T}-\textbf{1}_n \eta ^{T}\right) . \end{aligned}$$

(A4)

By combining (A2) with (A4), we further have:

$$\begin{aligned} \textbf{S}=\textbf{R}+\frac{1}{2}\eta \textbf{1}_n^T+\frac{1}{2}\textbf{1}_n\eta ^T. \end{aligned}$$

(A5)

where $ \textbf{R}=\frac{\textbf{A}+\textbf{A}^T}{2} $. By the constraint $ \textbf{S 1}_n=\textbf{1}_n $, we multiply vector $ \textbf{1}_n $ on both sides of (A5), and we can obtain

$$\begin{aligned} \textbf{1}_n=\textbf{R 1}_n+\frac{n}{2}\eta +\frac{1}{2}\textbf{1}_n\textbf{1}_n^T\eta , \end{aligned}$$

(A6)

from which we get

$$\begin{aligned} \eta =2\left( n \textbf{I}+\textbf{1}_n\textbf{ 1}_n^{T}\right) ^{-1}(\textbf{I}-\textbf{R}) \textbf{1}_n. \end{aligned}$$

(A7)

Based on the Woodbury formula [48], we get

$$\begin{aligned} \left( n \textbf{I}+\textbf{1}_n\textbf{ 1}_n^{T}\right) ^{-1}=\frac{1}{n}\left( \textbf{I}-\frac{1}{2n}\textbf{1}_n\textbf{ 1}_n^{T}\right) . \end{aligned}$$

(A8)

By combining (A5), (A7) and (A8), we can get the solution of problem (15) as

$$\begin{aligned} \textbf{S}=\textbf{R}+\frac{n+\textbf{1}_{n}^{T} \textbf{R} \textbf{1}_{n}}{n^{2}} \textbf{1}_{n} \textbf{1}_{n}^{T}-\frac{1}{n} \textbf{R} \textbf{1}_{n} \textbf{1}_{n}^{T}-\frac{1}{n} \textbf{1}_{n} \textbf{1}_{n}^{T}\textbf{R}, \end{aligned}$$

(A9)

$\square $

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Cite this article

Song, X. Robust unsupervised feature selection based on matrix factorization with adaptive loss via bi-stochastic graph regularization. Appl Intell 55, 55 (2025). https://doi.org/10.1007/s10489-024-05876-2

Download citation

Accepted: 29 October 2024
Published: 29 November 2024
DOI: https://doi.org/10.1007/s10489-024-05876-2

Keywords

Access this article

Log in via an institution

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Robust unsupervised feature selection based on matrix factorization with adaptive loss via bi-stochastic graph regularization

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Adaptive unsupervised feature selection with robust graph regularization

Self-representation with adaptive loss minimization via doubly stochastic graph regularization for robust unsupervised feature selection

Double regularized matrix factorization for image classification and clustering

Data Availability and Access

Notes

References

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Competing Interests

Ethical and Informed Consent for Data Used

Additional information

Publisher's Note

Appendix A

Lemma 3

Proof

Rights and permissions

About this article

Cite this article

Keywords

Subscribe and save

Buy Now

Navigation

Robust unsupervised feature selection based on matrix factorization with adaptive loss via bi-stochastic graph regularization

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Adaptive unsupervised feature selection with robust graph regularization

Self-representation with adaptive loss minimization via doubly stochastic graph regularization for robust unsupervised feature selection

Double regularized matrix factorization for image classification and clustering

Explore related subjects

Data Availability and Access

Notes

References

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Competing Interests

Ethical and Informed Consent for Data Used

Additional information

Publisher's Note

Appendix A

Appendix A

Lemma 3

Proof

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Subscribe and save

Buy Now

Search

Navigation