RKHS reconstruction based on manifold learning for high-dimensional data

Published in Applied Intelligence

Abstract

The kernel trick has achieved remarkable success in various machine learning tasks, especially those involving high-dimensional non-linear data. Such data also tend to have a compact representation that clusters in a low-dimensional subspace. In order to offer a general and comprehensive framework for high-dimensional non-linear data, in this paper we generalize multiple kernel learning and subspace learning in a reconstructed reproducing kernel Hilbert space (RKHS) endowed with manifold learning. First, we construct reconstructed kernels by fusing manifold learning with a set of base kernel functions, and then learn the optimal kernel as a linear combination of the reconstructed kernels. The proposed MKL method can incorporate different kinds of prior knowledge, such as neighborhood information and classification information, to solve different tasks on high-dimensional data. Furthermore, we propose a subspace learning method based on RKHS reconstruction, named MVSL for short, whose objective function is designed under the variance maximization criterion and solved with an iterative algorithm. We also incorporate discriminant information into the learning of the modified kernel through a kernel alignment criterion and a regularization term, so as to learn the optimal kernel matrix for RKHS reconstruction, and thereby propose another subspace learning method, named Discriminative MVSL. Experimental results on toy and real-world datasets demonstrate that the proposed MKL and subspace learning methods are able to learn both the local manifold structure and the global statistical information of data based on RKHS reconstruction, and thus achieve satisfactory performance on classification and dimension reduction tasks.
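
As a rough illustration of the linear-combination step described above, the following Python sketch combines several Gaussian base kernels into a single kernel matrix, weighting each base kernel by its centered kernel alignment with an ideal label kernel. The function names, kernel widths, toy data, and the simple non-negative normalized weighting scheme are illustrative assumptions; the sketch does not reproduce the paper's reconstructed kernels or the fusion with manifold learning.

```python
import numpy as np

def gaussian_kernel(X, gamma):
    # k(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2) for all pairs of rows of X.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-gamma * d2)

def centered_alignment(K, L):
    # Centered kernel alignment <HKH, HLH>_F / (||HKH||_F * ||HLH||_F).
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    Kc, Lc = H @ K @ H, H @ L @ H
    return float(np.sum(Kc * Lc) / (np.linalg.norm(Kc) * np.linalg.norm(Lc) + 1e-12))

def aligned_combination(X, y, gammas=(0.1, 1.0, 10.0)):
    # Ideal kernel: 1 if two samples share a label, 0 otherwise.
    Y = (y[:, None] == y[None, :]).astype(float)
    bases = [gaussian_kernel(X, g) for g in gammas]
    # Non-negative alignment scores, normalized to sum to one, used as combination weights.
    w = np.array([max(centered_alignment(K, Y), 0.0) for K in bases])
    w = w / (w.sum() + 1e-12)
    K = sum(wi * Ki for wi, Ki in zip(w, bases))
    return K, w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0, 1, (30, 8)), rng.normal(1.5, 1, (30, 8))])
    y = np.hstack([np.zeros(30), np.ones(30)])
    K, w = aligned_combination(X, y)
    print("kernel weights:", np.round(w, 3))
```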


Data Availability

The experiments in our manuscript involve toy data sets and real-world data sets (six datasets from the UCI database, the Statue-Faces dataset, the MNIST handwritten digit dataset, the ORL facial dataset, and the JAFFE facial expression dataset).
  1. UCI database: downloaded from https://archive.ics.uci.edu/
  2. Statue-Faces dataset: downloaded from https://github.com/gionuno/isomap
  3. MNIST database: downloaded from http://yann.lecun.com/exdb/mnist/
  4. ORL: downloaded from https://paperswithcode.com/dataset/orl
  5. JAFFE: downloaded from https://paperswithcode.com/dataset/jaffe

References

  1. Zeng Z et al (2023) CoIn: Correlation Induced Clustering for Cognition of High Dimensional Bioinformatics Data. IEEE J Biomed Health Inform 27(2):598–607
  2. Wang K, Song Z (2024) High-Dimensional Cross-Plant Process Monitoring With Data Privacy: A Federated Hierarchical Sparse PCA Approach. IEEE Trans Industr Inf 20(3):4385–4396
  3. Xu Y, Yu Z, Cao W, Chen CLP (2023) A Novel Classifier Ensemble Method Based on Subspace Enhancement for High-Dimensional Data Classification. IEEE Trans Knowl Data Eng 35(1):16–30
  4. Bessa M, Bostanabad R, Liu ZL et al (2017) A Framework for Data-driven Analysis of Materials under Uncertainty: Countering the Curse of Dimensionality. Comput Methods Appl Mech Eng 320:633–667
  5. Luo C, Ni B, Yan S, Wang M (2016) Image Classification by Selective Regularized Subspace Learning. IEEE Trans Multimedia 18(1):40–50
  6. Chi Z et al (2023) Multiple Kernel Subspace Learning for Clustering and Classification. IEEE Trans Knowl Data Eng 35(7):7278–7290
  7. Niu G, Ma Z, Chen HQ, Su X (2021) Polynomial Approximation to Manifold Learning. J Intell Fuzzy Syst 41(6):5791–5806
  8. Ren J, Liu Y, Liu J (2024) Commonality and Individuality-Based Subspace Learning. IEEE Trans Cybern 54(3):1456–1469
  9. Liu Y, Liao S, Zhang H, Ren W, Wang W (2021) Kernel Stability for Model Selection in Kernel-Based Algorithms. IEEE Trans Cybern 51(12):5647–5658
  10. He X, Niyogi P (2003) Locality Preserving Projections. In: Proceedings of Neural Information Processing Systems
  11. Xanthopoulos P, Pardalos PM, Trafalis TB (2013) Linear Discriminant Analysis. Chicago 3(6):27–33
  12. Deutsch HP (2004) Principal Component Analysis. Deriv Intern Model
  13. Gu H, Wang X, Chen X et al (2017) Manifold Learning by Curved Cosine Mapping. IEEE Trans Knowl Data Eng
  14. Schölkopf B, Smola A, Müller KR (1998) Nonlinear Component Analysis as a Kernel Eigenvalue Problem. Neural Comput 10(5):1299–1319
  15. Sindhwani V, Niyogi P, Belkin M (2005) Beyond the Point Cloud: from Transductive to Semi-supervised Learning. In: Proceedings of the International Conference on Machine Learning
  16. Nguyen CH, Ho TB (2008) An Efficient Kernel Matrix Evaluation Measure. Pattern Recogn 41(11):3366–3372
  17. Rakotomamonjy A, Bach FR, Canu S et al (2007) More Efficiency in Multiple Kernel Learning. In: Proceedings of the International Conference on Machine Learning, pp 775–782
  18. Gönen M, Alpaydın E (2011) Multiple Kernel Learning Algorithms. J Mach Learn Res 12:2211–2268
  19. Saitoh S, Sawano Y (2016) Theory of Reproducing Kernels and Applications. Springer
  20. Jiang L, Liu S, Ma Z et al (2022) Regularized RKHS-Based Subspace Learning for Motor Imagery Classification. Entropy 24(2):195
  21. Cortes C, Mohri M, Rostamizadeh A (2010) Two-Stage Learning Kernel Algorithms. In: Proceedings of the International Conference on Machine Learning, pp 239–246
  22. Ying Y, Huang K, Campbell C (2009) Enhanced Protein Fold Recognition Through a Novel Data Integration Approach. BMC Bioinformatics 10(1):267
  23. Ghari PM, Shen Y (2023) Graph-Aided Online Multi-Kernel Learning. J Mach Learn Res 24:1–44
  24. Gönen M (2012) Bayesian Efficient Multiple Kernel Learning. In: Proceedings of the International Conference on Machine Learning
  25. Mao Q, Tsang IW, Gao S et al (2015) Generalized Multiple Kernel Learning with Data-dependent Priors. IEEE Trans Neural Netw Learn Syst 26(6):1134–1148
  26. Lanckriet G, Cristianini N, Bartlett P et al (2004) Learning the Kernel Matrix with Semi-Definite Programming. J Mach Learn Res 5:27–72
  27. Sonnenburg S, Rätsch G, Schäfer C (2006) A General and Efficient Multiple Kernel Learning Algorithm. In: Advances in Neural Information Processing Systems, pp 1273–1280
  28. Rakotomamonjy A, Bach FR, Canu S et al (2008) SimpleMKL. J Mach Learn Res 9(3):2491–2521
  29. Cortes C, Mohri M, Rostamizadeh A (2011) Ensembles of Kernel Predictors. In: Proceedings of the Conference on Uncertainty in Artificial Intelligence, pp 145–152
  30. Girolami MA, Rogers S (2005) Hierarchic Bayesian Models for Kernel Learning. In: Proceedings of the International Conference on Machine Learning
  31. Li L, Zhang Z (2018) Semisupervised Domain Adaptation by Covariance Matching. IEEE Trans Pattern Anal Mach Intell 41(11):2724–2739
  32. Xu X, Deng J, Coutinho E et al (2018) Connecting Subspace Learning and Extreme Learning Machine in Speech Emotion Recognition. IEEE Trans Multimedia 21(3):795–808
  33. Zhou SH et al (2020) Multiple Kernel Clustering with Neighbor-Kernel Subspace Segmentation. IEEE Trans Neural Netw Learn Syst 31(4):1351–1362
  34. Yan W, Yang M, Li Y (2023) Robust Low Rank and Sparse Representation for Multiple Kernel Dimensionality Reduction. IEEE Trans Circuits Syst Video Technol 33(1):1–15
  35. Boothby WM (1975) An Introduction to Differentiable Manifolds and Riemannian Geometry. Academic Press, New York
  36. Fiori S (2012) Extended Hamiltonian Learning on Riemannian Manifolds: Numerical Aspects. IEEE Trans Neural Netw Learn Syst 23(1):7–21
  37. Sun Y, Gao J, Hong X et al (2015) Heterogeneous Tensor Decomposition for Clustering via Manifold Optimization. IEEE Trans Pattern Anal Mach Intell 38(3):476–489
  38. Mika S et al (1999) Fisher Discriminant Analysis with Kernels. In: Neural Networks for Signal Processing, pp 41–48
  39. Xu Z, Jin R, King I et al (2008) An Extended Level Method for Efficient Multiple Kernel Learning. In: Advances in Neural Information Processing Systems, pp 1825–1832
  40. Vishwanathan SVN, Sun Z, Ampornpunt N et al (2010) Multiple Kernel Learning and the SMO Algorithm. In: Advances in Neural Information Processing Systems 23
  41. Tenenbaum JB, De Silva V, Langford JC (2000) A Global Geometric Framework for Nonlinear Dimensionality Reduction. Science 290(5500):2319–2323
  42. Roweis ST, Saul LK (2000) Nonlinear Dimensionality Reduction by Locally Linear Embedding. Science 290(5500):2323
  43. Belkin M, Niyogi P (2003) Laplacian Eigenmaps for Dimensionality Reduction and Data Representation. Neural Comput 15(6):1373–1396
  44. Zhang Z, Zha H (2004) Principal Manifolds and Nonlinear Dimensionality Reduction via Tangent Space Alignment. SIAM J Sci Comput 26(1):313–338
  45. Boothby WM (1975) An Introduction to Differentiable Manifolds and Riemannian Geometry. Academic Press, New York
  46. https://github.com/gionuno/isomap
  47. https://archive.ics.uci.edu/
  48. http://yann.lecun.com/exdb/mnist/
  49. https://paperswithcode.com/dataset/orl
  50. https://paperswithcode.com/dataset/jaffe
  51. http://www.math.ucla.edu/wittman/mani


Acknowledgements

This work is supported in part by the Guangdong Basic and Applied Basic Research Foundation under Grant 2022A1515140103, the Research Projects of Ordinary Universities in Guangdong Province under Grant 2023KTSCX133, and the Featured Innovation Project of Foshan Education Bureau 2022DZXX06.

Author information


Contributions

Guo Niu: original draft, writing and review, funding acquisition. Nannan Zhu: editing, formal analysis. Zhengming Ma: formal analysis, oversight and leadership responsibility for the research. Xin Wang: formal analysis, performing the experiments. Xi Liu: performing the experiments. Zhou Yan: visualization and data presentation. Yuexia Zhou: performing the experiments.

Corresponding author

Correspondence to Nannan Zhu.

Ethics declarations

Ethical and informed consent for data used

As authors of the forthcoming research paper titled ‘RKHS reconstruction based on manifold learning for high-dimensional data’, we hereby assert our unwavering commitment to ethical standards and the acquisition of informed consent in the utilization of data. This declaration encapsulates the principles and practices integral to the ethical conduct of our technological research endeavors. We commit to transparency in our methodologies, providing a clear description of the tools, algorithms, and technologies employed in our research. Any potential impact on participants or stakeholders will be communicated transparently. Nannan Zhu 2024.7.9

Competing Interests statement

All authors disclosed no relevant relationships.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

The multiple kernel learning algorithm [26] based on structural risk primarily adopts the idea of support vector machines (SVM) to solve for the multiple kernel coefficients. First, the data are mapped to a feature space by a kernel function, and then an optimal hyperplane is sought in the feature space to achieve maximum-margin linear separation of the data. Let \(f\left( x \right) ={{w}^{T}}\varphi \left( x \right) +b\) be the expression of the hyperplane, where w is the linear coefficient vector, b is the offset, and \(\varphi \) is the feature mapping. The distance between the data and the hyperplane is denoted as

$$\begin{aligned} \gamma =\underset{i}\min\,\frac{{{l}_{i}}f\left( {{x}_{i}} \right) }{{{\left\| w \right\| }_{2}}}=\underset{i}\min\,\frac{{{l}_{i}}\left( {{w}^{T}}\varphi \left( {{x}_{i}} \right) +b \right) }{{{\left\| w \right\| }_{2}}} \end{aligned}$$
(23)

where \({{l}_{i}}\in \left\{ -1,+1 \right\} \) denotes the class label of sample \({{x}_{i}}\). The objective is to maximize this distance, which can be formulated as:

$$\begin{aligned} \underset{w,b}{\max}\,\underset{i}{\min }\,\frac{{{l}_{i}}\left( {{w}^{T}}\varphi \left( {{x}_{i}} \right) +b \right) }{{{\left\| w \right\| }_{2}}} \end{aligned}$$
(24)

The learning problem (24) is equivalent to

$$\begin{aligned} & \underset{w,b}{\min }\,\frac{1}{2}{{\left\| w \right\| }^{2}}\nonumber \\ & s.t.\ {{l}_{i}}\left( {{w}^{T}}\varphi \left( {{x}_{i}} \right) +b \right) \ge 1 \end{aligned}$$
(25)

By introducing slack variables \({{\xi }_{i}}\) into problem (25), the optimization problem becomes:

$$\begin{aligned} & \underset{w,b,\xi }{\min }\,\frac{1}{2}{{\left\| w \right\| }^{2}}+C\sum \limits _{i=1}^{N}{{{\xi }_{i}}}\nonumber \\ & s.t.\ {{l}_{i}}\left( {{w}^{T}}\varphi \left( {{x}_{i}} \right) +b \right) \ge 1-{{\xi }_{i}},\ {{\xi }_{i}}\ge 0,\ i=1,\cdots ,N \end{aligned}$$
(26)

where C is the penalty parameter. Furthermore, the Lagrangian function is constructed based on the Lagrange multiplier method:

$$\begin{aligned} L\left( w,b,\alpha ,\mu ,\xi \right) =\frac{1}{2}\left\| w \right\| _{2}^{2}+\sum \limits _{i=1}^{N}{\left( C-{{\alpha }_{i}}-{{\mu }_{i}} \right) {{\xi }_{i}}}-\sum \limits _{i=1}^{N}{{{\alpha }_{i}}\left( {{l}_{i}}\left( {{w}^{T}}\varphi \left( {{x}_{i}} \right) +b \right) -1 \right) } \end{aligned}$$
(27)

where \({{\alpha }_{i}}\ge 0\) and \({{\mu }_{i}}\ge 0\), \(i=1,\cdots ,N\), are the Lagrange multipliers associated with the margin constraints and the non-negativity constraints on the slack variables, respectively.

By taking the partial derivatives of the Lagrangian function with respect to \( w,b,\xi \) and setting them equal to zero, we obtain:

$$\begin{aligned} \frac{\partial L}{\partial w}=w-\sum \limits _{i=1}^{N}{{{\alpha }_{i}}{{l}_{i}}\varphi \left( {{x}_{i}} \right) }=0 \end{aligned}$$
(28)
$$\begin{aligned} \frac{\partial L}{\partial b}=\sum \limits _{i=1}^{N}{{{\alpha }_{i}}{{l}_{i}}}=0 \end{aligned}$$
(29)
$$\begin{aligned} \frac{\partial L}{\partial {{\xi }_{i}}}=C-{{\alpha }_{i}}-{{\mu }_{i}}=0,i=1,\cdots ,N \end{aligned}$$
(30)

Substituting equations (28)-(30) back into equation (27), we can obtain the minimum value of the Lagrangian function with respect to \(w,b,\xi \) as:

$$\begin{aligned} & \underset{w,b,\xi }{\min }\,L\left( w,b,\alpha ,\mu ,\xi \right) =-\frac{1}{2}\sum \limits _{i=1}^{N}{\sum \limits _{j=1}^{N}{{{\alpha }_{i}}{{\alpha }_{j}}{{l}_{i}}{{l}_{j}}}}\left\langle \varphi \left( {{x}_{i}} \right) ,\varphi \left( {{x}_{j}} \right) \right\rangle +\sum \limits _{i=1}^{N}{{{\alpha }_{i}}} \nonumber \\ & =-\frac{1}{2}\sum \limits _{i=1}^{N}{\sum \limits _{j=1}^{N}{{{\alpha }_{i}}{{\alpha }_{j}}{{l}_{i}}{{l}_{j}}}}k\left( {{x}_{i}},{{x}_{j}} \right) +\sum \limits _{i=1}^{N}{{{\alpha }_{i}}} \end{aligned}$$
(31)

The dual of the primal problem (26) is obtained by maximizing, over the multipliers \(\alpha \), the minimum value (31) of the Lagrangian function with respect to \(w,b,\xi \):

$$\begin{aligned} & \underset{\alpha }{\max }\,\sum \limits _{i=1}^{N}{{{\alpha }_{i}}}-\frac{1}{2}\sum \limits _{i=1}^{N}{\sum \limits _{j=1}^{N}{{{\alpha }_{i}}{{\alpha }_{j}}{{l}_{i}}{{l}_{j}}k\left( {{x}_{i}},{{x}_{j}} \right) }}\nonumber \\ & s.t.\ 0\le {{\alpha }_{i}}\le C,\ i=1,\cdots ,N\nonumber \\ & \quad \ \ \sum \limits _{i=1}^{N}{{{\alpha }_{i}}{{l}_{i}}=0} \end{aligned}$$
(32)

where the inequality constraint \(C\ge {{\alpha }_{i}}\ge 0,i=1,\cdots ,N\) is obtained by eliminating \({\mu }_{i}\) based on the inequality constraint \({{\mu }_{i}}\ge 0\) and the equality constraint \(C-{{\alpha }_{i}}-{{\mu }_{i}}=0\).
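
For completeness, the following sketch shows how the dual problem (32) can be solved in practice with an off-the-shelf solver once a (possibly multiple-kernel) Gram matrix has been precomputed. The toy data, the single RBF Gram matrix, and the parameter values are illustrative assumptions, and scikit-learn's SVC with a precomputed kernel is used only as a convenient stand-in for a generic dual SVM solver, not as the solver used in the paper.

```python
import numpy as np
from sklearn.svm import SVC

def rbf_gram(A, B, gamma=0.5):
    # Gram matrix k(a, b) = exp(-gamma * ||a - b||^2) between the rows of A and B.
    d2 = (np.sum(A ** 2, axis=1)[:, None]
          + np.sum(B ** 2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

# Toy two-class data with labels l_i in {-1, +1}.
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(0, 1, (40, 5)), rng.normal(2, 1, (40, 5))])
l_train = np.hstack([-np.ones(40), np.ones(40)])
X_test = np.vstack([rng.normal(0, 1, (10, 5)), rng.normal(2, 1, (10, 5))])

# Precompute the Gram matrix and solve the dual (32): the learned dual coefficients
# satisfy 0 <= alpha_i <= C and sum_i alpha_i * l_i = 0.
K_train = rbf_gram(X_train, X_train)
clf = SVC(C=1.0, kernel="precomputed").fit(K_train, l_train)

# Prediction uses f(x) = sum_i alpha_i * l_i * k(x_i, x) + b via the test/train Gram matrix.
K_test = rbf_gram(X_test, X_train)
pred = clf.predict(K_test)
print("support vectors:", clf.support_.size, "test predictions:", pred)
```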

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Niu, G., Zhu, N., Ma, Z. et al. RKHS reconstruction based on manifold learning for high-dimensional data. Appl Intell 55, 124 (2025). https://doi.org/10.1007/s10489-024-05923-y
