Abstract
High-dimensional data are difficult to explore and analyze because they are highly correlated and redundant. Although previous dimensionality reduction methods have achieved promising performance, some limitations remain. For example, the distribution of the data in the embedding space cannot be approximated adaptively, and the parameters of these models lack interpretability. To address these problems, this paper proposes a novel dimensionality reduction method named t-Distribution Adaptive Manifold Embedding (t-AME). First, t-AME constructs the pairwise distance similarity probability in the embedding space using the Student-t distribution, and distributions generated by different degrees of freedom are learned from the data itself to better match high-dimensional data distributions. An objective function and a corresponding optimization strategy are then designed to pull similar points together and push dissimilar points apart. As a result, both the local and global structure of the original data are well preserved in the embedding space. Finally, numerical experiments on synthetic and real datasets show that the proposed method achieves a significant improvement over representative state-of-the-art dimensionality reduction methods.
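To make the core idea concrete, the sketch below computes pairwise embedding-space similarities under a Student-t kernel with `nu` degrees of freedom, as in the t-SNE family of methods. This is an illustrative sketch only: the function name `t_similarity` and the fixed `nu` argument are assumptions for exposition; in t-AME the degrees of freedom are learned from the data rather than fixed.

```python
import numpy as np

def t_similarity(Y, nu=1.0):
    """Pairwise similarity probabilities for embedding points Y
    under a Student-t kernel with `nu` degrees of freedom.
    Illustrative sketch: t-AME adapts `nu` to the data, whereas
    this version takes it as a fixed parameter."""
    # Squared Euclidean distances between all pairs of embedding points.
    sq = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    # Heavy-tailed Student-t kernel; nu = 1 recovers the Cauchy
    # kernel used by t-SNE, and larger nu gives lighter tails.
    w = (1.0 + sq / nu) ** (-(nu + 1.0) / 2.0)
    np.fill_diagonal(w, 0.0)   # exclude self-similarity
    return w / w.sum()         # normalize to a probability distribution
```

Because the kernel's tail weight varies with `nu`, learning it lets the embedding trade off how strongly dissimilar points are pushed apart, which is the adaptivity the abstract refers to.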
Acknowledgements
This work was supported by the National Natural Science Foundation of China under Grant nos. 12001057 and 61976174, the Fundamental Research Funds for the Central Universities in Chang'an University under Grant nos. 300102122101 and 300102120201, and the Key Research and Development Program of Shaanxi Province of China under Grant no. 2021NY-170.
Author information
Contributions
Changpeng Wang: Conceptualization, Writing - review & editing. Linlin Feng: Software, Investigation. Lijuan Yang: Formal analysis, Validation. Tianjun Wu: Reviewing, Editing. Jiangshe Zhang: Methodology, Supervision.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wang, C., Feng, L., Yang, L. et al. Dimensionality reduction by t-Distribution adaptive manifold embedding. Appl Intell 53, 23853–23863 (2023). https://doi.org/10.1007/s10489-023-04838-4