Abstract
Similarity-based embedding is a paradigm that has recently gained interest in the field of nonlinear dimensionality reduction. It provides an elegant framework that naturally emphasizes the preservation of the local structure of the data set. An emblematic method in this trend is t-distributed stochastic neighbor embedding (t-SNE), which the recent literature acknowledges as an efficient method. This paper aims to analyze the reasons for this success, together with the impact of the two metaparameters embedded in the method. Moreover, the paper shows that t-SNE can be interpreted as a distance-preserving method with a specific distance transformation, which links it to existing methods. Experiments on artificial data support the theoretical discussion.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lee, J.A., Verleysen, M. (2010). On the Role and Impact of the Metaparameters in t-distributed Stochastic Neighbor Embedding. In: Lechevallier, Y., Saporta, G. (eds) Proceedings of COMPSTAT'2010. Physica-Verlag HD. https://doi.org/10.1007/978-3-7908-2604-3_31
DOI: https://doi.org/10.1007/978-3-7908-2604-3_31
Publisher Name: Physica-Verlag HD
Print ISBN: 978-3-7908-2603-6
Online ISBN: 978-3-7908-2604-3
eBook Packages: Mathematics and Statistics (R0)