Research article · DOI: 10.1145/3507623.3507637 · CIIS Conference Proceedings

Manifold Learning Projection Quality Quantitative Evaluation

Published: 11 April 2022

Abstract

A large number of dimensions may cause a variety of problems in real-world applications: some dimensions may be redundant and can worsen the quality of the workflow output, and, in the vast majority of practical datasets, data are distributed along a highly nonlinear manifold whose structure is unknown. This paper focuses on analyzing the outputs of nonlinear dimensionality reduction, or Manifold Learning, techniques. We introduce three meaningful measures that provide context for projections onto lower-dimensional spaces. These measures enable us to compare techniques with one another and assist in choosing suitable hyperparameters. Moreover, we propose to view projections from the standpoint of simplicial complex distortion. In connection with that, we describe a dimension-agnostic, graph-based data tessellation technique that builds a simplicial skeleton of high-dimensional data. Alongside our new tessellation technique, we evaluate the proposed quality measures on Delaunay-tessellation-based simplicial approximations of manifolds.
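To give a concrete flavor of the kind of projection-quality scoring the abstract alludes to, here is a minimal sketch of one common neighborhood-preservation measure: the average Jaccard similarity between each point's k-nearest-neighbor set in the original space and in the projection. This is an illustrative stand-in, not the paper's three proposed measures, and the function names (`knn_indices`, `mean_knn_jaccard`) are hypothetical:

```python
import numpy as np

def knn_indices(X, k):
    """Indices of each point's k nearest neighbors (self excluded)."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)          # a point is not its own neighbor
    return np.argsort(d, axis=1)[:, :k]

def mean_knn_jaccard(X_high, X_low, k=5):
    """Average Jaccard overlap of k-NN sets before and after projection.

    1.0 means every point keeps exactly the same neighbors; values near
    0 mean the projection scrambled the local structure.
    """
    hi, lo = knn_indices(X_high, k), knn_indices(X_low, k)
    scores = []
    for a, b in zip(hi, lo):
        sa, sb = set(a.tolist()), set(b.tolist())
        scores.append(len(sa & sb) / len(sa | sb))
    return float(np.mean(scores))

# Sanity check: a uniform rescaling preserves all neighbor ranks,
# so the score should be exactly 1.0.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 10))
print(mean_knn_jaccard(X, X * 2.0, k=5))
```

A distance-distorting projection (e.g., keeping only two of the ten coordinates) would typically score well below 1.0, which is what makes such measures useful for comparing techniques and tuning hyperparameters.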

Supplementary Material

p77-belov-supplement (p77-belov-supplement.pdf)
Presentation slides


Published In

CIIS '21: Proceedings of the 2021 4th International Conference on Computational Intelligence and Intelligent Systems
November 2021
95 pages
ISBN:9781450385930
DOI:10.1145/3507623

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. dimensionality reduction
  2. machine learning
  3. manifold learning
  4. noise reduction

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • Grant Agency of the Czech Technical University in Prague
