Abstract
As data volume and dimensionality continue to grow, effective and efficient methods are needed to obtain low-dimensional features that describe the true structure of the data. Most nonlinear dimensionality reduction (NLDR) methods use the Euclidean distance between data points to form a general idea of the structure of the data manifold. Isomap instead uses the geodesic distance between data points and then applies classical multidimensional scaling (cMDS) to obtain low-dimensional features. As the data size increases, Isomap becomes computationally expensive. To overcome this disadvantage, Landmark Isomap (L-Isomap) selects a subset of data points, called landmark points, and computes the geodesic distances from these points to all other (non-landmark) points. Traditionally, landmark points are selected at random, without considering any statistical property of the data manifold. We contend that the quality of the extracted features depends on the selection of the landmark points. In applications such as data classification, the net accuracy depends on the quality of the features, and hence landmark point selection might play a crucial role. In this paper, we propose a clustering approach to obtain the landmark points. The resulting landmark points are then used to represent the data, and Fisher’s linear discriminants are used for classification. The proposed method is tested on different datasets to verify the efficacy of the approach.
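The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the clustering algorithm, so a plain Lloyd-style k-means is assumed here, and the function names (`select_landmarks`, `landmark_geodesics`, `landmark_mds`), the neighbourhood size, and the helix test data are all hypothetical choices made for illustration. The embedding step follows the standard landmark MDS construction, and the Fisher discriminant classification stage is omitted.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import dijkstra
from scipy.spatial import cKDTree


def select_landmarks(X, n_landmarks, n_iter=50, seed=0):
    """Assumed selection rule: run k-means, then take the data point
    nearest to each centroid as a landmark (duplicates are merged)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=n_landmarks, replace=False)]
    for _ in range(n_iter):
        labels = cKDTree(centroids).query(X)[1]  # nearest centroid per point
        for j in range(n_landmarks):
            members = X[labels == j]
            if len(members):  # leave an empty cluster's centroid unchanged
                centroids[j] = members.mean(axis=0)
    nearest = cKDTree(X).query(centroids)[1]
    return np.unique(nearest)


def landmark_geodesics(X, landmarks, n_neighbors=8):
    """Shortest-path distances from each landmark to every point on a
    k-nearest-neighbour graph weighted by Euclidean distance."""
    n = len(X)
    dist, nbrs = cKDTree(X).query(X, k=n_neighbors + 1)  # column 0 is the point itself
    rows = np.repeat(np.arange(n), n_neighbors)
    graph = csr_matrix((dist[:, 1:].ravel(), (rows, nbrs[:, 1:].ravel())), shape=(n, n))
    return dijkstra(graph, directed=False, indices=landmarks)  # shape (m, n)


def landmark_mds(D, landmarks, n_components=2):
    """Landmark MDS: classical MDS on the landmarks, then distance-based
    triangulation of all points. D holds landmark-to-all geodesics."""
    sq = D ** 2                                  # (m, n) squared distances
    sq_ll = sq[:, landmarks]                     # (m, m) landmark-to-landmark block
    m = len(landmarks)
    H = np.eye(m) - 1.0 / m                      # centering matrix
    B = -0.5 * H @ sq_ll @ H
    vals, vecs = np.linalg.eigh(B)
    top = np.argsort(vals)[::-1][:n_components]  # largest eigenvalues first
    pinv_t = vecs[:, top] / np.sqrt(vals[top])   # (m, d) scaled eigenvectors
    mean_sq = sq_ll.mean(axis=1, keepdims=True)  # mean squared distance per landmark
    return (-0.5 * pinv_t.T @ (sq - mean_sq)).T  # (n, d) embedding


if __name__ == "__main__":
    # Hypothetical test data: a helix, i.e. a 1-D manifold embedded in 3-D.
    t = np.linspace(0, 3 * np.pi, 400)
    X = np.column_stack([np.cos(t), np.sin(t), 0.3 * t])
    lm = select_landmarks(X, n_landmarks=20)
    Y = landmark_mds(landmark_geodesics(X, lm), lm, n_components=1)
    print(Y.shape)
```

Choosing the data point nearest each centroid, rather than the centroid itself, keeps every landmark on the sampled manifold, so the graph distances from it are well defined. The sketch assumes the neighbourhood graph is connected; otherwise some geodesic distances are infinite and the embedding is not meaningful.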
Rashmi, M., Sankaran, P. Optimal Landmark Point Selection Using Clustering for Manifold Modeling and Data Classification. J Classif 36, 94–112 (2019). https://doi.org/10.1007/s00357-018-9285-7