Abstract
The generalization ability of a classifier is closely related to both intra-class compactness and inter-class separability. Because many current dimensionality reduction methods, when used as pre-processors, often yield poor classification performance on real-life data, this paper proposes a new data preprocessing technique, called manifold contraction (MC), for classification-oriented learning tasks. The main idea behind MC is to seek a mapping that contracts the given multiple-manifold data so that the ratio of the intra-class scatter to the inter-class scatter is minimized. Moreover, to properly control the contraction level in MC, an adaptive MC (AMC) criterion is developed for the semi-supervised setting. Owing to its generality, MC can be applied not only in the original space but also in a reproducing kernel Hilbert space (RKHS), and it can easily be combined with dimensionality reduction methods to further improve classification performance. Experimental results show that MC, as a data preprocessor, is effective and promising for subsequent classification learning, especially when only a small number of labeled samples is available.
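The objective the abstract describes, contracting data so that the ratio of intra-class to inter-class scatter decreases, can be illustrated with a minimal sketch. This is not the paper's MC algorithm; `scatter_ratio` is a hypothetical helper computing the standard trace ratio trace(S_w)/trace(S_b), and the per-class shrink toward the class mean is only a toy stand-in for a learned contraction mapping.

```python
import numpy as np

def scatter_ratio(X, y):
    """Ratio of intra-class to inter-class scatter, trace(S_w) / trace(S_b).

    X: (n, d) data matrix; y: (n,) integer class labels.
    A preprocessor in the spirit of MC would seek a mapping of X
    that drives this ratio down before classification.
    """
    mu = X.mean(axis=0)                      # global mean
    Sw = np.zeros((X.shape[1], X.shape[1]))  # within-class scatter S_w
    Sb = np.zeros_like(Sw)                   # between-class scatter S_b
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        diff = Xc - mc
        Sw += diff.T @ diff
        d = (mc - mu).reshape(-1, 1)
        Sb += Xc.shape[0] * (d @ d.T)
    return np.trace(Sw) / np.trace(Sb)

# Toy example: shrinking each class toward its own mean leaves the class
# means (hence S_b) unchanged while reducing S_w, so the ratio drops.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(4, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
X_contracted = np.vstack([
    X[y == c].mean(axis=0) + 0.5 * (X[y == c] - X[y == c].mean(axis=0))
    for c in (0, 1)
])
assert scatter_ratio(X_contracted, y) < scatter_ratio(X, y)
```

Shrinking by a factor of 0.5 scales S_w by 0.25 while S_b is untouched, so the ratio falls by a factor of four; a real contraction mapping would instead be learned from both labeled and unlabeled data.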
Hu, E., Chen, S. & Yin, X. Manifold contraction for semi-supervised classification. Sci. China Inf. Sci. 53, 1170–1187 (2010). https://doi.org/10.1007/s11432-010-0066-0