Abstract
The discovery of structures hidden in high-dimensional data space is of great significance for understanding and further processing the data. Real-world datasets are often composed of multiple low-dimensional patterns, whose interlacement may impede our ability to understand the distribution rules of the data. Few existing methods focus on detecting and extracting the manifolds that represent distinct patterns. Inspired by the nonlinear dimensionality reduction method ISOmap, in this paper we present a novel approach called Multi-Manifold Partition to identify interlacing low-dimensional patterns. The algorithm has three steps: first, a neighborhood graph is built to capture the intrinsic topological structure of the input data; then, the dimensional uniformity of neighboring nodes is analyzed to discover segments of patterns; finally, segments that possibly belong to the same low-dimensional structure are combined to obtain a global representation of the distribution rules. Experiments on synthetic data as well as real problems are reported. The results show that this new approach to exploratory data analysis is effective and may enhance our understanding of the data distribution.
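The three-step pipeline outlined above can be sketched in code. The following is a minimal illustration, not the authors' actual algorithm: it assumes a k-nearest-neighbor graph for step one, local PCA as one common way to estimate the dimension at each node for step two (the paper's uniformity criterion may differ), and a simple union-find merge of neighboring nodes with matching local dimension for step three.

```python
import numpy as np

def knn_graph(X, k):
    """Step 1: indices of the k nearest neighbors of each point."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    return np.argsort(D, axis=1)[:, 1:k + 1]  # column 0 is the point itself

def local_dimension(X, nbrs, var_threshold=0.95):
    """Step 2: estimate each point's local dimension via PCA on its
    neighborhood (number of components explaining var_threshold variance)."""
    dims = np.empty(len(X), dtype=int)
    for i, idx in enumerate(nbrs):
        P = X[idx] - X[idx].mean(axis=0)
        s = np.linalg.svd(P, compute_uv=False) ** 2  # local covariance spectrum
        ratios = np.cumsum(s) / s.sum()
        dims[i] = np.searchsorted(ratios, var_threshold) + 1
    return dims

def segment(nbrs, dims):
    """Step 3: merge neighboring points with equal local dimension
    (union-find); returns a component label per point."""
    parent = list(range(len(dims)))
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a
    for i in range(len(dims)):
        for j in nbrs[i]:
            if dims[i] == dims[j]:
                ra, rb = find(i), find(int(j))
                if ra != rb:
                    parent[rb] = ra
    return np.array([find(i) for i in range(len(dims))])
```

For example, points sampled along a line embedded in 3-D receive local dimension 1 and collapse into a single segment, while points on a planar patch receive local dimension 2; a mixture of the two would be split apart at step three.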
References
Bourlard, H., & Kamp, Y. (1988). Auto-association by multilayer perceptrons and singular value decomposition. Biological Cybernetics, 59, 291–294.
Bruske, J., & Sommer, G. (1998). An algorithm for intrinsic dimensionality estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(5), 572–575.
Camastra, F., & Vinciarelli, A. (2002). Estimating the intrinsic dimension of data with a fractal-based method. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(10), 1404–1407.
Chang, H. D., & Wang, J. F. (1994). A robust stroke extraction method for handwritten Chinese characters. International Journal of Pattern Recognition and Artificial Intelligence, 8(5), 1223–1239.
Cormen, T. H., Leiserson, C. E., Rivest, R. L., & Stein, C. (2001). Introduction to algorithms (2nd ed., pp. 595–601). Cambridge: MIT and McGraw-Hill.
Costa, J., & Hero, A. O. (2004). Geodesic entropic graphs for dimension and entropy estimation in manifold learning. IEEE Transactions on Signal Processing, 52(8), 2210–2221.
Cox, T., & Cox, M. (1994). Multidimensional scaling. London: Chapman & Hall.
Duda, R. O., Hart, P. E., & Stork, D. G. (2001). Pattern classification (2nd ed., pp. 517–556). New York: Wiley.
Fukunaga, K., & Olsen, D. R. (1971). An algorithm for finding intrinsic dimensionality of data. IEEE Transactions on Computers, 20, 176–183.
Hyvarinen, A., & Oja, E. (2000). Independent component analysis: Algorithms and applications. Neural Networks, 13, 411–430.
Jain, A. K., & Dubes, R. C. (1988). Algorithms for clustering data. Englewood Cliffs: Prentice Hall.
Jolliffe, I. T. (1986). Principal component analysis. New York: Springer-Verlag.
Kambhatla, N., & Leen, T. K. (1997). Dimension reduction by local principal component analysis. Neural Computation, 9(7), 1493–1516.
Kégl, B. (2003). Intrinsic dimension estimation using packing numbers. In Advances in neural information processing systems 15 (NIPS2002). Cambridge: MIT.
Kohonen, T. (2001). Self-organizing maps, third extended edition, Springer series in information sciences (Vol. 30). Berlin, Heidelberg, New York: Springer.
Kumar, V., Grama, A., Gupta, A., & Karypis, G. (1994). Introduction to parallel computing: Design and analysis of algorithms (pp. 257–297). Redwood City: Benjamin/Cummings.
Levina, E., & Bickel, P. J. (2005). Maximum likelihood estimation of intrinsic dimension. In L. K. Saul, Y. Weiss, & L. Bottou (Eds.), Advances in neural information processing systems 17 (NIPS2004). Cambridge: MIT.
Pettis, K., Bailey, T., Jain, A., & Dubes, R. (1979). An intrinsic dimensionality estimator from near-neighbor information. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1, 25–37.
Raginsky, M., & Lazebnik, S. (2006). Estimation of intrinsic dimensionality using high-rate vector quantization. In Y. Weiss, B. Schölkopf, & J. Platt (Eds.), Advances in neural information processing systems 18 (NIPS2005). Cambridge: MIT.
Roweis, S. T., & Saul, L. K. (2000). Nonlinear dimensionality reduction by locally linear embedding. Science, 290, 2323–2326.
Sklansky, J., & Wassel, G. N. (1981). Pattern classifiers and trainable machines (pp. 112–113). New York: Springer-Verlag.
Tenenbaum, J. B., de Silva, V., & Langford, J. C. (2000). A global geometric framework for nonlinear dimensionality reduction. Science, 290, 2319–2323.
Verveer, P. J., & Duin, R. P. W. (1995). An evaluation of intrinsic dimensionality estimators. IEEE Transactions on Pattern Analysis and Machine Intelligence, 17, 81–86.
Zeng, J., & Liu, Z. Q. (2006). Stroke segmentation of Chinese characters using Markov random fields. In Proceedings of 18th international conference on pattern recognition (ICPR’06), (pp. 868–871).
Cite this article
Ban, T., Zhang, C. & Abe, S. A new approach to discover interlacing data structures in high-dimensional space. J Intell Inf Syst 33, 3–22 (2009). https://doi.org/10.1007/s10844-008-0055-6