Abstract
The dimensionality and volume of the data that must be processed when classifying intensive data streams can be prohibitively large. The aim of this paper is to analyze Johnson-Lindenstrauss type random projections as an approach to dimensionality reduction in pattern classification based on K-nearest-neighbor search. We show that in-class clustering of the data allows us to retain, after transformation to a lower dimension, the recognition accuracy obtained in the original high-dimensional space.
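The core idea in the abstract, projecting high-dimensional data with a Johnson-Lindenstrauss type random matrix and then running nearest-neighbor classification in the reduced space, can be sketched as follows. This is a minimal illustration with synthetic Gaussian data and a normal random projection; the paper's in-class clustering step and experimental setup are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic two-class data in a high-dimensional space (illustrative only).
d, k, n = 1000, 20, 200          # original dimension, reduced dimension, samples per class
X = np.vstack([rng.normal(0.0, 1.0, (n, d)),   # class 0
               rng.normal(0.5, 1.0, (n, d))])  # class 1, shifted mean
y = np.repeat([0, 1], n)

# Johnson-Lindenstrauss style projection: entries i.i.d. N(0, 1/k), so
# squared Euclidean distances are preserved in expectation.
R = rng.normal(0.0, 1.0 / np.sqrt(k), (d, k))
Z = X @ R                         # projected data, shape (2n, k)

def nn_classify(query, data, labels):
    """1-nearest-neighbor classification by Euclidean distance."""
    dists = np.linalg.norm(data - query, axis=1)
    return labels[np.argmin(dists)]

# Classify a new point (drawn from class 1's distribution) in the
# low-dimensional projected space instead of the original space.
q = rng.normal(0.5, 1.0, d)
pred = nn_classify(q @ R, Z, y)
```

The appeal of this scheme is that the projection matrix is data-independent and cheap to apply, so the expensive nearest-neighbor search runs over k-dimensional vectors rather than d-dimensional ones.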
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
Cite this paper
Skubalska-Rafajłowicz, E. (2010). Clustering of Data and Nearest Neighbors Search for Pattern Recognition with Dimensionality Reduction Using Random Projections. In: Rutkowski, L., Scherer, R., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds) Artificial Intelligence and Soft Computing. ICAISC 2010. Lecture Notes in Computer Science(), vol 6113. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13208-7_58
Print ISBN: 978-3-642-13207-0
Online ISBN: 978-3-642-13208-7