Abstract
The estimation of relevant information-theoretic quantities, such as entropy, mutual information, and various divergences, is computationally expensive in high dimensions. These estimators, however, can often be computed from the pairwise Euclidean distances of the sample points, which makes them well suited to random projection (RP) based low-dimensional embeddings. The Johnson-Lindenstrauss (JL) lemma gives a theoretical bound on the dimension of such embeddings. We adapt the RP technique to the estimation of information-theoretic quantities. Intriguingly, we find that embeddings into extremely small dimensions, far below the bounds of the JL lemma, still provide satisfactory estimates for the original task. We illustrate this on the Independent Subspace Analysis (ISA) task, where we combine RP-based dimension reduction with a simple ensemble method. We gain a considerable speed-up, with the potential for real-time parallel estimation of high-dimensional information-theoretic quantities.
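To make the idea concrete: for n points and distortion ε, the JL lemma preserves all pairwise distances within a factor of 1 ± ε for embedding dimensions of order (log n)/ε²; the paper's observation is that far smaller dimensions already suffice for the estimation task. The following is a minimal sketch, not the authors' implementation, of the RP-ensemble scheme paired with one standard distance-based estimator, the Kozachenko-Leonenko k-nearest-neighbor entropy estimator. The projected dimension d_low, ensemble size n_proj, and neighbor count k are illustrative assumptions.

import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma, gammaln

def kl_entropy(x, k=3):
    # Kozachenko-Leonenko kNN entropy estimate (in nats) for samples x of shape (n, d).
    n, d = x.shape
    dist, _ = cKDTree(x).query(x, k=k + 1)  # k+1 because each point's nearest neighbor is itself
    eps = dist[:, k]                        # distance to the k-th true neighbor
    log_vd = (d / 2.0) * np.log(np.pi) - gammaln(d / 2.0 + 1)  # log-volume of the unit d-ball
    return digamma(n) - digamma(k) + log_vd + d * np.mean(np.log(eps))

def rp_ensemble_entropy(x, d_low=8, n_proj=20, k=3, seed=0):
    # Average the low-dimensional entropy estimates over independent Gaussian RPs.
    # Each projection can run on a separate processor, hence the parallel speed-up.
    rng = np.random.default_rng(seed)
    d = x.shape[1]
    h = 0.0
    for _ in range(n_proj):
        # Gaussian RP matrix; the 1/sqrt(d_low) scaling keeps pairwise
        # Euclidean distances approximately unchanged (JL lemma).
        r = rng.normal(size=(d, d_low)) / np.sqrt(d_low)
        h += kl_entropy(x @ r, k=k)
    return h / n_proj

Note that the estimate computed in the projected space is not the entropy of the original d-dimensional variable; in the ISA setting it serves as a proxy inside the cost function, where only the relative values across candidate separations matter.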
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
Cite this paper
Szabó, Z., Lőrincz, A. (2009). Fast Parallel Estimation of High Dimensional Information Theoretical Quantities with Low Dimensional Random Projection Ensembles. In: Adali, T., Jutten, C., Romano, J.M.T., Barros, A.K. (eds) Independent Component Analysis and Signal Separation. ICA 2009. Lecture Notes in Computer Science, vol 5441. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00599-2_19
DOI: https://doi.org/10.1007/978-3-642-00599-2_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-00598-5
Online ISBN: 978-3-642-00599-2
eBook Packages: Computer Science (R0)