Abstract
Chapter 1 presented a synopsis of information theory to explain its foundations and its impact on the field of communication systems. In a nutshell, mutual information characterizes both the maximum rate of error-free information transmission (the channel capacity theorem) and the minimum information that must be sent to achieve a given distortion (the rate distortion theorem). In essence, given statistical knowledge of the data, these theorems show how the optimal communication system emerges, or self-organizes, from the data.
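To make the channel capacity statement concrete, the following is a minimal numerical sketch (not taken from the chapter) for a binary symmetric channel with crossover probability p: capacity is the mutual information I(X;Y) maximized over the input distribution, which for this channel reduces to the closed form C = 1 - H(p).

```python
import numpy as np

def binary_entropy(p):
    """Shannon entropy H(p) of a Bernoulli(p) source, in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

def mutual_information(px, crossover):
    """I(X;Y) in bits for a binary symmetric channel with the given
    crossover probability, driven by a Bernoulli(px) input."""
    # Output distribution P(Y = 1)
    py = px * (1 - crossover) + (1 - px) * crossover
    # I(X;Y) = H(Y) - H(Y|X); for this channel H(Y|X) = H(crossover)
    return binary_entropy(py) - binary_entropy(crossover)

# Capacity = max over input distributions of I(X;Y); for the binary
# symmetric channel the maximum is at the uniform input, C = 1 - H(p).
p = 0.1
inputs = np.linspace(0.001, 0.999, 999)
capacity = max(mutual_information(px, p) for px in inputs)
print(f"numerical capacity ~ {capacity:.4f} bits")
print(f"closed form 1 - H(p) = {1 - binary_entropy(p):.4f} bits")
```

The grid search recovers the closed-form value (about 0.531 bits for p = 0.1), illustrating the sense in which the optimal code rate is dictated by the data and channel statistics alone.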