Abstract
In this article, we review unsupervised neural network learning procedures that can be applied to preprocess raw data and extract useful features for subsequent classification. The learning algorithms reviewed here are grouped into three sections: information-preserving methods, density estimation methods, and feature extraction methods. Each of these major sections concludes with a discussion of successful applications of the methods to real-world problems.
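As a concrete illustration of the kind of feature extraction method surveyed here, the sketch below implements Oja's (1982) single-unit Hebbian learning rule, a classic unsupervised procedure that converges to the principal component of zero-mean input data. The data, learning rate, and variable names are illustrative assumptions, not taken from the article itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data: 500 zero-mean samples with one dominant direction.
X = rng.normal(size=(500, 2)) @ np.array([[3.0, 0.0], [0.0, 0.5]])

w = rng.normal(size=2)              # random initial weight vector
eta = 0.01                          # learning rate (illustrative choice)

for epoch in range(20):
    for x in X:
        y = w @ x                   # unit activation y = w^T x
        w += eta * y * (x - y * w)  # Oja's rule: Hebbian term plus decay

# w converges (approximately) to the leading eigenvector of the
# input covariance, normalized to unit length.
print(w / np.linalg.norm(w))
```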
Additional information
The first author is supported by research grants from the James S. McDonnell Foundation (grant #93–95) and the Natural Sciences and Engineering Research Council of Canada. For part of this work, the second author was supported by a Temporary Lectureship from the Academic Initiative of the University of London, and by a grant (GR/J38987) from the Science and Engineering Research Council (SERC) of the UK.
Cite this article
Becker, S., Plumbley, M. Unsupervised neural network learning procedures for feature extraction and classification. Appl Intell 6, 185–203 (1996). https://doi.org/10.1007/BF00126625