Abstract
This paper presents an unsupervised multi-modal learning system that learns an associative representation from two input modalities, or channels, such that input on one channel correctly generates the associated response on the other, and vice versa. In this way, the system develops a kind of supervised classification model meant to simulate aspects of human associative memory. The system uses a deep learning architecture (DLA) composed of two input/output channels, each formed from stacked Restricted Boltzmann Machines (RBMs), and an associative memory network that combines the two channels using a simple back-fitting algorithm. The DLA is trained on pairs of MNIST handwritten digit images to develop hierarchical features and associative representations that can reconstruct one image given its paired associate. Experiments show that the multi-modal learning system generates models that are as accurate as back-propagation networks, with the added advantages of a bi-directional network and unsupervised learning from either paired or non-paired training examples.
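The two-channel arrangement described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes one RBM per channel instead of a stack, plain one-step contrastive divergence (CD-1) in place of the paper's back-fitting algorithm, and random toy binary vectors instead of MNIST. The names `RBM`, `train`, and `recall_b_from_a` and all layer sizes are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Bernoulli RBM trained with one-step contrastive divergence (CD-1)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)
        self.b_h = np.zeros(n_hidden)

    def up(self, v):
        # Hidden unit probabilities given visible units.
        return sigmoid(v @ self.W + self.b_h)

    def down(self, h):
        # Visible unit probabilities given hidden units.
        return sigmoid(h @ self.W.T + self.b_v)

    def cd1(self, v0, lr=0.1):
        # Positive phase, one Gibbs step, then the CD-1 parameter update.
        h0 = self.up(v0)
        h0_sample = (self.rng.random(h0.shape) < h0).astype(float)
        v1 = self.down(h0_sample)
        h1 = self.up(v1)
        n = v0.shape[0]
        self.W += lr * (v0.T @ h0 - v1.T @ h1) / n
        self.b_v += lr * (v0 - v1).mean(axis=0)
        self.b_h += lr * (h0 - h1).mean(axis=0)

def train(rbm, data, epochs=20, batch=10):
    for _ in range(epochs):
        for i in range(0, len(data), batch):
            rbm.cd1(data[i:i + batch])

def recall_b_from_a(a, chan_a, chan_b, assoc, n_code, steps=5):
    # Clamp channel A's code and let the associative layer fill in B's code.
    code_a = chan_a.up(a)
    code_b = np.full((a.shape[0], n_code), 0.5)  # uninformative start
    for _ in range(steps):
        joint = np.hstack([code_a, code_b])
        code_b = assoc.down(assoc.up(joint))[:, n_code:]
    return chan_b.down(code_b)

rng = np.random.default_rng(0)
n_pixels, n_code, n_assoc = 64, 32, 48  # assumed toy sizes
imgs_a = (rng.random((100, n_pixels)) < 0.3).astype(float)
imgs_b = 1.0 - imgs_a  # toy pairing: each B image is the inverse of its A image

chan_a = RBM(n_pixels, n_code, rng)
chan_b = RBM(n_pixels, n_code, rng)
assoc = RBM(2 * n_code, n_assoc, rng)

# Channels learn features without labels; the top layer learns the pairing.
train(chan_a, imgs_a)
train(chan_b, imgs_b)
train(assoc, np.hstack([chan_a.up(imgs_a), chan_b.up(imgs_b)]))

recon_b = recall_b_from_a(imgs_a[:5], chan_a, chan_b, assoc, n_code)
print(recon_b.shape)
```

Because the associative layer is trained on the concatenated codes of both channels, the same procedure can run in the opposite direction by clamping channel B's code instead, which is the bi-directional property the abstract refers to.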
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Wang, T., Silver, D.L. (2015). Learning Paired-Associate Images with an Unsupervised Deep Learning Architecture. In: Barbosa, D., Milios, E. (eds) Advances in Artificial Intelligence. Canadian AI 2015. Lecture Notes in Computer Science, vol 9091. Springer, Cham. https://doi.org/10.1007/978-3-319-18356-5_22
DOI: https://doi.org/10.1007/978-3-319-18356-5_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18355-8
Online ISBN: 978-3-319-18356-5
eBook Packages: Computer Science (R0)