Abstract
The goal of this paper is to learn the association and the semantic grounding of two sensory input signals that represent the same semantic concept. The two signals may or may not share the same modality. The task is inspired by infant learning. We propose a novel framework that consists of two symbolic Multilayer Perceptrons (MLPs) running in parallel. Both networks learn to ground the semantic concepts and agree on the same coding scheme for all semantic concepts. The training rule follows an EM approach. In contrast, the traditional setup for association tasks pre-defines the coding scheme before training. We tested our model in two cases: mono-modal and multi-modal. Our model achieves association accuracy similar to that of MLPs with pre-defined coding schemes.
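The idea of two parallel networks that agree on a coding scheme during training, rather than receiving it in advance, can be illustrated with a small sketch. This is not the authors' implementation: the toy data, the network sizes, the greedy one-to-one assignment in `e_step`, and all function names are assumptions made for illustration. It only shows the general EM-style alternation between choosing a concept-to-output-unit mapping from the current outputs of both networks (E-step) and updating both networks toward that mapping (M-step).

```python
import numpy as np

rng = np.random.default_rng(0)

def make_views(n_per_class=50, n_classes=3, dim=4):
    """Toy data (assumption): two noisy views of the same hidden concept."""
    concepts = np.repeat(np.arange(n_classes), n_per_class)
    centers_a = rng.normal(0, 3, (n_classes, dim))
    centers_b = rng.normal(0, 3, (n_classes, dim))
    xa = centers_a[concepts] + rng.normal(0, 0.5, (concepts.size, dim))
    xb = centers_b[concepts] + rng.normal(0, 0.5, (concepts.size, dim))
    return xa, xb, concepts

def init_mlp(dim, hidden, n_out):
    """One-hidden-layer MLP parameters."""
    return {"W1": rng.normal(0, 0.1, (dim, hidden)), "b1": np.zeros(hidden),
            "W2": rng.normal(0, 0.1, (hidden, n_out)), "b2": np.zeros(n_out)}

def forward(p, x):
    """Return hidden activations and softmax outputs."""
    h = np.tanh(x @ p["W1"] + p["b1"])
    z = h @ p["W2"] + p["b2"]
    z = z - z.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return h, e / e.sum(axis=1, keepdims=True)

def sgd_step(p, x, targets, lr=0.1):
    """One full-batch gradient step on the cross-entropy loss."""
    h, y = forward(p, x)
    dz = (y - targets) / len(x)
    dh = (dz @ p["W2"].T) * (1 - h ** 2)  # uses W2 before it is updated
    p["W2"] -= lr * h.T @ dz
    p["b2"] -= lr * dz.sum(0)
    p["W1"] -= lr * x.T @ dh
    p["b1"] -= lr * dh.sum(0)

def e_step(ya, yb, concepts, n_classes):
    """Greedy one-to-one assignment of concepts to output units (assumption)."""
    score = np.zeros((n_classes, n_classes))
    for c in range(n_classes):
        score[c] = ya[concepts == c].mean(0) + yb[concepts == c].mean(0)
    coding = np.full(n_classes, -1)
    for _ in range(n_classes):
        c, u = np.unravel_index(np.argmax(score), score.shape)
        coding[c] = u
        score[c, :] = -np.inf  # concept c is now taken
        score[:, u] = -np.inf  # output unit u is now taken
    return coding

def train(xa, xb, concepts, n_classes=3, hidden=8, epochs=200):
    """Alternate E-step (pick coding) and M-step (update both MLPs)."""
    net_a = init_mlp(xa.shape[1], hidden, n_classes)
    net_b = init_mlp(xb.shape[1], hidden, n_classes)
    coding = np.arange(n_classes)
    for _ in range(epochs):
        _, ya = forward(net_a, xa)
        _, yb = forward(net_b, xb)
        coding = e_step(ya, yb, concepts, n_classes)
        targets = np.eye(n_classes)[coding[concepts]]
        sgd_step(net_a, xa, targets)
        sgd_step(net_b, xb, targets)
    return net_a, net_b, coding

xa, xb, concepts = make_views()
net_a, net_b, coding = train(xa, xb, concepts)
_, ya = forward(net_a, xa)
_, yb = forward(net_b, xb)
# Association agreement: both networks select the same output unit.
agreement = float(np.mean(ya.argmax(1) == yb.argmax(1)))
print("learned coding scheme:", coding, "agreement:", agreement)
```

The point of the sketch is that `coding` is an output of training, not an input: which output unit stands for which concept is decided jointly from both networks' responses.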
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Raue, F., Palacio, S., Breuel, T.M., Byeon, W., Dengel, A., Liwicki, M. (2016). Symbolic Association Using Parallel Multilayer Perceptron. In: Villa, A., Masulli, P., Pons Rivero, A. (eds) Artificial Neural Networks and Machine Learning – ICANN 2016. ICANN 2016. Lecture Notes in Computer Science, vol 9887. Springer, Cham. https://doi.org/10.1007/978-3-319-44781-0_41
DOI: https://doi.org/10.1007/978-3-319-44781-0_41
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-44780-3
Online ISBN: 978-3-319-44781-0
eBook Packages: Computer Science, Computer Science (R0)