Abstract
The ability to develop representations of components and to recombine them in new but compositionally meaningful ways is considered a hallmark of human cognition that machines have not yet matched. The Omniglot challenge taps into this deficit by posing several one-shot/few-shot generation and classification tasks on handwritten character trajectories. In contrast to the original approach of providing character components, we investigate how compositional representations can develop naturally within a generative LSTM model. The network's performance and the underlying mechanisms are examined on the original Omniglot dataset and on our own, more representative dataset. We show that solving the challenge becomes possible because, during training, the designed LSTM network fosters the learning of compositional representations, which it can then quickly reassemble into new, unseen, but related character trajectories. Evidence is provided by several experiments, including an analysis of the system's latent states, which reveals the emergent compositional structures via t-SNE, and an evaluation of the network's performance when training and test alphabets do or do not share components. Overall, we show how compositionality can be fostered in latent, generative encodings, thus improving machine learning by further aligning technical methods with the cognitive mechanisms found in humans.
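The paper's code is not reproduced on this page. As a rough orientation only, the following is a minimal, hypothetical PyTorch sketch of the two ingredients the abstract names: a generative LSTM trained to predict pen-trajectory offsets, and a t-SNE embedding of its hidden states for latent-state analysis. The network size, the (dx, dy, pen-up) input encoding, the random stand-in data, and the simple MSE loss are our assumptions, not the authors' setup.

```python
# Hypothetical sketch (assumptions, not the authors' code): a generative LSTM
# for pen trajectories plus a t-SNE embedding of its hidden states.
import torch
import torch.nn as nn
from sklearn.manifold import TSNE

class TrajectoryLSTM(nn.Module):
    """Predicts the next pen offset (dx, dy, pen-up) from the current one."""
    def __init__(self, hidden_size=128):
        super().__init__()
        self.lstm = nn.LSTM(input_size=3, hidden_size=hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 3)

    def forward(self, x, state=None):
        h, state = self.lstm(x, state)           # h: (batch, steps, hidden)
        return self.head(h), h, state

model = TrajectoryLSTM()
optim = torch.optim.Adam(model.parameters(), lr=1e-3)

# Stand-in data: 16 random trajectories of 50 (dx, dy, pen-up) steps each.
seq = torch.randn(16, 50, 3)
inputs, targets = seq[:, :-1], seq[:, 1:]

# One next-step prediction training step (placeholder loss choice).
optim.zero_grad()
pred, hidden, _ = model(inputs)
loss = nn.functional.mse_loss(pred, targets)
loss.backward()
optim.step()

# Latent-state analysis: embed all hidden states in 2-D with t-SNE.
states = hidden.detach().reshape(-1, hidden.shape[-1]).numpy()
embedding = TSNE(n_components=2, perplexity=30).fit_transform(states)
print(embedding.shape)                           # (16 * 49, 2)
```

In a sketch like this, hidden states collected across many characters could then be plotted in the 2-D embedding to check whether states belonging to shared sub-strokes cluster together, which is the kind of emergent compositional structure the abstract describes.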
Notes
1. Similar results were found for other test alphabets; a deeper interaction analysis is beyond the scope of this paper. Test Burmese: training Balinese (0.273) < Greek (0.280) < Latin (0.299). Test Latin: training Greek/Burmese (0.230) < Balinese (0.251). Test Greek: training Latin (0.329) < Burmese (0.334) < Balinese (0.339).
Acknowledgements
We thank Marcel Molière for help with the t-SNE plots, Thilo Hagendorff for helpful comments on the manuscript, and Maximus Mutschler for maintaining the GPU cluster of the BMBF-funded project Training Center for Machine Learning, on which the results were computed. This research was funded by the German Research Foundation (DFG) within Priority Program SPP 2134, project "Development of the agentive self" (BU 1335/11-1, EL 253/8-1). MB is part of the Machine Learning Cluster of Excellence, EXC number 2064/1, project number 390727645.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Fabi, S., Otte, S., Butz, M.V. (2021). Fostering Compositionality in Latent, Generative Encodings to Solve the Omniglot Challenge. In: Farkaš, I., Masulli, P., Otte, S., Wermter, S. (eds.) Artificial Neural Networks and Machine Learning – ICANN 2021. Lecture Notes in Computer Science, vol. 12892. Springer, Cham. https://doi.org/10.1007/978-3-030-86340-1_42
Print ISBN: 978-3-030-86339-5
Online ISBN: 978-3-030-86340-1