
Fostering Compositionality in Latent, Generative Encodings to Solve the Omniglot Challenge

  • Conference paper
  • In: Artificial Neural Networks and Machine Learning – ICANN 2021 (ICANN 2021)

Abstract

The ability to develop representations of components and to recombine them in new but compositionally meaningful ways is considered a hallmark of human cognition that machines have not yet reached. The Omniglot challenge taps into this deficit by posing several one-shot/few-shot generation and classification tasks on handwritten character trajectories. In contrast to the original approach of providing character components, we investigated how compositional representations can develop naturally within a generative LSTM model. The network’s performance and the underlying mechanisms are examined on the original Omniglot dataset and on our own, more representative dataset. We show that solving the challenge becomes possible because, during training, the designed LSTM network fosters the learning of compositional representations, which it can quickly reassemble into new, unseen, but related character trajectories. Evidence is provided by several experiments, including an analysis of the latent states of the system, which reveals the emergent compositional structures with t-SNE, and an evaluation of the network’s performance when training and test alphabets do or do not share components. Overall, we show how compositionality can be fostered in latent, generative encodings, thus improving machine learning by further aligning technical methods with cognitive mechanisms in humans.
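This page gives only the abstract, so the following is a minimal, illustrative sketch rather than the authors' implementation: a generative LSTM that predicts the next step of a handwritten character trajectory, assuming each step is encoded as (dx, dy, pen-state), as is common for online handwriting data, trained with a simple next-step MSE loss, and with its final latent states embedded via t-SNE as the abstract describes. All module names and hyperparameters here are assumptions.

    # Minimal sketch (not the authors' code): a generative LSTM over
    # pen trajectories, with latent states inspected via t-SNE.
    import torch
    import torch.nn as nn
    from sklearn.manifold import TSNE

    class TrajectoryLSTM(nn.Module):
        def __init__(self, input_size=3, hidden_size=128):
            super().__init__()
            # one step = (dx, dy, pen_down); hidden size is an assumption
            self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
            self.head = nn.Linear(hidden_size, input_size)

        def forward(self, x, state=None):
            h, state = self.lstm(x, state)   # h: (batch, time, hidden)
            return self.head(h), state       # next-step predictions

    model = TrajectoryLSTM()
    seq = torch.randn(8, 50, 3)              # dummy batch of trajectories
    pred, (h_n, c_n) = model(seq)
    loss = nn.functional.mse_loss(pred[:, :-1], seq[:, 1:])  # predict t+1
    loss.backward()

    # Embed the final hidden states in 2D to look for compositional
    # clusters, in the spirit of the abstract's t-SNE analysis.
    states = h_n.squeeze(0).detach().numpy()  # (batch, hidden_size)
    embedding = TSNE(n_components=2, perplexity=5.0).fit_transform(states)

After training on full alphabets, such an embedding would be computed over latent states collected across many characters; the tiny batch here only keeps the sketch self-contained.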


Notes

  1. Similar results were found for other test alphabets; a deeper interaction analysis is beyond the scope of this paper. Test Burmese: Training Balinese (0.273) < Greek (0.280) < Latin (0.299); Test Latin: Training Greek / Burmese (0.230) < Balinese (0.251); Test Greek: Training Latin (0.329) < Burmese (0.334) < Balinese (0.339).
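The metric behind these values is not named on this page. Assuming they are averaged trajectory distances in the spirit of dynamic time warping, a standard way to compare handwriting trajectories of different lengths, such a pairwise comparison could look like the following sketch; the (n + m) length normalisation is one common convention, not necessarily the authors' choice.

    # Illustrative DTW distance between two pen trajectories, each an
    # array of shape (time, 2) with (x, y) positions. Assumption: the
    # footnote's numbers are length-normalised distances of this kind.
    import numpy as np

    def dtw_distance(a, b):
        n, m = len(a), len(b)
        cost = np.full((n + 1, m + 1), np.inf)
        cost[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                d = np.linalg.norm(a[i - 1] - b[j - 1])   # pointwise cost
                cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                     cost[i, j - 1],      # deletion
                                     cost[i - 1, j - 1])  # match
        return cost[n, m] / (n + m)                       # length-normalised

    gen = np.random.rand(60, 2)   # e.g. a generated character
    ref = np.random.rand(55, 2)   # e.g. its target trajectory
    print(dtw_distance(gen, ref))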


Acknowledgements

We thank Marcel Molière for help with the t-SNE plots, Thilo Hagendorff for helpful comments on the manuscript, and Maximus Mutschler for maintaining the GPU cluster of the BMBF-funded project Training Center for Machine Learning, on which the results were computed. This research was funded by the German Research Foundation (DFG) within the Priority Program SPP 2134, project “Development of the agentive self” (BU 1335/11-1, EL 253/8-1). MB is part of the Machine Learning Cluster of Excellence, EXC number 2064/1, project number 390727645.

Author information


Correspondence to Sarah Fabi.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Fabi, S., Otte, S., Butz, M.V. (2021). Fostering Compositionality in Latent, Generative Encodings to Solve the Omniglot Challenge. In: Farkaš, I., Masulli, P., Otte, S., Wermter, S. (eds) Artificial Neural Networks and Machine Learning – ICANN 2021. Lecture Notes in Computer Science, vol. 12892. Springer, Cham. https://doi.org/10.1007/978-3-030-86340-1_42


  • DOI: https://doi.org/10.1007/978-3-030-86340-1_42

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-86339-5

  • Online ISBN: 978-3-030-86340-1

