Abstract
Extracting meaningful patterns between objects, i.e., relational reasoning, is a crucial element of human reasoning and remains a challenging task for artificial intelligence. Our research objective was to investigate two end-to-end architectures augmented with a relational neural module on the challenging Cornell NLVR visual question answering task. We hoped that the relational reasoning capabilities on multi-modal inputs, for which relational networks are known, could be leveraged on this task. We achieved state-of-the-art performance, outperforming the results reported in related studies conducted on the same benchmark dataset.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Jankoski, K., Gievska, S. (2018). Evaluation of Multiple Approaches for Visual Question Reasoning. In: Kalajdziski, S., Ackovska, N. (eds) ICT Innovations 2018. Engineering and Life Sciences. ICT 2018. Communications in Computer and Information Science, vol 940. Springer, Cham. https://doi.org/10.1007/978-3-030-00825-3_18
DOI: https://doi.org/10.1007/978-3-030-00825-3_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00824-6
Online ISBN: 978-3-030-00825-3
eBook Packages: Computer Science, Computer Science (R0)