Learning a Compositional Hierarchy of Disparity Descriptors for 3D Orientation Estimation in an Active Fixation Setting

Kalou, Katerina; Gibaldi, Agostino; Canessa, Andrea; Sabatini, Silvio P.

doi:10.1007/978-3-319-68612-7_22

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10614)

Included in the following conference series:

International Conference on Artificial Neural Networks

Abstract

Interacting with everyday objects requires the active visual system to reconstruct their local shape layout quickly and invariantly, through a series of fast binocular fixation movements that change the gaze direction on the 3-dimensional surface of the object. Active binocular viewing results in complex disparity fields that, although informative about the orientation in depth (e.g., the slant and tilt), depend strongly on the relative position of the eyes. By learning the statistical relationships between the differential properties of the disparity vector fields and the gaze directions, we expect to obtain more convenient, gaze-invariant visual descriptors. In this work, local approximations of disparity vector field differentials are combined in a hierarchical neural network that is trained to represent the slant and tilt from the disparity vector fields. Each gaze-related cell’s activation in the intermediate representation is recurrently merged with the other cells’ activations to gain the desired gaze-invariant selectivity. Although the representation has been tested on a limited set of combinations of slant and tilt, the resulting high classification rate validates the generalization capability of the approach.
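
The abstract does not say which differential descriptors are used. As a minimal sketch of what a "local approximation of disparity vector field differentials" could look like, the Python below estimates the first-order Jacobian of a 2D disparity field by central differences and splits it into divergence, curl, and two deformation components. This decomposition is a common choice for vector fields and is an assumption here, not the authors' stated method. The function names `jacobian_at`, `differential_components`, and `toy_field`, and the toy coefficients, are hypothetical.

```python
def jacobian_at(field, x, y, h=1e-3):
    """Central-difference Jacobian of a 2D vector field at (x, y).

    `field(x, y)` returns (dx, dy): horizontal and vertical disparity.
    Returns (a, b, c, d) = (d(dx)/dx, d(dx)/dy, d(dy)/dx, d(dy)/dy).
    """
    dx_xp, dy_xp = field(x + h, y)
    dx_xm, dy_xm = field(x - h, y)
    dx_yp, dy_yp = field(x, y + h)
    dx_ym, dy_ym = field(x, y - h)
    a = (dx_xp - dx_xm) / (2 * h)
    b = (dx_yp - dx_ym) / (2 * h)
    c = (dy_xp - dy_xm) / (2 * h)
    d = (dy_yp - dy_ym) / (2 * h)
    return a, b, c, d


def differential_components(a, b, c, d):
    """Split a 2x2 Jacobian into elementary first-order components."""
    return {
        "div": a + d,   # isotropic expansion or contraction
        "curl": c - b,  # rigid rotation
        "def1": a - d,  # stretch along x, compression along y
        "def2": b + c,  # shear along the diagonals
    }


def toy_field(x, y):
    # Hypothetical affine disparity field, used only for illustration.
    return 0.3 * x + 0.1 * y, -0.1 * x + 0.2 * y


comps = differential_components(*jacobian_at(toy_field, 0.0, 0.0))
```

In a setting like the one the abstract describes, descriptors of this kind would be computed per image patch and per gaze direction, then passed to the learned hierarchy. That last stage is not sketched here.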

Author information

Authors and Affiliations

Department of Informatics, Bioengineering, Robotics and System Engineering, University of Genoa, Genoa, Italy
Katerina Kalou, Agostino Gibaldi, Andrea Canessa & Silvio P. Sabatini
School of Optometry, University of California, Berkeley, Berkeley, CA, USA
Agostino Gibaldi

Corresponding author

Correspondence to Katerina Kalou.

Editor information

Editors and Affiliations

University of Lausanne, Lausanne, Switzerland
Alessandra Lintas
University of Genoa, Genoa, Italy
Stefano Rovetta
Universitat Pompeu Fabra, Barcelona, Spain
Paul F.M.J. Verschure
University of Lausanne, Lausanne, Switzerland
Alessandro E.P. Villa

About this paper

Cite this paper

Kalou, K., Gibaldi, A., Canessa, A., Sabatini, S.P. (2017). Learning a Compositional Hierarchy of Disparity Descriptors for 3D Orientation Estimation in an Active Fixation Setting. In: Lintas, A., Rovetta, S., Verschure, P., Villa, A. (eds) Artificial Neural Networks and Machine Learning – ICANN 2017. ICANN 2017. Lecture Notes in Computer Science, vol 10614. Springer, Cham. https://doi.org/10.1007/978-3-319-68612-7_22

DOI: https://doi.org/10.1007/978-3-319-68612-7_22
Published: 25 October 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-68611-0
Online ISBN: 978-3-319-68612-7
eBook Packages: Computer Science, Computer Science (R0)
