On the advantages of foveal mechanisms for active stereo systems in visual search tasks

de Figueiredo, Rui Pimentel; Bernardino, Alexandre; Santos-Victor, José; Araújo, Helder

doi:10.1007/s10514-017-9617-1

Published: 04 February 2017

Volume 42, pages 459–476 (2018)

Autonomous Robots

Rui Pimentel de Figueiredo ORCID: orcid.org/0000-0003-2443-5669¹,
Alexandre Bernardino¹,
José Santos-Victor¹ &
Helder Araújo¹

Abstract

In this work we study how information provided by foveated images sampled according to the log-polar transformation can be integrated over time in order to build accurate world representations and accomplish visual search tasks in an efficient manner. We focus on a specific visual information modality, depth, and on how to store it in a flexible memory structure. We propose a probabilistic observational model for a stereo system that relies on the Unscented Transform to propagate uncertainty in stereo matching, due to spatial quantization in the retina, to the 3D Cartesian domain. Probabilistic depth measurements are integrated in a novel Sensory Ego-Sphere whose topology can be biased with foveal-like distributions, according to the autonomous agent's short-term tasks and goals. Furthermore, we investigate an Upper Confidence Bound algorithm for the task of simultaneously finding the closest object to the observer (visual search) and learning the 3D map of the surrounding environment (mapping). The performance of task execution is assessed both with a foveated log-polar sensor and a classical uniform one. The advantages of foveal vision and custom ego-sphere representations are illustrated in a series of experiments with a realistic simulator.
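To make the uncertainty-propagation idea in the abstract concrete, here is a minimal sketch of a one-dimensional Unscented Transform. It is not the authors' observational model: it assumes a rectified pinhole stereo pair with depth z = f·b/d, a Gaussian disparity, and the common heuristic n + kappa = 3. The function name `unscented_depth` and its parameters are hypothetical, chosen only for illustration.

```python
import math


def unscented_depth(mu_d, sigma_d, focal_px, baseline_m, kappa=2.0):
    """Propagate a Gaussian disparity N(mu_d, sigma_d^2) through z = f*b/d.

    Assumed model (not taken from the paper): rectified pinhole stereo,
    scalar disparity in pixels, focal length in pixels, baseline in metres.
    Returns the approximate mean and variance of the depth in metres.
    """
    n = 1  # dimension of the input random variable (scalar disparity)
    spread = math.sqrt(n + kappa) * sigma_d

    # 2n + 1 sigma points: the mean plus one symmetric pair.
    points = [mu_d, mu_d + spread, mu_d - spread]
    if min(points) <= 0.0:
        raise ValueError("all sigma points must have positive disparity")

    # Standard unscented weights; they sum to one.
    w_center = kappa / (n + kappa)
    w_side = 1.0 / (2.0 * (n + kappa))
    weights = [w_center, w_side, w_side]

    # Push every sigma point through the nonlinear depth function.
    depths = [focal_px * baseline_m / d for d in points]

    mean_z = sum(w * z for w, z in zip(weights, depths))
    var_z = sum(w * (z - mean_z) ** 2 for w, z in zip(weights, depths))
    return mean_z, var_z


if __name__ == "__main__":
    mean_z, var_z = unscented_depth(20.0, 0.5, focal_px=500.0, baseline_m=0.1)
    print(mean_z, var_z)
```

Because depth is a convex function of disparity, the propagated mean sits slightly above the nominal depth f·b/mu_d, and the same disparity noise yields a larger depth variance for distant points (small disparities) than for nearby ones. In a foveated sensor, `sigma_d` would presumably grow with eccentricity, since peripheral receptive fields quantize the image more coarsely.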

Notes

Receptive fields are the fundamental visual processing units. Each corresponds to a specific region in the retina (image) and is represented by the average value of the photo-receptors (pixels) within it (e.g. average color). For more details, we refer the interested reader to Edelman (1995).
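As an illustration of this note, the sketch below averages the pixels that fall into each cell of a simple log-polar grid, so that every cell plays the role of one receptive field. The grid layout (logarithmically spaced rings, equal angular wedges, hard cell boundaries) and the name `log_polar_receptive_fields` are assumptions made for this example; the sensor used in the article may differ, for instance by using overlapping or Gaussian-weighted fields.

```python
import math


def log_polar_receptive_fields(image, n_rings, n_wedges, rho_min=1.0):
    """Average pixel values inside each cell of a log-polar grid.

    `image` is a list of rows of numbers. Cells are indexed [ring][wedge];
    a cell that contains no pixel is reported as None.
    """
    height, width = len(image), len(image[0])
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    rho_max = min(cx, cy)  # largest circle that fits inside the image
    log_span = math.log(rho_max / rho_min)

    sums = [[0.0] * n_wedges for _ in range(n_rings)]
    counts = [[0] * n_wedges for _ in range(n_rings)]

    for y in range(height):
        for x in range(width):
            r = math.hypot(x - cx, y - cy)
            if r < rho_min or r > rho_max:
                continue  # outside the sampled annulus
            # Ring index grows with log(r): rings get wider away from the centre.
            ring = int(n_rings * math.log(r / rho_min) / log_span)
            ring = min(ring, n_rings - 1)
            angle = math.atan2(y - cy, x - cx) % (2.0 * math.pi)
            wedge = min(int(n_wedges * angle / (2.0 * math.pi)), n_wedges - 1)
            sums[ring][wedge] += image[y][x]
            counts[ring][wedge] += 1

    return [
        [sums[i][j] / counts[i][j] if counts[i][j] else None
         for j in range(n_wedges)]
        for i in range(n_rings)
    ]


if __name__ == "__main__":
    flat = [[7.0] * 33 for _ in range(33)]
    fields = log_polar_receptive_fields(flat, n_rings=4, n_wedges=8)
    print(fields[3])
```

Outer rings cover more pixels per cell than inner ones, which is what gives the representation its high central and low peripheral resolution.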

References

Agarwal, A., & Blake, A. (2010). Dense stereo matching over the panum band. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(3), 416–430.
Agrawal, R. (1995). Sample mean based index policies with O(log n) regret for the multi-armed bandit problem. Advances in Applied Probability, 27, 1054–1078.
Ahmad, S., & Yu, A. J. (2013). Active sensing as bayes-optimal sequential decision making, CoRR, vol. abs/1305.6650. http://arxiv.org/abs/1305.6650.
Audibert, J.-Y., & Bubeck, S. (2010). Best arm identification in multi-armed bandits. In COLT: 23rd conference on learning theory, 2010 (13 pp.).
Auer, P., Cesa-Bianchi, N., & Fischer, P. (2002). Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47(2–3), 235–256.
Avelino, J. A., Figueiredo, R., Moreno, P., & Bernardino, A. (2016). On the perceptual advantages of visual suppression mechanisms for dynamic robot systems. In International conference on biologically inspired cognitive architectures (BICA).
Begum, M., & Karray, F. (2011). Visual attention for robotic cognition: A survey. IEEE Transactions on Autonomous Mental Development, 3(1), 92–105.
Bernardino, A., & Santos-Victor, J. (2002). A binocular stereo algorithm for log-polar foveated systems. In H. Bülthoff, C. Wallraven, S.-W. Lee, & T. Poggio (Eds.), Biologically motivated computer vision, ser. Lecture notes in computer science (Vol. 2525, pp. 127–136). Berlin: Springer.
Borji, A., & Itti, L. (2013). State-of-the-art in visual attention modeling. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(1), 185–207.
Butko, N. J., & Movellan, J. R. (2010). Infomax control of eye movements. IEEE Transactions on Autonomous Mental Development, 2(2), 91–107.
Carrasco, M. (2011). Visual attention: The past 25 years. Vision Research, 51(13), 1484–1525. (Vision Research 50th Anniversary Issue: Part 2). http://www.sciencedirect.com/science/article/pii/S0042698911001544.
Colombo, C., Rucci, M., & Dario, P. (1996). Integrating selective attention and space-variant sensing in machine vision. In J. L. C. Sanz (Ed.), Image technology: Advances in image processing, multimedia and machine vision (pp. 109–127). Springer Berlin Heidelberg.
Cox, D. D., & John, S. (1992). SDO: A statistical method for global optimization. In IEEE international conference on systems, man and cybernetics (pp. 1241–1246). IEEE.
Crawford, L. E., Landy, D., & Presson, A. N. (2014). Bias in spatial memory: Prototypes or relational categories. In Poster presented at the 36th annual conference of the cognitive science Society, Quebec.
Edelman, S. (1995). Receptive fields for vision: From hyperacuity to object recognition. http://cogprints.org/570/.
Ferreira, J., Bessière, P., Mekhnacha, K., Lobo, J., Dias, J., & Laugier, C. (2008). Bayesian models for multimodal perception of 3D structure and motion. In International conference on cognitive systems (CogSys 2008), Karlsruhe, Germany. https://hal.archives-ouvertes.fr/hal-00338800.
Fleming, K. A., Peters, R. A., & Bodenheimer, R. E. (2006). Image mapping and visual attention on a sensory ego-sphere. In 2006 IEEE/RSJ international conference on intelligent robots and systems, IROS 2006, Beijing, China (pp. 241–246). October 9-15, 2006. doi:10.1109/IROS.2006.281688.
Friston, K., Adams, R., & Montague, R. (2012). What is value accumulated reward or evidence? Frontiers in Neurorobotics. doi:10.3389/fnbot.2012.00011.
Hirose, M., Furuhashi, H., Miyasaka, T., & Araki, K. (2002). Reconstruction of range data by means of geodesic dome type data structure. The Journal of the Institute of Image Electronics Engineers of Japan, 31(3), 388–395.
Hirschmuller, H. (2008). Stereo processing by semiglobal matching and mutual information. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(2), 328–341.
Hoffman, M. D., Brochu, E., & de Freitas, N. (2011). Portfolio allocation for bayesian optimization. Citeseer.
Hornung, A., Wurm, K. M., Bennewitz, M., Stachniss, C., & Burgard, W. (2013). OctoMap: An efficient probabilistic 3D mapping framework based on octrees. Autonomous Robots. http://octomap.github.com.
Huang, D., Allen, T. T., Notz, W. I., & Zeng, N. (2006). Global optimization of stochastic black-box systems via sequential kriging meta-models. Journal of Global Optimization, 34(3), 441–466.
Itti, L., & Baldi, P. F. (2006). Bayesian surprise attracts human attention. In Advances in neural information processing systems (NIPS*2005) (Vol. 19, pp. 547–554). Cambridge, MA: MIT Press.
Itti, L., Koch, C., & Niebur, E. (1998). A model of saliency-based visual attention for rapid scene analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(11), 1254–1259.
Julier, S., & Uhlmann, J. (2004). Unscented filtering and nonlinear estimation. Proceedings of the IEEE, 92(3), 401–422.
Koch, C., & Ullman, S. (1987). Shifts in selective visual attention: Towards the underlying neural circuitry. In L. M. Vaina (Ed.), Matters of intelligence: Conceptual structures in cognitive neuroscience (pp. 115–141). Dordrecht: Springer Netherlands.
Kriegman, D. J., Triendl, E., & Binford, T. O. (1989). Stereo vision and navigation in buildings for mobile robots. IEEE Transactions on Robotics and Automation, 5(6), 792–803.
Kushner, H. J. (1964). A new method of locating the maximum point of an arbitrary multipeak curve in the presence of noise. Journal of Fluids Engineering, 86(1), 97–106.
Lai, T. L., & Robbins, H. (1985). Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 6(1), 4–22.
Lizotte, D., Wang, T., Bowling, M., & Schuurmans, D. (2007). Automatic gait optimization with gaussian process regression. In Proceedings of the IJCAI (pp. 944–949).
Mockus, J. (1974). On bayesian methods for seeking the extremum. In Proceedings of the IFIP technical conference (pp. 400–404). London, UK: Springer. http://dl.acm.org/citation.cfm?id=646296.687872.
Moreno, P., Nunes, R., Figueiredo, R., Ferreira, R., Bernardino, A., Santos-Victor, J., Beira, R., Vargas, L., Aragão, D., & Aragão, M. (2015). Vizzy: A humanoid on wheels for assistive robotics. In Robot 2015: Second Iberian robotics conference (pp. 17–28). Springer International Publishing 2016.
Muller, M. E. (1959). A note on a method for generating points uniformly on n-dimensional spheres. Communications of the ACM, 2(4), 19–20. doi:10.1145/377939.377946.
Najemnik, J., & Geisler, W. S. (2005). Optimal eye movement strategies in visual search. Nature, 434(7031), 387–391.
Pamplona, D., & Bernardino, A. (2009). Smooth foveal vision with Gaussian receptive fields. In 9th IEEE-RAS international conference on humanoid robots, humanoids 2009, Paris, France (pp. 223–229). December 7–10, 2009. http://dx.doi.org/10.1109/ICHR.2009.5379575.
Perrollaz, M., Spalanzani, A., & Aubert, D. (2010). Probabilistic representation of the uncertainty of stereo-vision and application to obstacle detection. In Intelligent vehicles symposium (IV), 2010 IEEE (pp. 313–318). June 2010.
Peters, R. A., Hambuchen, K. A., & Bodenheimer, R. E. (2009). The sensory ego-sphere: A mediating interface between sensors and cognition. Autonomous Robots, 26(1), 1–19. doi:10.1007/s10514-008-9098-3.
Posner, M. (2012). Cognitive neuroscience of attention. Guilford Press. http://books.google.pt/books?id=8yjEjoS7EQsC.
Robbins, H., et al. (1952). Some aspects of the sequential design of experiments. Bulletin of the American Mathematical Society, 58(5), 527–535.
Ruesch, J., Lopes, M., Bernardino, A., Hornstein, J., Santos-Victor, J., & Pfeifer, R. (2008). Multimodal saliency-based bottom-up attention a framework for the humanoid robot ICUB. In IEEE international conference on robotics and automation, 2008. ICRA 2008 (pp. 962–967). May 2008.
Sutton, R. S., & Barto, A. G. (1998). Introduction to reinforcement learning (1st ed.). Cambridge, MA: MIT Press.
Tippetts, B., Lee, D. J., Lillywhite, K., & Archibald, J. (2016). Review of stereo vision algorithms and their suitability for resource-limited systems. Journal of Real-Time Image Processing, 11(1), 5–25.
Vijayakumar, S., Conradt, J., Shibata, T., & Schaal, S. (2001). Overt visual attention for a humanoid robot. In Proceedings of the IEEE/RSJ international conference on intelligent robots and systems, 2001 (Vol. 4, pp. 2332–2337). IEEE.
von Helmholtz, H., & König, A. (1896). Handbuch der physiologischen Optik (Vol. 1). L. Voss. https://books.google.pt/books?id=Lb4KAAAAIAAJ.
Wang, J., & Liu, Y. (2007). A closed-form solution of reconstruction from nonparallel stereo geometry used in image guided system for surgery. In N. Sebe, Y. Liu, Y. Zhuang, & T. Huang (Eds.), Multimedia content analysis and mining, ser. Lecture notes in computer science (Vol. 4577, pp. 371–380). Berlin Heidelberg: Springer.
Weiman, C. F. R. (1995). Binocular stereo via log-polar retinas. In SPIE, Ed.

Acknowledgements

This work has been partially supported by the Portuguese Foundation for Science and Technology (FCT) Project [UID/EEA/50009/2013]. Rui Figueiredo is funded by FCT Ph.D. Grant PD/BD/105779/2014. Helder Araújo would like to thank FCT (Portuguese Foundation for Science and Technology) grant UID-EEA-0048-2013.

Author information

Authors and Affiliations

Institute for Systems and Robotics (ISR/IST), LARSyS, Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal
Rui Pimentel de Figueiredo, Alexandre Bernardino, José Santos-Victor & Helder Araújo

Corresponding author

Correspondence to Rui Pimentel de Figueiredo.

Additional information

This is one of several papers published in Autonomous Robots comprising the Special Issue on Active Perception.

About this article

Cite this article

de Figueiredo, R.P., Bernardino, A., Santos-Victor, J. et al. On the advantages of foveal mechanisms for active stereo systems in visual search tasks. Auton Robot 42, 459–476 (2018). https://doi.org/10.1007/s10514-017-9617-1

Received: 15 February 2016
Accepted: 11 January 2017
Published: 04 February 2017
Issue Date: February 2018
DOI: https://doi.org/10.1007/s10514-017-9617-1
