Region-Based Artificial Visual Attention in Space and Time

Abstract

Mobile robots have to deal with an enormous amount of visual data containing static and dynamic stimuli. Depending on the task, only small portions of a scene are relevant. Artificial attention systems filter this information at early processing stages. Among the various methods proposed to implement such systems, the region-based approach has proven to be robust and especially well suited for integrating top-down influences. This concept was recently transferred to the spatiotemporal domain to obtain motion saliency. Here we present a full-featured integration of the spatial and spatiotemporal systems. We propose a biologically inspired two-stream system that allows different spatial and temporal resolutions to be used and spatiotemporal saliency to be read out at early stages. We compare the output with that of classic models and demonstrate the flexibility of the integrated approach in several experiments, including online processing of continuous input, a task similar to thumbnail extraction, and a top-down task of selecting specific moving and non-moving objects.
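
To make the two-stream idea concrete, the sketch below illustrates the general scheme under simplifying assumptions: a spatial stream computed at full resolution, a spatiotemporal (motion) stream computed at a coarser spatial resolution, and a weighted fusion into a single saliency map. All function names and parameters are hypothetical, and the region-based segmentation of the original system is replaced by simple per-pixel and block-grid stand-ins; this is an illustration of the concept, not the authors' implementation.

```python
# Minimal two-stream saliency sketch (illustrative only, not the published system).
import numpy as np

def spatial_saliency(frame: np.ndarray) -> np.ndarray:
    """Per-pixel contrast against the global mean intensity
    (a stand-in for region-based feature contrast)."""
    gray = frame.mean(axis=2)
    sal = np.abs(gray - gray.mean())
    return sal / (sal.max() + 1e-8)

def motion_saliency(prev: np.ndarray, curr: np.ndarray, scale: int = 4) -> np.ndarray:
    """Frame differencing on a downsampled grid (a stand-in for the
    spatiotemporal stream running at a coarser resolution)."""
    small_prev = prev[::scale, ::scale].mean(axis=2)
    small_curr = curr[::scale, ::scale].mean(axis=2)
    diff = np.abs(small_curr - small_prev)
    # Upsample back to full resolution by block repetition.
    sal = np.kron(diff, np.ones((scale, scale)))
    sal = sal[:prev.shape[0], :prev.shape[1]]
    return sal / (sal.max() + 1e-8)

def fuse(spatial: np.ndarray, temporal: np.ndarray, w_t: float = 0.6) -> np.ndarray:
    """Weighted combination of the two streams; moving stimuli receive
    more weight and therefore tend to dominate the fused map."""
    return (1.0 - w_t) * spatial + w_t * temporal

# Usage on two consecutive frames of a (here synthetic) video stream:
prev_frame = np.random.rand(240, 320, 3)
curr_frame = np.random.rand(240, 320, 3)
saliency = fuse(spatial_saliency(curr_frame), motion_saliency(prev_frame, curr_frame))
print(saliency.shape, float(saliency.max()))
```

In the full region-based system, the stand-in functions would be replaced by feature contrasts computed over segmented regions, and the fusion step would additionally incorporate top-down weights to bias selection toward task-relevant moving or non-moving objects.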



Author information

Correspondence to Jan Tünnermann.

About this article

Cite this article

Tünnermann, J., Mertsching, B. Region-Based Artificial Visual Attention in Space and Time. Cogn Comput 6, 125–143 (2014). https://doi.org/10.1007/s12559-013-9220-5
