
Neurocomputing

Volume 69, Issues 16–18, October 2006, Pages 2112–2126

Stochastic saliency-based search model for search asymmetry with uncertain targets

https://doi.org/10.1016/j.neucom.2005.09.009

Abstract

Whereas models that include a mechanism coding task-related information, e.g. a task-related saliency map, can explain why search asymmetry occurs in visual search tasks where the target is defined in advance, existing models cannot clearly explain human behavior in singleton search tasks showing search asymmetry. This study shows that the saliency-based search model can explain search asymmetry in singleton search without assuming a task-related saliency map, provided two new essential ideas are introduced. The first is a competitive mechanism between feature maps; the second is a stochastic winner-take-all (WTA) network. In the model modified with these two mechanisms, search asymmetry in singleton search tasks results from an asymmetry in the probability of the attentional focus being directed to the target. When the saliency difference between target and distractors is large, the attentional focus is almost always directed to the target first. When the saliency difference is relatively small, the probability that the target receives the attentional focus first becomes correspondingly smaller. The stochastic saliency-based search model reproduced human behavior in a range of visual search tasks showing search asymmetry, without assuming a task-related saliency map.

Introduction

To clarify the mechanism of visual attention, the visual search task is frequently used in experimental psychology. Human performance in a visual search task is generally characterized by the slope of the response time as a function of the number of items, where response time is defined as the interval between presentation of the visual stimulus and the participant's response. A visual search in which the response time is almost independent of the number of distractors is generally called an ‘efficient search’, while an ‘inefficient search’ is one in which the response time grows roughly linearly with the number of distractors.

One of the most interesting phenomena in visual search tasks is ‘search asymmetry’. Search asymmetry is a phenomenon in which the search for target stimulus A among distractors B is more efficient than vice versa [46]. Many pairs of stimuli are known to produce search asymmetry, such as long and short bars, a single line and pairs of lines, vertical and slightly oblique bars, circles and ellipses [45], ‘live’ and ‘dead’ elephants [52], and regular and reversed Chinese characters [39]. Little is known about the underlying mechanism of search asymmetry. Although some modeling studies have concluded that prior knowledge about the task is the essential ingredient for search asymmetry [29], [50], the role of prior knowledge in search asymmetry is still under discussion. Using a ‘singleton search task’, Saiki et al. [37] reported that search asymmetry occurred even when human participants could not use prior knowledge about the target stimulus. In most visual search tasks, information about the target stimulus is presented to the participant beforehand, and the participant is required to search for that given target in the stimulus array. By contrast, in the ‘singleton search task’ used in [37], participants were required to search for a singleton target in the stimulus array, and the possible target stimulus switched from trial to trial. Thus, participants could not use any information about the attributes of the visual stimulus to search for a potential target. For example, in the ‘singleton search task’ with Q-shaped and O-shaped stimuli [37], the Q-shaped stimulus was the target and the O-shaped stimulus the distractor in some trials, while in other trials the O-shaped stimulus was the target and the Q-shaped stimulus the distractor. Whether the target was O-shaped or Q-shaped was selected randomly from trial to trial, so participants had to judge whether a singleton item existed in the presented stimulus set without prior knowledge about the potential target item. Saiki et al. [37] reported that search asymmetry occurred even in this singleton search task with uncertain targets, indicating that prior knowledge about the task may not be the essential ingredient for search asymmetry.

Existing neural network models can hardly account for search asymmetry in the ‘singleton search task’ with uncertain targets. For example, the saliency-based search model with a task-related saliency map [29] can explain human performance in some visual search tasks showing search asymmetry; however, it cannot explain search asymmetry in singleton search, because prior knowledge about the task is unavailable in that setting. To explain why search asymmetry occurs even in singleton search tasks, we must first clarify why the original saliency-based search model cannot account for search asymmetry, and then make appropriate improvements to the existing saliency-based model [14].

Before determining why the original saliency-based search model may fail to account for search asymmetry, it is useful to discuss the dynamics of the model. The original saliency-based search model assumes that, in a visual search task, human participants sequentially deploy a focus of attention from object to object using a winner-take-all (WTA) mechanism. (Strictly speaking, many models using a WTA mechanism assume that the attentional focus can shift only from location to location [14], [20]; a mechanism for shifting attention from object to object is still under discussion [49].) According to attention models assuming serial deployment of the attentional focus, such as the original saliency-based search model, variation in response time results mainly from the mechanism deploying the focus of attention, i.e. only the shift term in Eq. (1) has an essential effect on the variation of response time (see Fig. 1), whereas other studies have claimed that the variation of response time results from a decision mechanism [1], [30]:
\[
RT = RT_{\mathrm{shift}} + RT_{\mathrm{decision}} + RT_{\mathrm{response}} + RT_{\mathrm{noise}} = RT_{\mathrm{shift}} + RT_{\mathrm{nonshift}}. \tag{1}
\]
Thus, if the anticipated number of shifts of the attentional focus before target detection is large, the response time becomes long, and if it is small, the response time becomes short. The essential hypothesis of the original saliency-based search model is that the attentional focus is directed to the most salient location first and to the second most salient location next, so that the order of attentional deployment is identical to the rank order of visual saliency. If this is the case, the response time is determined by the anticipated number of shifts of the attentional focus before the target is detected.

Why, then, can the original saliency-based search model not explain human behavior in visual search tasks showing search asymmetry? Suppose that stimulus A (e.g., a Q-shaped stimulus) is more salient than stimulus B (e.g., an O-shaped stimulus), and that the target is A among distractors B. In this case the target is the most salient item, and the focus of attention is always directed to the target first, regardless of the number of distractors and of the saliency difference between target and distractors. Conversely, when the target is B among distractors A, the target is the least salient object. In that case the target can be attended only after the attentional focus has been directed to every distractor, so the anticipated number of shifts before target detection depends on the number of distractors. Deco et al. [2] showed that a serial exhaustive search model such as the original saliency-based search model can exhibit only two extremes of behavior, very efficient or very inefficient, whereas Deco's model of attention [2] shows continuously variable search efficiency as a function of the saliency difference between target and distractor. The limitation of the serial exhaustive search model in explaining search asymmetry in the singleton search task originates from two essential assumptions of the model: (1) the target becomes the least salient object in certain cases, and (2) the order of attentional deployment is identical to the rank order of visual saliency, regardless of the saliency differences between the target and distractors.
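To make this limitation concrete, the following sketch (our illustration, not the authors' implementation; the time constants are hypothetical placeholders) computes the response time of Eq. (1) under the deterministic rank-order rule, where the number of shifts equals the target's saliency rank. It exhibits exactly the two extremes described above: a flat response time when the target is the most salient item, and a linear increase with set size when it is the least salient.

```python
import numpy as np

def deterministic_search_time(saliencies, target_index,
                              t_shift=50.0, t_nonshift=400.0):
    """Response time under the deterministic WTA rule: the focus visits
    items in descending order of saliency, so the number of shifts
    equals the target's saliency rank.  The time constants (ms) are
    hypothetical placeholders, not values from the paper."""
    visit_order = np.argsort(-np.asarray(saliencies, dtype=float))
    n_shifts = int(np.where(visit_order == target_index)[0][0]) + 1
    return n_shifts * t_shift + t_nonshift

# Target more salient than the distractors: one shift, flat RT.
for n in (4, 8, 16):
    print(n, deterministic_search_time([1.0] + [0.5] * (n - 1), 0))
    # -> 450.0 ms for every set size

# Target less salient: every distractor is visited first, so RT grows
# linearly with set size, however small the saliency gap is.
for n in (4, 8, 16):
    print(n, deterministic_search_time([0.5] + [1.0] * (n - 1), 0))
    # -> 600.0, 800.0, 1200.0 ms
```

Note that the deterministic rule depends only on the rank order of saliency: the output is identical whether the saliency gap is 0.5 or 0.01, which is precisely why the model cannot produce the graded efficiencies observed in human search asymmetry.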

The purpose of this study is to propose a modified saliency-based search model without any mechanism utilizing prior knowledge about the task, and to show that the proposed model can account for human behavior in singleton search tasks showing search asymmetry. The first modification is the introduction of a competitive mechanism between feature maps, which represents visual features that have a small but fundamental effect on visual saliency; this inter-feature competition prevents the target from becoming the least salient location. The second modification is the idea that the attentional spotlight is deployed to a location in a stochastic, or probabilistic, way based on saliency information, through a stochastic WTA mechanism. This stochastic deployment of the spotlight takes into account the saliency difference between the target and distractors. When the target is markedly more salient than the distractors, it attracts the attentional spotlight strongly, resulting in an efficient and easy search. As the saliency of the target approaches that of the distractors, the search gradually becomes less efficient, because the probability of directing the attentional focus to the target decreases while the probability of directing it to a distractor increases.
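A minimal sketch of the stochastic deployment idea follows (again our illustration: the paper implements a stochastic WTA network, whereas the softmax selection rule and the gain parameter beta below are simplifying assumptions). Visited items are suppressed, in the spirit of the inhibition-of-return mechanism of saliency models, and the expected number of shifts now varies continuously with the target-distractor saliency difference instead of jumping between two extremes.

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_shifts(saliencies, target_index, beta=8.0):
    """Sample the number of attentional shifts until the target is
    fixated.  At each step the focus is drawn from a softmax over the
    saliencies of the not-yet-visited items; visited items are removed,
    mimicking inhibition of return.  The softmax rule and the gain
    beta are simplifying assumptions, not the paper's stochastic WTA."""
    remaining = list(range(len(saliencies)))
    shifts = 0
    while True:
        weights = np.exp(beta * np.array([saliencies[i] for i in remaining]))
        probs = weights / weights.sum()
        pick = remaining[rng.choice(len(remaining), p=probs)]
        shifts += 1
        if pick == target_index:
            return shifts
        remaining.remove(pick)

# Mean number of shifts over many trials, 8 items in the display:
# a large saliency advantage sends the first shift to the target almost
# surely; a small advantage smoothly increases the expected number of
# shifts, and hence the response time via Eq. (1).
for diff in (0.8, 0.3, 0.1):
    sal = [0.5 + diff] + [0.5] * 7          # target is item 0
    mean = np.mean([stochastic_shifts(sal, 0) for _ in range(2000)])
    print(f"saliency difference {diff}: mean shifts = {mean:.2f}")
```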

These two modifications provided a model that better explains human behavior during visual search tasks showing search asymmetry. The performance of the proposed model with respect to search asymmetry phenomena was highly consistent with psychological evidence, much more so than the original saliency-based search model with deterministic dynamics.

In the remainder of this paper, the details of the proposed neural network model, which shares some ideas with the original saliency-based search model [13], [14], [16], are presented, and the differences between deterministic and stochastic dynamics for deployment of the attentional spotlight are clarified. The ability of the proposed model to reproduce human behavior in some search asymmetry tasks is then examined.

Section snippets

Model

Although the representation of spatial visual attention using the spotlight analogy is well accepted, little is known about how the attentional spotlight is driven and what information guides it to a certain location. Koch and Ullman [20] showed that visual saliency computed from the visual input image is a possible index for guiding the attentional spotlight to the optimal location. The original saliency-based search model proposed in [20] was later modified in
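As a rough illustration of the saliency computation in this family of models (a simplified sketch in the spirit of [14], [20], not the exact operators of this paper), feature maps are normalized and combined into a single saliency map whose peak is the first candidate location for the spotlight:

```python
import numpy as np

def saliency_map(feature_maps):
    """Combine feature maps (e.g. intensity, color and orientation
    contrast) into a single saliency map.  Each map is rescaled to
    [0, 1] before averaging; this is a simplified stand-in for the
    center-surround and normalization operators of [14]."""
    total = np.zeros_like(feature_maps[0], dtype=float)
    for fm in feature_maps:
        fm = np.asarray(fm, dtype=float)
        span = fm.max() - fm.min()
        if span > 0:
            total += (fm - fm.min()) / span
    return total / len(feature_maps)

# The most salient location is the first candidate for the spotlight.
rng = np.random.default_rng(1)
intensity = rng.random((32, 32))
orientation = rng.random((32, 32))
s = saliency_map([intensity, orientation])
print("spotlight candidate:", np.unravel_index(np.argmax(s), s.shape))
```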

Estimation of response time

In typical visual search tasks, participants are required to judge whether or not the stimulus set presented on a display includes a certain target stimulus. In such tasks, a trial is referred to as a ‘present trial’ if the target is present, and an ‘absent trial’ if not. Although many psychological studies have discussed the potential mechanisms involved in visual search tasks based on the results of present and absent trials, it remains unclear how participants conclude that no target

Simulation

Simulations of the proposed stochastic saliency-based search model were conducted to investigate whether the model accurately mimics human search performance in several search asymmetry tasks. The architecture of the model and all parameters were held constant across the various scenarios. The parameter values are listed in Table 1.

Three examples are given below to demonstrate how the stochastic saliency-based search model reproduces search asymmetry phenomena. These

Discussion

The original saliency-based search model [14] could not explain why search asymmetry occurs even in the singleton search task [37]. This study presumed that the original model's failure to explain search asymmetry in the singleton search task stems from its ability to show only two extremes of search efficiency, very efficient or very inefficient. This binary efficiency results from two intrinsic characteristics of the original saliency-based search model. First, a target can be

Acknowledgments

This work was supported by the 21st Century Center of Excellence Program from MEXT (D-2 to Kyoto University), and by PRESTO “Intelligent Cooperation and Control” from the Japan Science and Technology Agency.


References (53)

  • R.P.N. Rao et al.

    Eye movements in iconic visual search

    Vision Res.

    (2002)
  • P. Van de Laar et al.

    Task-dependent learning of attention

    Neural Networks

    (1997)
  • D.L. Wang

    Object selection based on oscillatory correlation

    Neural Networks

    (1999)
  • R. Desimone et al.

    Neural mechanisms of selective visual attention

    Ann. Rev. Neurosci.

    (1995)
  • M. D’Zmura

    Color in visual search

    Vision Res.

    (1991)
  • T. Fukai et al.

    A simple neural network exhibiting selective activation of neuronal ensembles: from winner-take-all to winners-share-all

    Neural Comput.

    (1997)
  • W. Gerstner

    Spiking neurons

  • C.D. Gilbert et al.

    Clustered intrinsic connections in cat visual cortex

    J. Neurosci.

    (1983)
  • J.P. Gottlieb et al.

    The representation of visual salience in monkey parietal cortex

    Nature

    (1998)
  • H. Greenspan, S. Belongie, R. Goodman, P. Perona, S. Rakshit, C.H. Anderson, Overcomplete steerable pyramid filters and...
  • N. Haslam et al.

    Visual search: efficiency continuum or distinct processes?

    Psychonom. Bull. Rev.

    (2001)
  • T.S. Horowitz et al.

    Visual search has no memory

    Nature

    (2000)
  • L. Itti, C. Koch, A comparison of feature combination strategies for saliency-based visual attention system, in: SPIE...
  • L. Itti et al.

    Computational modeling of visual attention

    Nature Rev. Neurosci.

    (2001)
  • L. Itti et al.

    A model of saliency-based visual attention for rapid scene analysis

    IEEE Trans. Pattern Anal. Mach. Intell.

    (1998)
  • J.J. Knierim et al.

    Neuronal responses to static texture patterns in area V1 of the alert macaque monkey

    J. Neurophysiol.

    (1992)

    Takahiko Koike received his B.S. from the College of Engineering Systems, University of Tsukuba, and his M.S. from the Graduate School of Informatics, Kyoto University. He is currently pursuing a Ph.D. at the Graduate School of Human and Environmental Studies, Kyoto University. His current research concerns the mechanisms of selective visual attention.

    Jun Saiki received his Ph.D. from the Department of Psychology, University of California, Los Angeles. He is currently an associate professor at the Graduate School of Human and Environmental Studies, Kyoto University. He is interested in visual cognition, in particular visual attention and visual working memory, and conducts research projects on these topics using psychophysics, fMRI experiments, and computational modeling.
