Abstract
A classifier is cardinality invariant if it can classify more than one token of a single type at a time. We present a convolutional neural network (CNN) model of inferotemporal cortex (IT) and show that it is cardinality invariant. While the CNN is designed with translation invariance in mind, cardinality invariance is an emergent property. We speculate that translation invariance may lead to cardinality invariance in general, and particularly in IT. Recent investigations have shown that cells in IT are indeed cardinality blind. We also explore the implications of a cardinality blind classifier for vision overall, concentrating on visual attention and search.
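The link between translation invariance and cardinality invariance can be illustrated with a toy example. The sketch below is not the CNN model presented in the paper; it is a minimal one-dimensional stand-in, assuming a single layer of template matching followed by global max pooling. The names `correlate`, `classify`, `TEMPLATES`, `one_token` and `two_tokens` are hypothetical and chosen for illustration only. Because max pooling discards both where and how often a template matched, an image containing two tokens of type "A" yields the same pooled evidence as an image containing one.

```python
# Toy illustration only: a 1-D template matcher with global max pooling.
# This is NOT the model from the paper; all names here are hypothetical.

def correlate(image, template):
    """Slide the template over the image and count matching cells at each position."""
    k = len(template)
    return [
        sum(1 for i in range(k) if image[pos + i] == template[i])
        for pos in range(len(image) - k + 1)
    ]

def classify(image, templates):
    """Pool each template's responses with max, then pick the best-scoring type.

    Taking the max over positions makes the score independent of where the
    token sits (translation invariance) and of how many copies are present.
    """
    scores = {name: max(correlate(image, t)) for name, t in templates.items()}
    return max(scores, key=scores.get), scores

# Two token types, each four cells wide (assumed shapes for the example).
TEMPLATES = {"A": [1, 1, 0, 1], "B": [1, 0, 1, 1]}

# One token of type A, starting at index 2.
one_token = [0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
# Two tokens of type A, starting at indices 1 and 7.
two_tokens = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

label_one, scores_one = classify(one_token, TEMPLATES)
label_two, scores_two = classify(two_tokens, TEMPLATES)
print(label_one, label_two)
print(scores_one["A"] == scores_two["A"])
```

Both images are assigned the type "A", and the pooled score for "A" is identical in the two cases, so this toy classifier cannot tell one token from two. A pooling rule that summed responses instead of taking the maximum would not have this property.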
Cite this article
Walles, H., Knott, A. & Robins, A. A model of cardinality blindness in inferotemporal cortex. Biol Cybern 98, 427–437 (2008). https://doi.org/10.1007/s00422-008-0229-x
DOI: https://doi.org/10.1007/s00422-008-0229-x