Abstract
In human vision, the processes and the representations involved in identifying specific individuals are frequently assumed to be different from those used for basic level classification, because classification is largely viewpoint-invariant, but identification is not. This assumption was tested in psychophysical experiments, in which objective similarity between stimuli (and, consequently, the level of their distinction) varied in a controlled fashion. Subjects were trained to discriminate between two classes of computer-generated three-dimensional objects, one resembling monkeys and the other, dogs. Both classes were defined by the same set of 56 parameters, which encoded sizes, shapes, and placement of the limbs, ears, snout, etc. Interpolation between parameter vectors of the class prototypes yielded shapes that changed smoothly between monkey and dog. Within-class variation was induced in each trial by randomly perturbing all the parameters. After the subjects reached 90% correct performance on a fixed canonical view of each object, discrimination performance was tested for novel views that differed by up to 60° from the training view. In experiment 1 (in which the distribution of parameters in each class was unimodal) and in experiment 2 (bimodal classes), the stimuli differed only parametrically and consisted of the same geons (parts), yet were recognized virtually independently of viewpoint in the low-similarity condition. In experiment 3, the prototypes differed in their arrangement of geons, yet the subjects' performance depended significantly on viewpoint in the high-similarity condition. In all three experiments, higher interstimulus similarity was associated with an increase in the mean error rate and, for misorientation of up to 45°, with an increase in the degree of viewpoint dependence. These results suggest that a geon-level difference between stimuli is neither strictly necessary nor always sufficient for viewpoint-invariant performance.
Thus, basic and subordinate-level processes in visual recognition may be more closely related than previously thought.
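The stimulus construction summarized above (interpolation between two 56-parameter prototypes, plus random perturbation of all parameters on each trial) can be illustrated with a minimal sketch. Only the parameter count and the two operations come from the abstract. The prototype values, the Gaussian noise model, the noise scale, and the interpolation weights used for the low- and high-similarity conditions are hypothetical placeholders chosen for illustration, not the values used in the experiments.

```python
import random

N_PARAMS = 56  # number of shape parameters per object, as stated in the abstract


def interpolate(proto_a, proto_b, t):
    """Linear blend of two parameter vectors: t=0 gives proto_a, t=1 gives proto_b."""
    return [(1.0 - t) * a + t * b for a, b in zip(proto_a, proto_b)]


def perturb(params, scale, rng):
    """Add independent Gaussian noise to every parameter (noise model is an assumption)."""
    return [p + rng.gauss(0.0, scale) for p in params]


rng = random.Random(0)

# Placeholder prototypes: the real parameter values are not given in the abstract.
monkey = [rng.uniform(0.0, 1.0) for _ in range(N_PARAMS)]
dog = [rng.uniform(0.0, 1.0) for _ in range(N_PARAMS)]

# Assumption: similarity is raised by moving the two class centres toward each
# other along the monkey-dog line; the weights 0.4 and 0.6 are hypothetical.
low_sim = (interpolate(monkey, dog, 0.0), interpolate(monkey, dog, 1.0))
high_sim = (interpolate(monkey, dog, 0.4), interpolate(monkey, dog, 0.6))

# One trial's stimulus: a class centre with all 56 parameters randomly perturbed.
stimulus = perturb(high_sim[0], scale=0.02, rng=rng)
```

In this sketch, a smaller gap between the two interpolation weights stands in for a higher-similarity condition, and the perturbation scale stands in for the amount of within-class variation.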
Edelman, S. Class similarity and viewpoint invariance in the recognition of 3D objects. Biol. Cybern. 72, 207–220 (1995). https://doi.org/10.1007/BF00201485
DOI: https://doi.org/10.1007/BF00201485