Class similarity and viewpoint invariance in the recognition of 3D objects

Biological Cybernetics

Abstract

In human vision, the processes and the representations involved in identifying specific individuals are frequently assumed to be different from those used for basic-level classification, because classification is largely viewpoint-invariant, but identification is not. This assumption was tested in psychophysical experiments, in which objective similarity between stimuli (and, consequently, the level of their distinction) varied in a controlled fashion. Subjects were trained to discriminate between two classes of computer-generated three-dimensional objects, one resembling monkeys and the other, dogs. Both classes were defined by the same set of 56 parameters, which encoded sizes, shapes, and placement of the limbs, ears, snout, etc. Interpolation between parameter vectors of the class prototypes yielded shapes that changed smoothly between monkey and dog. Within-class variation was induced in each trial by randomly perturbing all the parameters. After the subjects reached 90% correct performance on a fixed canonical view of each object, discrimination performance was tested for novel views that differed by up to 60° from the training view. In experiment 1 (in which the distribution of parameters in each class was unimodal) and in experiment 2 (bimodal classes), the stimuli differed only parametrically and consisted of the same geons (parts), yet were recognized virtually independently of viewpoint in the low-similarity condition. In experiment 3, the prototypes differed in their arrangement of geons, yet the subjects' performance depended significantly on viewpoint in the high-similarity condition. In all three experiments, higher interstimulus similarity was associated with an increase in the mean error rate and, for misorientation of up to 45°, with an increase in the degree of viewpoint dependence. These results suggest that a geon-level difference between stimuli is neither strictly necessary nor always sufficient for viewpoint-invariant performance. Thus, basic and subordinate-level processes in visual recognition may be more closely related than previously thought.
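
The stimulus construction described in the abstract (interpolation between two prototype parameter vectors, followed by a random perturbation of all parameters on each trial) can be illustrated with a minimal sketch. Only the count of 56 parameters comes from the abstract. The linear form of the interpolation, the Gaussian noise model, its scale `sigma`, the placeholder prototype values, and all function names are assumptions made for illustration, not details of the article's procedure.

```python
import random

N_PARAMS = 56  # number of shape parameters stated in the abstract


def interpolate(proto_a, proto_b, t):
    """Linearly blend two prototype vectors: t=0 gives proto_a, t=1 gives proto_b."""
    return [(1.0 - t) * a + t * b for a, b in zip(proto_a, proto_b)]


def perturb(params, sigma, rng):
    """Add independent Gaussian noise to every parameter (assumed noise model)."""
    return [p + rng.gauss(0.0, sigma) for p in params]


def make_stimulus(proto_a, proto_b, t, sigma, rng):
    """Pick a point between the prototypes, then jitter it for within-class variation."""
    return perturb(interpolate(proto_a, proto_b, t), sigma, rng)


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Placeholder prototypes: random values standing in for the real parameter vectors.
monkey = [rng.uniform(0.0, 1.0) for _ in range(N_PARAMS)]
dog = [rng.uniform(0.0, 1.0) for _ in range(N_PARAMS)]

midpoint = interpolate(monkey, dog, 0.5)             # shape halfway between the classes
stimulus = make_stimulus(monkey, dog, 0.25, 0.05, rng)  # one perturbed trial stimulus
```

In a sketch like this, moving the two class centers closer together along the interpolation axis (for example, `t = 0.4` versus `t = 0.6` instead of `t = 0.0` versus `t = 1.0`) would be one hypothetical way to raise interstimulus similarity.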


Edelman, S. Class similarity and viewpoint invariance in the recognition of 3D objects. Biol. Cybern. 72, 207–220 (1995). https://doi.org/10.1007/BF00201485

  • DOI: https://doi.org/10.1007/BF00201485
