Visual Recognition and Categorization on the Basis of Similarities to Multiple Class Prototypes

International Journal of Computer Vision

Abstract

One of the difficulties of object recognition stems from the need to overcome the variability in object appearance caused by pose and other factors, such as illumination. The influence of these factors can be countered by learning to interpolate between stored views of the target object, taken under representative combinations of viewing conditions. Difficulties of another kind arise in daily life situations that require categorization, rather than recognition, of objects. Although categorization cannot rely on interpolation between stored examples, we show that knowledge of several representative members, or prototypes, of each of the categories of interest can provide the necessary computational substrate for the categorization of new instances. We describe a system that represents input shapes by their similarities to several prototypical objects, and show that it can recognize new views of the familiar objects, discriminate among views of previously unseen shapes, and attribute the latter to familiar categories.
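
The representation described in the abstract, in which an input shape is encoded by its similarities to several prototypical objects, can be illustrated with a minimal sketch. The abstract does not specify the similarity function, the feature space, or the decision rule, so everything concrete here is an assumption made for illustration: a Gaussian similarity on Euclidean distance, toy two-dimensional feature vectors, the hypothetical names `similarity`, `represent`, and `categorize`, and a rule that assigns a novel input to the category whose prototypes respond most strongly in total.

```python
import math

def similarity(x, prototype, sigma=1.0):
    """Gaussian similarity in (0, 1]; equals 1.0 when x matches the prototype.
    The Gaussian form and sigma are illustrative assumptions."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, prototype))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def represent(x, prototypes, sigma=1.0):
    """Encode x as its vector of similarities to each stored prototype."""
    return [similarity(x, p, sigma) for p in prototypes]

def categorize(x, prototypes, labels, sigma=1.0):
    """Assign x to the category whose prototypes have the largest summed
    similarity to it (an assumed decision rule, not the authors' method)."""
    totals = {}
    for s, label in zip(represent(x, prototypes, sigma), labels):
        totals[label] = totals.get(label, 0.0) + s
    return max(totals, key=totals.get)

# Toy data: two prototypes per category, in a made-up 2D feature space.
PROTOTYPES = [[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.2, 4.9]]
LABELS = ["A", "A", "B", "B"]

novel = [4.8, 5.3]                      # a previously unseen shape
vector = represent(novel, PROTOTYPES)   # its similarity-to-prototypes code
category = categorize(novel, PROTOTYPES, LABELS)
```

In this sketch the same similarity vector serves two purposes: distinct novel inputs generally receive distinct vectors, which allows them to be told apart, while the pattern of strong responses indicates which familiar category a novel input is closest to.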

Cite this article

Duvdevani-Bar, S., Edelman, S. Visual Recognition and Categorization on the Basis of Similarities to Multiple Class Prototypes. International Journal of Computer Vision 33, 201–228 (1999). https://doi.org/10.1023/A:1008102413960

  • DOI: https://doi.org/10.1023/A:1008102413960
