
Generalization to Novel Views: Universal, Class-based, and Model-based Processing

International Journal of Computer Vision

Abstract

A major problem in object recognition is that a novel image of a given object can be different from all previously seen images. Images can vary considerably due to changes in viewing conditions such as viewing position and illumination. In this paper we distinguish between three types of recognition schemes by the level at which generalization to novel images takes place: universal, class-based, and model-based. The first is applicable equally to all objects, the second to a class of objects, and the third uses known properties of individual objects. We derive theoretical limitations on each of the three generalization levels. For the universal level, previous results have shown that no invariance can be obtained. Here we show that this limitation holds even when the assumptions made about the objects and the recognition functions are relaxed. We also extend the results to changes of illumination direction. For the class level, previous studies presented specific examples of classes of objects for which functions invariant to viewpoint exist. Here, we distinguish between classes that admit such invariance and classes that do not. We demonstrate that there is a tradeoff between the set of objects that can be discriminated by a given recognition function and the set of images from which the recognition function can recognize these objects. Furthermore, we demonstrate that although functions that are invariant to illumination direction do not exist at the universal level, a function invariant to illumination direction can be defined when the objects are restricted to belong to a given class. A general conclusion of this study is that class-based processing, which has not been used extensively in the past, is often advantageous for dealing with variations due to viewpoint and illumination changes.
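The claim that illumination invariance can exist at the class level but not at the universal level can be illustrated with a toy example. The sketch below is not the construction used in the paper; it assumes a Lambertian shading model (intensity = albedo times the cosine between surface normal and light direction) and picks flat patches as a hypothetical class. All function names and numeric values are illustrative assumptions. For a flat patch every pixel shares the same shading factor, so dividing by the mean intensity cancels it; for a curved surface the same function changes with the light direction.

```python
import math


def unit(v):
    """Return v scaled to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def render_lambertian(albedo, normals, light):
    """Assumed image model: intensity_i = albedo_i * max(0, n_i . light)."""
    image = []
    for a, n in zip(albedo, normals):
        shading = max(0.0, sum(nc * lc for nc, lc in zip(n, light)))
        image.append(a * shading)
    return image


def mean_normalized(image):
    """Candidate recognition function: intensities divided by their mean."""
    mean = sum(image) / len(image)
    if mean == 0.0:
        raise ValueError("image is completely dark; signature undefined")
    return [v / mean for v in image]


def same_signature(image_a, image_b):
    """True if the two images map to the same mean-normalized signature."""
    sig_a = mean_normalized(image_a)
    sig_b = mean_normalized(image_b)
    return all(math.isclose(x, y, rel_tol=1e-9, abs_tol=1e-12)
               for x, y in zip(sig_a, sig_b))


albedo = [0.2, 0.5, 0.9, 0.4]
light_a = unit([0.0, 0.0, 1.0])
light_b = unit([1.0, 0.5, 1.0])

# Class member: a flat patch, so every pixel has the same normal.
flat_normals = [[0.0, 0.0, 1.0]] * 4
planar_invariant = same_signature(
    render_lambertian(albedo, flat_normals, light_a),
    render_lambertian(albedo, flat_normals, light_b),
)

# Outside the class: a curved surface with a different normal at each pixel.
curved_normals = [unit(n) for n in ([0, 0, 1], [1, 0, 1], [-1, 0, 1], [0, 1, 1])]
curved_invariant = same_signature(
    render_lambertian(albedo, curved_normals, light_a),
    render_lambertian(albedo, curved_normals, light_b),
)

print("flat patch invariant:", planar_invariant)
print("curved surface invariant:", curved_invariant)
```

The same toy function also hints at the tradeoff described above: two flat patches whose albedos differ only by a constant factor produce identical signatures, so the invariance is bought at the cost of some discriminability.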

Cite this article

Moses, Y., Ullman, S. Generalization to Novel Views: Universal, Class-based, and Model-based Processing. International Journal of Computer Vision 29, 233–253 (1998). https://doi.org/10.1023/A:1008088813977

  • DOI: https://doi.org/10.1023/A:1008088813977
