Generalization to Novel Views: Universal, Class-based, and Model-based Processing

Moses, Yael; Ullman, Shimon

doi:10.1023/A:1008088813977

Published: September 1998

Volume 29, pages 233–253, (1998)

International Journal of Computer Vision

Abstract

A major problem in object recognition is that a novel image of a given object can be different from all previously seen images. Images can vary considerably due to changes in viewing conditions such as viewing position and illumination. In this paper we distinguish between three types of recognition schemes by the level at which generalization to novel images takes place: universal, class-based, and model-based. The first is applicable equally to all objects, the second to a class of objects, and the third uses known properties of individual objects. We derive theoretical limitations on each of the three generalization levels. For the universal level, previous results have shown that no invariance can be obtained. Here we show that this limitation holds even when the assumptions made on the objects and the recognition functions are relaxed. We also extend the results to changes of illumination direction. For the class level, previous studies presented specific examples of classes of objects for which functions invariant to viewpoint exist. Here, we distinguish between classes that admit such invariance and classes that do not. We demonstrate that there is a tradeoff between the set of objects that can be discriminated by a given recognition function and the set of images from which the recognition function can recognize these objects. Furthermore, we demonstrate that although functions invariant to illumination direction do not exist at the universal level, such a function can be defined when the objects are restricted to a given class. A general conclusion of this study is that class-based processing, which has not been used extensively in the past, is often advantageous for dealing with variations due to viewpoint and illumination changes.
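
The universal-level limitation can be illustrated with a small sketch. It is not taken from the paper: the point-set objects, the orthographic camera along the z axis, and the helper name project_orthographic are all assumptions chosen for illustration. The sketch shows two different 3D objects that share a view, so any function that is invariant to viewpoint for all objects would have to assign both the same value and could not discriminate between them.

```python
# Illustrative sketch only; not code or data from the paper.
# Assumptions: objects are 3D point sets, the camera is orthographic and
# looks along the z axis, and point correspondence is known.

def project_orthographic(points):
    """Drop the depth coordinate of each (x, y, z) point."""
    return [(x, y) for x, y, _z in points]

# Two hypothetical objects with the same (x, y) layout but different depths.
# They are not related by a rigid motion, so they are distinct objects.
object_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.5), (0.0, 1.0, 1.0), (1.0, 1.0, 0.2)]
object_b = [(0.0, 0.0, 0.3), (1.0, 0.0, 2.0), (0.0, 1.0, 0.1), (1.0, 1.0, 1.5)]

image_a = project_orthographic(object_a)
image_b = project_orthographic(object_b)

# The objects differ in 3D, yet this particular view is identical for both.
same_image = image_a == image_b
different_objects = object_a != object_b
print(same_image, different_objects)
```

Restricting the objects to a class, for example one in which depth is constrained by the image layout, removes this ambiguity, which is the intuition behind class-based processing.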

Author information

Authors and Affiliations

Department of Applied Mathematics and Computer Science, The Weizmann Institute of Science, Rehovot, 76100, Israel
Yael Moses & Shimon Ullman

About this article

Cite this article

Moses, Y., Ullman, S. Generalization to Novel Views: Universal, Class-based, and Model-based Processing. International Journal of Computer Vision 29, 233–253 (1998). https://doi.org/10.1023/A:1008088813977

Issue Date: September 1998
DOI: https://doi.org/10.1023/A:1008088813977