Abstract
This paper is concerned with learning the canonical gray scale structure of the images of a class of objects. Structure is defined in terms of the geometry and layout of salient image regions that characterize the given views of the objects. The use of such structure-based learning of object appearance is motivated by the relative stability of image structure compared with intensity values. A multiscale segmentation tree description is automatically extracted for every sample image, and the trees are then matched to construct a single canonical representative that serves as the model of the class. Different images are selected as prototypes, and each prototype tree is refined to best match the rest of the class. The model tree for the class is the tree that is best supported over all the initializations with different prototypes. Matching is formulated as the problem of finding the best mapping from regions of the example images to those of the model tree, and is implemented as incremental refinement of the model tree using a learning approach. Experiments are reported on a face image database. The results demonstrate that the learnt model captures a reasonable facial geometry and topology, including prominent facial features.
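The prototype-and-refine procedure outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: each segmentation tree is flattened to a list of region descriptors (centroid x, centroid y, area), regions are mapped by nearest Euclidean distance, refinement uses a fixed learning rate, and support is measured as a negative mean matching distance. The names `refine`, `support`, and `learn_class_model` are hypothetical.

```python
import math

def distance(a, b):
    """Euclidean distance between two region descriptors (hypothetical features)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def refine(prototype, examples, rate=0.5):
    """Pull each model region toward its nearest region in every example image.

    Assumption: a tree is flattened to a list of (cx, cy, area) tuples.
    """
    model = [list(region) for region in prototype]
    for example in examples:
        for region in model:
            nearest = min(example, key=lambda r: distance(region, r))
            for i, value in enumerate(nearest):
                region[i] += rate * (value - region[i])
    return [tuple(region) for region in model]

def support(model, examples):
    """Assumed support measure: negative mean distance to the nearest example region."""
    total = sum(min(distance(m, r) for r in example)
                for example in examples for m in model)
    return -total / (len(model) * len(examples))

def learn_class_model(images):
    """Try every image as the prototype and keep the best-supported refined model."""
    best = None
    for k, prototype in enumerate(images):
        others = images[:k] + images[k + 1:]
        model = refine(prototype, others)
        score = support(model, images)
        if best is None or score > best[0]:
            best = (score, k, model)
    return best  # (support score, prototype index, refined model)

if __name__ == "__main__":
    # Three toy "images", each with two regions (made-up values).
    images = [
        [(0.30, 0.40, 0.10), (0.70, 0.40, 0.10)],
        [(0.32, 0.42, 0.11), (0.68, 0.41, 0.09)],
        [(0.29, 0.38, 0.10), (0.71, 0.39, 0.12)],
    ]
    score, index, model = learn_class_model(images)
    print(index, score, model)
```

Trying every image as the initial prototype, as the abstract describes, keeps the result from depending on one arbitrary starting image; the sketch mirrors only that outer loop and leaves out the multiscale tree structure entirely.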
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Perrin, B., Ahuja, N., Srinivasa, N. (1997). Learning multiscale image models of 2D object classes. In: Chin, R., Pong, TC. (eds) Computer Vision — ACCV'98. ACCV 1998. Lecture Notes in Computer Science, vol 1352. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63931-4_233
DOI: https://doi.org/10.1007/3-540-63931-4_233
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63931-2
Online ISBN: 978-3-540-69670-4
eBook Packages: Springer Book Archive