Learning multiscale image models of 2D object classes

  • Session S1B: Segmentation and Grouping
  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1352)

Abstract

This paper is concerned with learning the canonical gray-scale structure of the images of a class of objects. Structure is defined in terms of the geometry and layout of the salient image regions that characterize the given views of the objects. Structure-based learning of object appearance is motivated by the fact that image structure is relatively stable compared with intensity values. A multiscale segmentation tree is automatically extracted from each sample image, and the trees are then matched to construct a single canonical representative that serves as the model of the class. Different images are selected in turn as prototypes, and each prototype tree is refined to best match the rest of the class. The model tree for the class is the tree that is best supported over all the initializations with different prototypes. Matching is formulated as the problem of finding the best mapping from regions of the example images to regions of the model tree, and is implemented as incremental refinement of the model tree using a learning approach. Experiments are reported on a face image database. The results demonstrate that a reasonable model of facial geometry and topology is learnt, one that includes the prominent facial features.
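The prototype-selection loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the region attributes, the matching cost, or the refinement rule, so everything below is an assumption made for illustration. In particular, the `Region` fields (centroid and area), the `region_cost` function, the greedy child matching, and the fixed update `rate` are all hypothetical stand-ins.

```python
from dataclasses import dataclass, field
from math import hypot
from typing import List, Tuple


@dataclass
class Region:
    """One node of a segmentation tree (attributes are assumed, not from the paper)."""
    cx: float
    cy: float
    area: float
    children: List["Region"] = field(default_factory=list)


def region_cost(a: Region, b: Region) -> float:
    """Hypothetical dissimilarity: centroid distance plus a root-area-difference term."""
    return hypot(a.cx - b.cx, a.cy - b.cy) + abs(a.area - b.area) ** 0.5


def copy_tree(r: Region) -> Region:
    """Deep-copy a tree so that refinement never alters the sample data."""
    return Region(r.cx, r.cy, r.area, [copy_tree(c) for c in r.children])


def refine(model: Region, example: Region, rate: float = 0.5) -> float:
    """Match example to model top-down, nudging the model toward the example.

    Returns the matching cost measured before the update. With rate=0.0 the
    model is left unchanged and the call only scores the match.
    """
    cost = region_cost(model, example)
    model.cx += rate * (example.cx - model.cx)
    model.cy += rate * (example.cy - model.cy)
    model.area += rate * (example.area - model.area)
    unused = list(example.children)
    for child in model.children:
        if not unused:
            break  # model children without a counterpart are simply left unmatched
        best = min(unused, key=lambda e: region_cost(child, e))
        unused.remove(best)
        cost += refine(child, best, rate)
    return cost


def learn_class_model(trees: List[Region], rate: float = 0.5) -> Tuple[Region, float]:
    """Try every sample tree as prototype and keep the best-supported refined tree.

    Support is taken here as the lowest total matching cost over all samples.
    """
    best_model, best_cost = None, float("inf")
    for i, proto in enumerate(trees):
        model = copy_tree(proto)
        for j, other in enumerate(trees):
            if j != i:
                refine(model, other, rate)
        total = sum(refine(model, t, rate=0.0) for t in trees)
        if total < best_cost:
            best_model, best_cost = model, total
    return best_model, best_cost
```

Calling `learn_class_model` on a list of sample trees returns the refined tree with the lowest total matching cost, together with that cost. The greedy matching and sequential updates make the result depend on the order of the samples; a faithful implementation would follow the matching and learning procedure given in the full paper.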

Editor information

Roland Chin, Ting-Chuen Pong

Copyright information

© 1997 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Perrin, B., Ahuja, N., Srinivasa, N. (1997). Learning multiscale image models of 2D object classes. In: Chin, R., Pong, TC. (eds) Computer Vision — ACCV'98. ACCV 1998. Lecture Notes in Computer Science, vol 1352. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63931-4_233

  • DOI: https://doi.org/10.1007/3-540-63931-4_233

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-63931-2

  • Online ISBN: 978-3-540-69670-4

  • eBook Packages: Springer Book Archive
