Object Detection Using a Cascade of 3D Models

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 3852)

Abstract

We present an alignment framework for object detection using a hierarchy of 3D polygonal models. One difficulty with alignment methods is that the high-dimensional transformation space makes finding potential candidate states a time-consuming task. This is an important consideration in our approach, as an exhaustive search is applied over a densely sampled state space in order to avoid local minima and to extract all possible candidates. In our framework, a level-of-detail (LOD) 3D geometric model hierarchy is generated for the target object. Each model in this hierarchy acts as a classifier that determines which of the discrete states are potential candidates. The classification is done by estimating pixel-based and edge-based mutual information between the 3D model and the image, and the classification speed depends strongly on the LOD of the model and the resolution of the image. By combining these models of various LOD into a cascade, we show that search time can be reduced significantly while accuracy is maintained.
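The two ideas named in the abstract, a mutual-information score and a cascade of classifiers ordered by cost, can be illustrated with a minimal sketch. The abstract does not give the paper's estimator, rendering step, or thresholds, so everything below is an assumption made for illustration: `mutual_information` is a generic plug-in (histogram) estimate over already-quantised samples, and `run_cascade`, its `stages` argument, and the threshold values are hypothetical names, not the authors' implementation.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in nats from paired discrete samples.

    xs and ys are equal-length sequences of hashable labels, for example
    quantised intensities of a rendered model and of the image (assumed).
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    # sum over observed pairs of p(x,y) * log(p(x,y) / (p(x) p(y)))
    return sum(
        (c / n) * log(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def run_cascade(states, stages):
    """Filter candidate states through stages ordered cheapest first.

    stages is a list of (score_fn, threshold) pairs (hypothetical layout).
    A state survives a stage when score_fn(state) >= threshold, so the
    costlier later stages only see what the earlier ones let through.
    """
    survivors = list(states)
    for score_fn, threshold in stages:
        survivors = [s for s in survivors if score_fn(s) >= threshold]
        if not survivors:
            break  # nothing left to score at higher levels of detail
    return survivors
```

In a setting like the one the abstract describes, each `score_fn` would presumably render the model at one level of detail for a given state and return a mutual-information score against the image; the coarse, cheap stages would come first so that most states are rejected before the detailed models are evaluated.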

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Pong, HK., Cham, TJ. (2006). Object Detection Using a Cascade of 3D Models. In: Narayanan, P.J., Nayar, S.K., Shum, HY. (eds) Computer Vision – ACCV 2006. ACCV 2006. Lecture Notes in Computer Science, vol 3852. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11612704_29

  • DOI: https://doi.org/10.1007/11612704_29

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-31244-4

  • Online ISBN: 978-3-540-32432-4

  • eBook Packages: Computer Science, Computer Science (R0)
