Multireference object pose retrieval with volume feedback

Multimedia Systems

Abstract.

This paper addresses pose retrieval of a rigid object from monocular video sequences or images. Initially, the object pose is estimated in each image assuming flat depth maps. Shape-from-silhouette is then applied to build a 3-D model (volume), which is used for a new round of pose estimation, this time by a model-based method that gives better estimates. Before the process is repeated by building a new volume, the pose estimates are adjusted to reduce error by maximizing a novel quality factor for shape-from-silhouette volume reconstruction. The feedback loop terminates when the pose estimates change little relative to those produced by the previous iteration. Based on a theoretical study of the proposed system, a test of convergence to a given set of poses is devised. Experiments on both synthetic and real image sequences also demonstrate reliable performance. No model is assumed for the object, and no feature points are detected or tracked, so the method avoids problematic feature matching or correspondence. The method can be used for 3-D object tracking in video, 3-D modeling, and volume reconstruction from video.
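
The control flow of the feedback loop described in the abstract can be sketched as follows. This is only an illustrative skeleton, not the authors' implementation: the function names (`volume_feedback_loop`, `initial_pose`, `build_volume`, `model_based_pose`, `adjust_poses`), the tuple representation of a pose, the tolerance `tol`, and the maximum-absolute-change stopping rule are all assumptions made for the example. The toy stand-ins at the bottom replace the real vision steps with trivial arithmetic so that the loop can be run.

```python
from typing import Any, Callable, List, Sequence, Tuple

Pose = Tuple[float, ...]  # hypothetical pose representation (e.g. angles)


def max_pose_change(prev: Sequence[Pose], curr: Sequence[Pose]) -> float:
    """Largest absolute change of any pose parameter between two iterations."""
    return max(abs(a - b) for p, c in zip(prev, curr) for a, b in zip(p, c))


def volume_feedback_loop(
    images: Sequence[Any],
    initial_pose: Callable[[Any], Pose],
    build_volume: Callable[[Sequence[Any], Sequence[Pose]], Any],
    model_based_pose: Callable[[Any, Any, Pose], Pose],
    adjust_poses: Callable[[Sequence[Any], List[Pose]], List[Pose]],
    tol: float = 1e-3,
    max_iters: int = 50,
):
    """Iterate volume building and pose re-estimation until poses settle."""
    # Step 1: initial per-image estimate (the abstract assumes flat depth maps).
    poses = [initial_pose(img) for img in images]
    volume = None
    iteration = 0
    for iteration in range(1, max_iters + 1):
        # Step 2: build a volume from the current poses (shape-from-silhouette).
        volume = build_volume(images, poses)
        # Step 3: re-estimate every pose against the volume (model-based step).
        refined = [model_based_pose(img, volume, p) for img, p in zip(images, poses)]
        # Step 4: adjust the poses before the next volume is built
        # (the abstract maximizes a quality factor here).
        refined = adjust_poses(images, refined)
        # Step 5: stop when the estimates barely differ from the previous ones.
        change = max_pose_change(poses, refined)
        poses = refined
        if change < tol:
            break
    return poses, volume, iteration


# Toy stand-ins (assumptions): each "image" is simply its true pose, and the
# model-based step moves the estimate halfway toward that true pose.
toy_images = [(0.1,), (0.5,), (1.0,)]
toy_poses, _, toy_iterations = volume_feedback_loop(
    toy_images,
    initial_pose=lambda img: (0.0,),
    build_volume=lambda imgs, poses: list(poses),
    model_based_pose=lambda img, vol, prev: tuple(
        p + 0.5 * (t - p) for p, t in zip(prev, img)
    ),
    adjust_poses=lambda imgs, poses: poses,
)
print(toy_iterations, toy_poses)
```

Passing the four stages in as callables keeps the loop independent of any particular pose estimator or volume reconstruction routine, so each stage can be swapped without touching the termination logic.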



Author information

Corresponding author

Correspondence to Alireza Nasiri Avanaki.

About this article

Cite this article

Avanaki, A.N., Hamidzadeh, B. & Kossentini, F. Multireference object pose retrieval with volume feedback. Multimedia Systems 9, 561–574 (2004). https://doi.org/10.1007/s00530-003-0128-x

  • DOI: https://doi.org/10.1007/s00530-003-0128-x
