Region-based pose tracking with occlusions using 3D models

  • Original Paper
Machine Vision and Applications

Abstract

Despite great progress in 3-D pose tracking in recent years, occlusions and self-occlusions remain an open issue. This is particularly true in silhouette-based tracking, where even visible parts cannot be tracked as long as they do not affect the object silhouette. Multiple cameras or motion priors can overcome this problem. However, multiple cameras or appropriate training data are not always readily available. We propose a framework in which the pose of 3-D models is found by minimising the 2-D projection error through minimisation of an energy function depending on the pose parameters. This framework makes it possible to handle occlusions and self-occlusions by tracking multiple objects and object parts simultaneously. To this end, each part is described by its own image region, each of which is modelled by one probability density function. This allows occlusions to be handled explicitly, including self-occlusions between different parts of the same object as well as occlusions between different objects. The results we present for simulations and real-world scenes demonstrate the improvements achieved in monocular and multi-camera settings. These improvements are substantiated by quantitative evaluations, e.g. based on the HumanEVA benchmark.

Author information

Corresponding author

Correspondence to Christian Schmaltz.

Additional information

We gratefully acknowledge funding by the German Research Foundation (DFG) under the project We 2602/5-2.

About this article

Cite this article

Schmaltz, C., Rosenhahn, B., Brox, T. et al. Region-based pose tracking with occlusions using 3D models. Machine Vision and Applications 23, 557–577 (2012). https://doi.org/10.1007/s00138-010-0317-5

  • DOI: https://doi.org/10.1007/s00138-010-0317-5
