Abstract
Despite the great progress achieved in 3-D pose tracking in recent years, occlusions and self-occlusions remain an open issue. This is particularly true in silhouette-based tracking, where even visible parts cannot be tracked as long as they do not affect the object silhouette. Multiple cameras or motion priors can overcome this problem. However, multiple cameras or appropriate training data are not always readily available. We propose a framework in which the pose of 3-D models is found by minimising the 2-D projection error, expressed as an energy function that depends on the pose parameters. This framework makes it possible to handle occlusions and self-occlusions by tracking multiple objects and object parts simultaneously. To this end, each part is described by its own image region, each of which is modelled by one probability density function. This allows occlusions to be dealt with explicitly, including self-occlusions between different parts of the same object as well as occlusions between different objects. The results we present for simulations and real-world scenes demonstrate the improvements achieved in monocular and multi-camera settings. These improvements are substantiated by quantitative evaluations, e.g. based on the HumanEVA benchmark.
Additional information
We gratefully acknowledge funding by the German Research Foundation (DFG) under the project We 2602/5-2.
About this article
Cite this article
Schmaltz, C., Rosenhahn, B., Brox, T. et al. Region-based pose tracking with occlusions using 3D models. Machine Vision and Applications 23, 557–577 (2012). https://doi.org/10.1007/s00138-010-0317-5
DOI: https://doi.org/10.1007/s00138-010-0317-5