Human tracking and silhouette extraction for human–robot interaction systems

  • Theoretical Advances
Pattern Analysis and Applications

Abstract

In this study, we propose a new integrated computer vision system that tracks multiple people and extracts their silhouettes with a pan-tilt stereo camera, so that it can support gesture and gait recognition in Human–Robot Interaction (HRI). The proposed system consists of three modules: detection, tracking and silhouette extraction. These modules are robust to camera movements, and they work interactively in near real-time. Detection is performed by camera ego-motion compensation and disparity segmentation. For tracking, we present an efficient mean shift-based method in which the tracked objects are represented by disparity-weighted color histograms. The silhouette is obtained by a two-step segmentation: a trimap is estimated in advance and then incorporated into the graph-cut framework for fine segmentation. The proposed system was evaluated against ground truth data and was shown to detect and track multiple people reliably and to produce high-quality silhouettes.
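
The abstract does not give the weighting formula behind the disparity-weighted color histogram, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each pixel's vote in the color histogram is scaled by a Gaussian weight on the difference between its disparity and a reference disparity for the target, so that background pixels at other depths contribute little. The function names, the Gaussian form, and the parameters `n_bins` and `sigma` are all hypothetical. The Bhattacharyya coefficient is included because it is the similarity measure commonly used in mean shift tracking; whether this system uses it is not stated in the abstract.

```python
import math


def disparity_weighted_histogram(colors, disparities, target_disparity,
                                 n_bins=8, sigma=2.0):
    """Build a normalized color histogram whose votes are weighted by disparity.

    colors: quantizable color values in 0..255, one per pixel (assumed input).
    disparities: stereo disparity per pixel, same length as colors.
    target_disparity: reference disparity of the tracked person (assumed known).
    The Gaussian weight is an illustrative assumption, not the paper's formula.
    """
    hist = [0.0] * n_bins
    for color, disparity in zip(colors, disparities):
        # Pixels whose disparity is close to the target get weights near 1.
        weight = math.exp(-0.5 * ((disparity - target_disparity) / sigma) ** 2)
        bin_index = min(color * n_bins // 256, n_bins - 1)
        hist[bin_index] += weight
    total = sum(hist)
    if total == 0.0:
        return hist
    return [value / total for value in hist]


def bhattacharyya(p, q):
    """Bhattacharyya coefficient between two normalized histograms."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


if __name__ == "__main__":
    # Two foreground pixels (disparity 20) and two background pixels (disparity 5).
    model = disparity_weighted_histogram(
        [10, 10, 200, 200], [20.0, 20.0, 5.0, 5.0], target_disparity=20.0)
    print(model)
    print(bhattacharyya(model, model))
```

In a tracker built along these lines, the model histogram would be computed once from the detected person, and candidate windows in later frames would be scored against it with `bhattacharyya`.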

Acknowledgments

This research was supported by the MIC (Ministry of Information and Communication), Korea, under the ITRC (Information Technology Research Center) support program supervised by the IITA (Institute of Information Technology Advancement) [IITA-2007-(C1090-0701-0046)], and by the KOSEF (Korea Science and Engineering Foundation) (R01-2007-000-11683-0).

Author information

Corresponding author

Correspondence to Hyeran Byun.

About this article

Cite this article

Ahn, JH., Choi, C., Kwak, S. et al. Human tracking and silhouette extraction for human–robot interaction systems. Pattern Anal Applic 12, 167–177 (2009). https://doi.org/10.1007/s10044-008-0112-3

  • DOI: https://doi.org/10.1007/s10044-008-0112-3
