An appearance-based approach for consistent labeling of humans and objects in video

  • Theoretical Advances
Pattern Analysis and Applications

Abstract

In this paper, we present an approach for consistently labeling people and for detecting human–object interactions using mono-camera surveillance video. The approach is based on a robust appearance-based correlogram model combined with histogram information to model color distributions of people and objects in the scene. The models are dynamically built from non-stationary objects, which are the outputs of background subtraction, and are used to identify objects on a frame-by-frame basis. We are able to detect when people merge into groups and to segment them even during partial occlusion. We can also detect when a person deposits or removes an object. The models persist when a person or object leaves the scene and are used to identify them when they reappear. Experiments show that the models are able to accommodate perspective foreshortening that occurs with overhead camera angles, as well as partial occlusion. The results show that this is an effective approach that is able to provide important information to algorithms performing higher-level analysis, such as activity recognition, where human–object interactions play an important role.
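The appearance model described in the abstract pairs a color histogram with a color correlogram computed over the foreground pixels returned by background subtraction. The sketch below is one way such a model could be built and compared; it is not the authors' implementation. The 4-bins-per-channel quantization, the distance set, the restriction to axis-aligned pixel pairs, the function names, and the equal weighting of the two scores are all assumptions made for illustration.

```python
import numpy as np

def quantize(image, bins_per_channel=4):
    """Map an HxWx3 uint8 RGB image to integer color labels in [0, bins**3)."""
    q = (image.astype(np.int32) * bins_per_channel) // 256
    return (q[..., 0] * bins_per_channel + q[..., 1]) * bins_per_channel + q[..., 2]

def color_histogram(labels, mask, n_colors):
    """Normalized histogram of color labels over the foreground mask."""
    hist = np.bincount(labels[mask], minlength=n_colors).astype(float)
    total = hist.sum()
    return hist / total if total > 0 else hist

def autocorrelogram(labels, mask, n_colors, distances=(1, 3, 5)):
    """Simplified autocorrelogram (assumption: axis-aligned pairs only).

    Entry [i, c] estimates the probability that a foreground pixel at
    horizontal or vertical offset distances[i] from a foreground pixel
    of color c also has color c.
    """
    h, w = labels.shape
    result = np.zeros((len(distances), n_colors))
    for i, d in enumerate(distances):
        same = np.zeros(n_colors)
        total = np.zeros(n_colors)
        for dy, dx in ((0, d), (d, 0)):
            if dy >= h or dx >= w:
                continue  # offset does not fit inside the image
            a, b = labels[:h - dy, :w - dx], labels[dy:, dx:]
            valid = mask[:h - dy, :w - dx] & mask[dy:, dx:]
            a, b = a[valid], b[valid]
            # Each pair is counted once from each endpoint.
            total += np.bincount(a, minlength=n_colors) + np.bincount(b, minlength=n_colors)
            same += 2 * np.bincount(a[a == b], minlength=n_colors)
        result[i] = np.divide(same, total, out=np.zeros(n_colors), where=total > 0)
    return result

def build_model(image, mask, bins_per_channel=4, distances=(1, 3, 5)):
    """Return (histogram, autocorrelogram) for the masked region."""
    labels = quantize(image, bins_per_channel)
    n_colors = bins_per_channel ** 3
    return (color_histogram(labels, mask, n_colors),
            autocorrelogram(labels, mask, n_colors, distances))

def similarity(model_a, model_b, weight=0.5):
    """Score in [0, 1]; the weighting scheme is a hypothetical choice."""
    hist_a, corr_a = model_a
    hist_b, corr_b = model_b
    hist_score = np.minimum(hist_a, hist_b).sum()  # histogram intersection
    denom = corr_a.sum() + corr_b.sum()
    corr_score = 1.0 - np.abs(corr_a - corr_b).sum() / denom if denom > 0 else 0.0
    return weight * hist_score + (1 - weight) * corr_score

if __name__ == "__main__":
    red = np.zeros((8, 8, 3), dtype=np.uint8)
    red[..., 0] = 255
    blue = np.zeros((8, 8, 3), dtype=np.uint8)
    blue[..., 2] = 255
    mask = np.ones((8, 8), dtype=bool)
    print(similarity(build_model(red, mask), build_model(blue, mask)))
```

In a labeling loop, each foreground blob in a frame would be scored against the stored models and assigned the label of the best match. Keeping the models after a person or object leaves the scene is what would allow re-identification on reappearance.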

Author information

Corresponding author

Correspondence to Martí Balcells.

About this article

Cite this article

Balcells, M., DeMenthon, D. & Doermann, D. An appearance-based approach for consistent labeling of humans and objects in video. Pattern Anal Applic 7, 373–385 (2004). https://doi.org/10.1007/s10044-004-0237-y

  • DOI: https://doi.org/10.1007/s10044-004-0237-y
