Abstract. Hand gestures provide an attractive means of interacting naturally with a computer-generated display. Using one or more video cameras, hand movements can potentially be interpreted as meaningful gestures. One key problem in building such an interface without a restricted setup is localizing and tracking the human arm robustly in video sequences. This paper proposes a multiple-cue localization scheme combined with a tracking framework to reliably track the dynamics of the human arm in unconstrained environments. The localization scheme integrates the cues of motion, shape, and color to locate a set of key image features. Using constraint fusion, these features are tracked by a modified extended Kalman filter that exploits the articulated structure of the human arm. Moreover, an interaction scheme between tracking and localization improves the estimation process while reducing the computational requirements. The performance of the localization/tracking framework is validated through extensive experiments and simulations. These experiments include tracking with a calibrated stereo camera and with uncalibrated broadcast video.
Additional information
Received: 19 January 2001 / Accepted: 27 December 2001
Correspondence to: R. Sharma
About this article
Cite this article
Azoz, Y., Devi, L., Yeasin, M. et al. Tracking the human arm using constraint fusion and multiple-cue localization. Machine Vision and Applications 13, 286–302 (2003). https://doi.org/10.1007/s00138-002-0110-1
DOI: https://doi.org/10.1007/s00138-002-0110-1