DOI: 10.1145/2504335.2504337
research-article

Toward a 3D body part detection video dataset and hand tracking benchmark

Published: 29 May 2013

ABSTRACT

The purpose of this paper is twofold. First, we introduce our Microsoft Kinect-based video dataset of American Sign Language (ASL) signs, designed for body part detection and tracking research. The dataset gives researchers access to scene depth information, allowing them to go beyond 2-dimensional (2D) color video in gesture recognition projects. Depth information can make it easier to locate body parts such as hands; without it, two completely different gestures that share a similar 2D trajectory projection can be difficult to distinguish from one another. Second, because an accurate hand locator is a critical element of any automated gesture or sign language recognition tool, we assess the efficacy of one popular open source user skeleton tracker by examining its performance on random signs from this dataset. We compare the hand positions reported by the skeleton tracker to ground truth positions obtained from manual hand annotations of each video frame. The aim of this study is to establish a benchmark for assessing more advanced detection and tracking methods that use scene depth data. For illustration, we compare the results of one method previously developed in our lab for detecting a single hand against this benchmark.
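
The comparison of tracker output against manual annotations can be illustrated with a minimal sketch. The abstract does not state which error measure the paper uses, so the per-frame Euclidean pixel distance and the threshold-based accuracy below are assumptions chosen for illustration; the function names `hand_errors` and `accuracy_at` and the sample coordinates are hypothetical and do not come from the dataset.

```python
import math

def hand_errors(tracked, ground_truth):
    """Per-frame Euclidean distance (in pixels) between tracked and annotated
    hand positions. Frames where either position is missing (None) are skipped.
    Assumed metric: the paper's actual measure is not given in the abstract."""
    errors = []
    for t, g in zip(tracked, ground_truth):
        if t is None or g is None:
            continue
        errors.append(math.dist(t, g))
    return errors

def accuracy_at(errors, threshold):
    """Fraction of evaluated frames whose error is at most `threshold` pixels."""
    if not errors:
        return 0.0
    return sum(e <= threshold for e in errors) / len(errors)

# Hypothetical (x, y) hand positions for four frames; None marks a frame
# where the tracker reported no hand.
tracked = [(100.0, 120.0), (103.0, 124.0), None, (150.0, 90.0)]
ground_truth = [(100.0, 120.0), (100.0, 120.0), (110.0, 118.0), (120.0, 50.0)]

errors = hand_errors(tracked, ground_truth)
print(errors)
print(accuracy_at(errors, 10))
```

Skipping frames with a missing detection is one possible design choice; a stricter benchmark could instead count such frames as failures.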


Published in

PETRA '13: Proceedings of the 6th International Conference on PErvasive Technologies Related to Assistive Environments
May 2013
413 pages
ISBN: 9781450319737
DOI: 10.1145/2504335

Copyright © 2013 ACM


Publisher

Association for Computing Machinery, New York, NY, United States
