Research article
DOI: 10.1145/2207676.2208303

Instructing people for training gestural interactive systems

Published: 05 May 2012

Abstract

Entertainment and gaming systems such as the Wii and Xbox Kinect have brought touchless, body-movement-based interfaces to the masses. Systems like these can estimate the movements of various body parts from raw inertial motion or depth sensor data. However, the interface developer is still left with the challenging task of creating a system that recognizes these movements as embodying meaning. The machine learning approach to this problem requires collecting datasets that contain the relevant body movements and their associated semantic labels. These datasets directly impact the accuracy and performance of the gesture recognition system, and should ideally contain all natural variations of the movements associated with a gesture. This paper addresses the problem of collecting such gesture datasets. In particular, we investigate which semiotic modality of instruction is most appropriate for conveying to human subjects the movements the system developer needs them to perform. The results of our qualitative and quantitative analysis indicate that the choice of modality has a significant impact on the performance of the learnt gesture recognition system, particularly in terms of correctness and coverage.
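
The data-collection-to-recognizer pipeline the abstract describes can be made concrete: recordings of body movements are paired with semantic labels and used to train a classifier, whose per-class scores then show how well the collected data covers each gesture. The paper does not prescribe a particular learner, so the sketch below is a hypothetical illustration only. It assumes Python with NumPy and scikit-learn, stands in random synthetic features for real skeletal recordings, and uses a random forest (Breiman 2001, a work the paper cites) as the classifier; the dataset dimensions (20 joints, 30-frame windows, 12 gesture classes) are made-up placeholders.

    # Illustrative sketch only, not the authors' system: train a gesture
    # classifier from a labelled movement dataset and report per-class scores.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import classification_report
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)

    # Hypothetical stand-in for a collected dataset: each recording is reduced
    # to a fixed-length vector of joint positions (20 joints x 3 coordinates
    # over a 30-frame window), with one semantic gesture label per recording.
    n_recordings, n_joints, n_frames = 600, 20, 30
    X = rng.normal(size=(n_recordings, n_joints * 3 * n_frames))
    y = rng.integers(0, 12, size=n_recordings)  # 12 hypothetical gesture classes

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0)

    # A random forest (Breiman 2001) is one plausible choice of learner.
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X_train, y_train)

    # Per-class precision and recall make visible which movement variants the
    # trained recognizer actually handles on held-out performances.
    print(classification_report(y_test, clf.predict(X_test)))

In such a report, low recall for a gesture class would suggest that the collected examples failed to capture the natural variation in how people perform that movement, loosely analogous to the correctness and coverage measures the paper uses.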


Cited By

  • (2024) A Survey of Cutting-edge Multimodal Sentiment Analysis. ACM Computing Surveys 56(9), 1-38. DOI: 10.1145/3652149
  • (2024) IoT-Enhanced Gesture Recognition System for Healthcare Applications. 2024 International Telecommunications Conference (ITC-Egypt), 184-189. DOI: 10.1109/ITC-Egypt61547.2024.10620584
  • (2024) A Brief Review of Sign Language Recognition Methods and Cutting-edge Technologies. 2024 5th International Conference on Computer Engineering and Application (ICCEA), 1233-1242. DOI: 10.1109/ICCEA62105.2024.10603746
  • (2024) Leveraging spatio-temporal features using graph neural networks for human activity recognition. Pattern Recognition 150, 110301. DOI: 10.1016/j.patcog.2024.110301
  • (2024) Common datasets in the field of gesture recognition. Gesture Recognition, 17-33. DOI: 10.1016/B978-0-443-28959-0.00008-X
  • (2024) High Dimensional Computing Approach to Detection and Learning Gesture Biometrics. Intelligent Computing, 551-565. DOI: 10.1007/978-3-031-62273-1_35
  • (2023) Brave New GES World: A Systematic Literature Review of Gestures and Referents in Gesture Elicitation Studies. ACM Computing Surveys 56(5), 1-55. DOI: 10.1145/3636458
  • (2023) MOST: Model-Based Compression with Outlier Storage for Time Series Data. Proceedings of the ACM on Management of Data 1(4), 1-29. DOI: 10.1145/3626737
  • (2023) Fusing Skeletons and Texts Based on GCN-CNN for Action Recognition. Proceedings of the 15th International Conference on Digital Image Processing, 1-6. DOI: 10.1145/3604078.3604087
  • (2023) iFAD Gestures: Understanding Users' Gesture Input Performance with Index-Finger Augmentation Devices. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 1-17. DOI: 10.1145/3544548.3580928


    Published In

    CHI '12: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
    May 2012, 3276 pages
    ISBN: 9781450310154
    DOI: 10.1145/2207676

    Publisher

    Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. data collection
    2. instructing movement
    3. machine learning
    4. natural gesture recognition


    Conference

    CHI '12
    Overall acceptance rate: 6,199 of 26,314 submissions, 24%
