Recognizing Unfamiliar Gestures for Human-Robot Interaction Through Zero-Shot Learning

  • Conference paper
  • Part of the book: 2016 International Symposium on Experimental Robotics (ISER 2016)
  • Part of the book series: Springer Proceedings in Advanced Robotics (SPAR, volume 1)
  • Conference series: International Symposium on Experimental Robotics

Abstract

Human communication is highly multimodal, including speech, gesture, gaze, facial expression, and body language. Robots serving as human teammates must act on such multimodal communicative input, even when the message is not clear from any single modality. In this paper, we explore a method for achieving increased understanding of complex, situated communication by leveraging coordinated natural language, gesture, and context. These three channels have largely been treated as separate problems, but considering them jointly can yield gains in comprehension [1, 12].
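
The reference list points to zero-shot learning with semantic output codes [16] and cross-modal transfer into a word-vector space [19]. In that spirit, the following is a minimal sketch, not the authors' implementation, of the generic zero-shot recipe those papers share: learn a map from gesture features into a semantic embedding space, then label a new gesture with the nearest class embedding, so that a class with no training examples remains recognizable as long as its label has an embedding (e.g., a word2vec vector [15]). The function names, the ridge-regression map, and all dimensions are illustrative assumptions.

```python
import numpy as np

def fit_mapping(X, S, lam=1.0):
    """Ridge regression from gesture features X (n x d) to the semantic
    embeddings S (n x k) of their labels: W = (X'X + lam*I)^-1 X'S.
    (Illustrative choice; any regressor into the semantic space works.)"""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ S)

def classify_zero_shot(x, W, label_embeddings):
    """Project one gesture's features into the semantic space and return
    the label whose embedding is nearest by cosine similarity. Unseen
    classes compete on equal footing: they need an embedding, not data."""
    z = x @ W
    names = list(label_embeddings)
    E = np.stack([label_embeddings[n] for n in names])
    sims = E @ z / (np.linalg.norm(E, axis=1) * np.linalg.norm(z) + 1e-9)
    return names[int(np.argmax(sims))]

# Toy usage: train on "wave" and "point" only; the unfamiliar class "stop"
# is still a candidate at test time because its label embedding exists.
rng = np.random.default_rng(0)
emb = {name: rng.normal(size=8) for name in ("wave", "point", "stop")}
X = rng.normal(size=(20, 12))                       # stand-in gesture features
S = np.stack([emb[("wave", "point")[i % 2]] for i in range(20)])
W = fit_mapping(X, S)
print(classify_zero_shot(rng.normal(size=12), W, emb))
```

With random stand-in features the prediction is of course arbitrary; the point is the mechanism: because classification happens in the shared semantic space, adding a new gesture class costs only a label embedding rather than new training data.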

Notes

  1. https://rpal.cs.cornell.edu/projects/unfamiliar-gestures

References

  1. Artzi, Y., Zettlemoyer, L.: UW SPF: The University of Washington Semantic Parsing Framework (2013)

  2. Chen, Q., Georganas, N.D., Petriu, E.M.: Real-time vision-based hand gesture recognition using haar-like features. In: Instrumentation and Measurement Technology Conference Proceedings, IMTC 2007, pp. 1–6. IEEE, May 2007. doi:10.1109/IMTC.2007.379068

  3. Eldon, M., Whitney, D., Tellex, S.: Interpreting Multimodal Referring Expressions in Real Time (2015). https://edge.edx.org/assetv1:Brown+CSCI2951-K+2015_T2+type@asset+block@eldon15.pdf

  4. Escalera, S., et al.: Chalearn multi-modal gesture recognition 2013: grand challenge and workshop summary. In: Proceedings of the 15th ACM on International Conference on Multimodal Interaction, pp. 365–368. ACM (2013)

  5. Escalera, S., et al.: Multi-modal gesture recognition challenge 2013: dataset and results. In: Proceedings of the 15th ACM on International Conference on Multimodal Interaction, pp. 445–452. ACM (2013)

  6. Gawron, P., et al.: Eigengestures for natural human computer interface. arXiv:1105.1293 [cs] 103, pp. 49–56 (2011). doi:10.1007/978-3-642-23169-8_6, http://arxiv.org/abs/1105.1293. Accessed 29 Oct 2015

  7. Ge, S.S., Yang, Y., Lee, T.H.: Hand gesture recognition and tracking based on distributed locally linear embedding. Image Vis. Comput. 26(12), 1607–1620 (2008). ISSN: 0262-8856. doi:10.1016/j.imavis.2008.03.004, http://www.sciencedirect.com/science/article/pii/S0262885608000693. Accessed 18 Nov 2015

  8. Guyon, I., et al.: The ChaLearn gesture dataset (CGD 2011). Mach. Vis. Appl. 25(8), 1929–1951 (2014). ISSN 0932-8092, 1432-1769. doi:10.1007/s00138-014-0596-3, http://link.springer.com/article/10.1007/s00138-014-0596-3. Accessed 02 Mar 2016

  9. Huang, C.-M., Mutlu, B.: Modeling and evaluating narrative gestures for humanlike robots. In: Robotics: Science and Systems (2013)

  10. Jetley, S., et al.: Prototypical Priors: From Improving Classification to Zero-Shot Learning. arXiv:1512.01192 [cs] (3 December 2015). http://arxiv.org/abs/1512.01192. Accessed 29 Jan 2016

  11. Jones, K.S.: A statistical interpretation of term specificity and its application in retrieval. J. Documentation 28, 11–21 (1972)

  12. Kollar, T., et al.: Generalized grounding graphs: a probabilistic framework for understanding grounded language. In: JAIR (2013). https://people.csail.mit.edu/sachih/home/wp-content/uploads/2014/04/G3_JAIR.pdf

  13. Kondo, Y.: Body gesture classification based on bag-of-features in frequency domain of motion. In: 2012 IEEE RO-MAN, pp. 386–391 (2012). doi:10.1109/ROMAN.2012.6343783

  14. Luo, D., Ohya, J.: Study on human gesture recognition from moving camera images. In: 2010 IEEE International Conference on Multimedia and Expo (ICME), pp. 274–279, July 2010. doi:10.1109/ICME.2010.5582998

  15. Mikolov, T., et al.: Efficient Estimation of Word Representations in Vector Space. arXiv:1301.3781 [cs] (16 January 2013). http://arxiv.org/abs/1301.3781. Accessed 30 Mar 2016

  16. Palatucci, M., et al.: Zero-shot learning with semantic output codes. In: Neural Information Processing Systems (NIPS), December 2009

  17. Sauppé, A., Mutlu, B.: Robot deictics: how gesture and context shape referential communication. In: Proceedings of the 2014 ACM/IEEE International Conference on Human-robot Interaction, HRI 2014, New York, NY, USA, pp. 342–349. ACM (2014). ISBN 978-1-4503-2658-2. doi:10.1145/2559636.2559657, http://doi.acm.org/10.1145/2559636.2559657. Accessed 19 Nov 2015

  18. Segers, V., Connan, J.: Real-time gesture recognition using eigenvectors (2009). http://www.cs.uwc.ac.za/~jconnan/publications/Paper%2056%20-%20Segers.pdf

  19. Socher, R., et al.: Zero-Shot Learning Through Cross-Modal Transfer. arXiv:1301.3666 [cs] (16 January 2013). http://arxiv.org/abs/1301.3666. Accessed 25 Jan 2016

  20. Takano, W., Hamano, S., Nakamura, Y.: Correlated space formation for human whole-body motion primitives and descriptive word labels. Rob. Auton. Syst. 66, 35–43 (2015)

  21. Mahbub, U., Imtiaz, H.: One-Shot-Learning Gesture Recognition Using Motion History Based Gesture Silhouettes (2013). doi:10.12792/iciae2013

  22. Wan, J., et al.: One-shot learning gesture recognition from RGB-D data using bag of features. J. Mach. Learn. Res. 14(1), 2549–2582 (2013). ISSN 1532-4435. http://dl.acm.org/citation.cfm?id=2567709.2567743. Accessed 25 Jan 2016

  23. Wu, D., Zhu, F., Shao, L.: One shot learning gesture recognition from RGBD images. In: 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 7–12, June 2012. doi:10.1109/CVPRW.2012.6239179

  24. Wu, J.: Fusing multi-modal features for gesture recognition. In: Proceedings of the 15th ACM on International Conference on Multimodal Interaction, ICMI 2013, New York, NY, USA, pp. 453–460. ACM (2013). ISBN 978-1-4503-2129-7. doi:10.1145/2522848.2532589, http://doi.acm.org/10.1145/2522848.2532589. Accessed 31 Mar 2016

  25. Yin, Y., Davis, R.: Gesture spotting and recognition using salience detection and concatenated hidden Markov models. In: Proceedings of the 15th ACM on International Conference on Multimodal Interaction, ICMI 2013, New York, NY, USA, pp. 489–494. ACM (2013). ISBN: 978-1-4503-2129-7. doi:10.1145/2522848.2532588, http://doi.acm.org/10.1145/2522848.2532588. Accessed 22 Jan 2016

  26. Zhou, Y., et al.: Kernel-based sparse representation for gesture recognition. Pattern Recogn. 46(12), 3208–3222 (2013). ISSN 0031-3203. doi:10.1016/j.patcog.2013.06.007, http://dx.doi.org/10.1016/j.patcog.2013.06.007. Accessed 29 Jan 2016

Acknowledgements

This material is based upon research supported by the Office of Naval Research under Award Number N00014-16-1-2080. We are grateful for this support.

Author information

Corresponding author

Correspondence to Wil Thomason.


Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Thomason, W., Knepper, R.A. (2017). Recognizing Unfamiliar Gestures for Human-Robot Interaction Through Zero-Shot Learning. In: Kulić, D., Nakamura, Y., Khatib, O., Venture, G. (eds) 2016 International Symposium on Experimental Robotics. ISER 2016. Springer Proceedings in Advanced Robotics, vol 1. Springer, Cham. https://doi.org/10.1007/978-3-319-50115-4_73

  • DOI: https://doi.org/10.1007/978-3-319-50115-4_73

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-50114-7

  • Online ISBN: 978-3-319-50115-4

  • eBook Packages: Engineering; Engineering (R0)
