Abstract
This study focuses on gesture recognition in mobile interaction settings, i.e., when the interacting partners are walking. This kind of interaction requires particular coordination, e.g., staying in the partner's field of view, avoiding obstacles without disrupting group composition, and sustaining joint attention during motion. Various studies in the literature have shown that gestures are closely related to achieving such goals.
Thus, a mobile robot moving in a group with human pedestrians has to identify such gestures to sustain group coordination. However, decoupling gestures from the inherent walking oscillations is a major challenge for the robot. To that end, we employ video data recorded in uncontrolled settings and detect arm gestures performed by human-human pedestrian pairs by adopting a signal processing approach. Namely, we exploit the fact that gait induces an inherent oscillatory motion at the upper limbs, independent of the view angle or the distance of the user to the camera. We identify arm gestures as disturbances of these oscillations. To do so, we use a simple pitch detection method from speech processing and assume that data exhibiting a low-frequency periodicity are free of gestures. In testing, we employ a video data set recorded in uncontrolled settings and show that we achieve a detection rate of 0.80.
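The idea of treating low-frequency periodicity as a sign of gesture-free walking can be illustrated with a minimal sketch. It assumes the pitch detector is an average magnitude difference function (AMDF), as in the Ross et al. entry of the reference list; the abstract itself does not name the method. The function names, the 0.6 to 2.0 Hz search band, and the `depth_ratio` threshold are hypothetical choices for illustration, not values from the paper.

```python
import numpy as np


def amdf(signal, max_lag):
    """Average magnitude difference function for lags 1..max_lag."""
    x = np.asarray(signal, dtype=float)
    n = len(x)
    return np.array(
        [np.mean(np.abs(x[lag:] - x[: n - lag])) for lag in range(1, max_lag + 1)]
    )


def dominant_period(signal, fps, min_hz=0.6, max_hz=2.0, depth_ratio=0.3):
    """Return (period in seconds, is_periodic) for a 1-D limb trajectory.

    min_hz, max_hz and depth_ratio are illustrative assumptions.
    """
    x = np.asarray(signal, dtype=float)
    min_lag = max(1, int(round(fps / max_hz)))
    max_lag = min(len(x) // 2, int(round(fps / min_hz)))
    if max_lag <= min_lag:
        raise ValueError("window too short for the requested frequency band")
    d = amdf(x, max_lag)
    band = d[min_lag - 1 : max_lag]  # AMDF values for lags min_lag..max_lag
    best_lag = int(np.argmin(band)) + min_lag
    # A valley that is deep relative to the mean AMDF level suggests a
    # periodic (walking-only) window; a shallow valley suggests a disturbance.
    is_periodic = bool(band.min() < depth_ratio * d.mean())
    return best_lag / fps, is_periodic


if __name__ == "__main__":
    fps = 30
    t = np.arange(4 * fps) / fps
    swing = np.sin(2 * np.pi * 1.0 * t)  # synthetic 1 Hz arm swing
    print(dominant_period(swing, fps))
```

In such a sketch, a window flagged as non-periodic would be a candidate gesture segment; how windows are chosen and thresholded in the actual method is not stated in the abstract.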
Notes
1. For instance, a hand waving gesture can refer to acceptance or rejection depending on the affective state or attitude of the performer.
2. Although the amplitude of the oscillations varies with the view angle, we expect their frequency to be reasonably stable, provided that there is no significant (self-)occlusion.
3. Specifically, the benefits of arm swing to gait economy involve decreasing shoulder and elbow joint torques, offsetting the motion of the legs, and reducing vertical ground reaction moments and attendant muscle forces, thereby reducing metabolic energy expenditure [18]. Arm swings also produce counter-rotations of the pelvis and thorax to maintain stability and a steady visual platform by minimizing head movements [19, 31].
4. 85 min of 1080p, 60 fps video from 8 cameras with more than 2700 identities.
5. In addition, trajectories on the image plane are provided in a piece-wise linear manner, and the corresponding real-world coordinates can be computed using homography matrices.
6. Here, we exclude fine-grained gestures arising from finger and wrist motion.
7. \(T=1\) is considered to give satisfactory results.
8. With current depth sensors, observing environments at the scale of those in Fig. 1 is perhaps not possible without very expensive equipment.
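The homography step mentioned in the note on image-plane trajectories can be sketched as follows. This is a generic planar homography mapping, not code from the data set's tooling; the function name `image_to_world` and the example matrix are hypothetical, and the sketch assumes `H` maps image pixels to ground-plane coordinates (a data set may instead supply the inverse direction).

```python
import numpy as np


def image_to_world(points_px, H):
    """Map Nx2 pixel coordinates to ground-plane coordinates with a 3x3 homography.

    Assumes H maps image -> world; invert H first if it is given world -> image.
    """
    pts = np.asarray(points_px, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])  # to homogeneous coordinates
    mapped = homog @ np.asarray(H, dtype=float).T
    return mapped[:, :2] / mapped[:, 2:3]  # divide by the projective scale


if __name__ == "__main__":
    # Hypothetical matrix: pure scaling of 1 cm per pixel, for illustration only.
    H = np.diag([0.01, 0.01, 1.0])
    print(image_to_world([[100.0, 200.0]], H))
```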
References
Berclaz, J., Fleuret, F., Turetken, E., Fua, P.: Multiple object tracking using k-shortest paths optimization. IEEE TPAMI 33(9), 1806–1819 (2011)
Bochinski, E., Eiselein, V., Sikora, T.: Training a convolutional neural network for multi-class object detection using solely virtual world data. In: AVSS, pp. 278–285. IEEE (2016)
Breazeal, C., Kidd, C.D., Thomaz, A.L., Hoffman, G., Berlin, M.: Effects of nonverbal communication on efficiency and robustness in human-robot teamwork. In: IROS, pp. 708–713. IEEE (2005)
Cao, Z., Simon, T., Wei, S.E., Sheikh, Y.: Realtime multi-person 2D pose estimation using part affinity fields. In: CVPR, pp. 7291–7299 (2017)
Consortium for the physics and psychology of human crowd dynamics: a glossary for research on human crowd dynamics. Collect. Dyn. 4, 1–13 (2019)
De-León-Gómez, V., Luo, Q., Kalouguine, A., Pámanes, J.A., Aoustin, Y., Chevallereau, C.: An essential model for generating walking motions for humanoid robots. Robot. Auton. Syst. 112, 229–243 (2019)
Di Scala, G., et al.: Efficiency of sensorimotor networks: posture and gait in young and older adults. Exp. Aging Res. 45, 41–56 (2019). https://doi.org/10.1080/0361073X.2018.1560108
Ferreira, J.P., Crisostomo, M.M., Coimbra, A.P.: Human gait acquisition and characterization. IEEE Trans. IM 58(9), 2979–2988 (2009)
Goldin-Meadow, S.: Using our hands to change our minds. WIREs Cogn. Sci. 8(1–2), e1368 (2017)
Gorobtsov, A., Andreev, A., Markov, A., Skorikov, A., Tarasov, P.: Features of solving the inverse dynamic method equations for the synthesis of stable walking robots controlled motion. In: SPIIRAS Proceedings, vol. 18, pp. 85–122, February 2019. https://doi.org/10.15622/sp.18.1.85-122
Haddington, P., Mondada, L., Nevile, M.: Interaction and Mobility: Language and The Body in Motion, vol. 20. Walter de Gruyter, Berlin (2013)
Karam, M.: A framework for research and design of gesture-based human-computer interactions. Ph.D. thesis, University of Southampton (2006)
Katz, P.S.: Evolution of central pattern generators and rhythmic behaviours. Philos. Trans. R. Soc. B 371(1685), 20150057 (2016)
Krippendorff, K.: Reliability in content analysis: some common misconceptions and recommendations. Hum. Commun. Res. 30(3), 411–433 (2004)
Krippendorff, K.: Content Analysis: An Introduction to Its Methodology. Sage, Thousand Oaks (2018)
McNeill, D.: Hand and Mind: What Gestures Reveal About Thought. University of Chicago Press, Chicago (1992)
Meng, S., Jin, S., Li, J., Hashimoto, K., Guo, S., Dai, S.: The analysis of human walking stability using ZMP in sagittal plane. In: 2017 IEEE International Conference on Cybernetics and Intelligent Systems (CIS) and IEEE Conference on Robotics, Automation and Mechatronics (RAM), pp. 496–501. IEEE (2017)
Meyns, P., Bruijn, S.M., Duysens, J.: The how and why of arm swing during human walking. Gait Posture 38(4), 555–562 (2013)
Punt, M., Bruijn, S.M., Wittink, H., van Dieën, J.H.: Effect of arm swing strategy on local dynamic stability of human gait. Gait Posture 41(2), 504–509 (2015)
Rautaray, S.S., Agrawal, A.: Vision based hand gesture recognition for human computer interaction: a survey. Artif. Intell. Rev. 43(1), 1–54 (2015)
Riemenschneider, H.: YACVID (2018). http://yacvid.hayko.at/. Accessed 01 Apr 2019
Ristani, E., Solera, F., Zou, R., Cucchiara, R., Tomasi, C.: Performance measures and a data set for multi-target, multi-camera tracking. In: Hua, G., Jégou, H. (eds.) ECCV 2016. LNCS, vol. 9914, pp. 17–35. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-48881-3_2
Ristani, E., Solera, F., Zou, R.S., Cucchiara, R., Tomasi, C.T.: DukeMTMC Project (2018). http://vision.cs.duke.edu/DukeMTMC/. Accessed 29 Mar 2019
Ross, M., Shaffer, H., Cohen, A., Freudberg, R., Manley, H.: Average magnitude difference function pitch extractor. IEEE Trans. ASSP 22(5), 353–362 (1974)
Salem, M., Kopp, S., Wachsmuth, I., Rohlfing, K., Joublin, F.: Generation and evaluation of communicative robot gesture. IJSR 2, 201–217 (2012)
Saponaro, G., Jamone, L., Bernardino, A., Salvi, G.: Interactive robot learning of gestures, language and affordances. In: GLU, pp. 83–87 (2017)
Sheikholeslami, S., Moon, A., Croft, E.A.: Cooperative gestures for industry: exploring the efficacy of robot hand configurations in expression of instructional gestures for human–robot interaction. IJRR 36(5–7), 699–720 (2017)
Simon-Martinez, C., et al.: Age-related changes in upper limb motion during typical development. PLoS ONE 13(6), e0198524 (2018)
Solera, F., Calderara, S., Ristani, E., Tomasi, C., Cucchiara, R.: Tracking social groups within and across cameras. IEEE Trans. Cir. Sys. Video Technol. 27(3), 441–453 (2017). https://doi.org/10.1109/TCSVT.2016.2607378
Tracy, J.L., Randles, D., Steckler, C.M.: The nonverbal communication of emotions. Curr. Opin. Behav. Sci. 3, 25–30 (2015)
Van Emmerik, R.E., Hamill, J., McDermott, W.J.: Variability and coordinative function in human gait. Quest 57(1), 102–123 (2005)
Vorochaeva, L.Y., Yatsun, A.S., Jatsun, S.F.: Controlling a quasistatic gait of an exoskeleton on the basis of the expert system. Trudy SPIIRAN 52, 70–94 (2017)
Yan, S., Xiong, Y., Lin, D.: Spatial temporal graph convolutional networks for skeleton-based action recognition. In: AAAI (2018)
Yücel, Z., Zanlungo, F., Shiomi, M.: Walk the talk: gestures in mobile interaction. In: ICSR, pp. 220–230 (2017)
Zanlungo, F., Yücel, Z., Brščić, D., Kanda, T., Hagita, N.: Intrinsic group behaviour: dependence of pedestrian dyad dynamics on principal social and personal features. PLoS ONE 12(11), e0187253 (2017)
Acknowledgments
This work was supported by JSPS KAKENHI Grant Numbers JP18K18168 and JP18H04121. We would like to thank S. Koyama, H. Nguyen, P. Supitayakul and T. Pramot for their help with annotation and F. Zanlungo for invaluable discussions.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Gregorj, A., Yücel, Z., Hara, S., Monden, A., Shiomi, M. (2019). A Signal Processing Perspective on Human Gait: Decoupling Walking Oscillations and Gestures. In: Ronzhin, A., Rigoll, G., Meshcheryakov, R. (eds) Interactive Collaborative Robotics. ICR 2019. Lecture Notes in Computer Science, vol 11659. Springer, Cham. https://doi.org/10.1007/978-3-030-26118-4_8
DOI: https://doi.org/10.1007/978-3-030-26118-4_8
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-26117-7
Online ISBN: 978-3-030-26118-4
eBook Packages: Computer Science, Computer Science (R0)