Robust Online Gesture Recognition with Crowdsourced Annotations

Abstract

Crowdsourcing is a promising way to reduce the effort of collecting annotations for training gesture recognition systems. However, crowdsourced annotations suffer from “noise” such as mislabeling or inaccurate identification of the start and end times of gesture instances. In this paper we present SegmentedLCSS and WarpingLCSS, two template-matching methods that remain robust when trained with noisy crowdsourced annotations to spot gestures from wearable motion sensors. The methods quantize signals into strings of characters and then apply variations of the longest common subsequence algorithm (LCSS) to spot gestures. We compare the noise robustness of our methods against baselines which use dynamic time warping (DTW) and support vector machines (SVM). The experiments are performed on data sets with various gesture classes (10–17 classes) recorded from accelerometers on arms, with both real and synthetic crowdsourced annotations. WarpingLCSS has similar or better performance than the baselines in the absence of noisy annotations. In the presence of 60% mislabeled instances, WarpingLCSS outperformed SVM by 22% F1-score and outperformed DTW-based methods by 36% F1-score on average. SegmentedLCSS yields performance similar to WarpingLCSS, but it runs one order of magnitude slower. Additionally, we show how to use our methods to filter out the noise in the crowdsourced annotations before training a traditional classifier. The filtering increases the performance of SVM by 20% F1-score and of DTW-based methods by 8% F1-score on average with the noisy real crowdsourced annotations.
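
To make the "quantize, then match with LCSS" idea concrete, the following is a minimal sketch and not the chapter's SegmentedLCSS or WarpingLCSS. It assumes a one-dimensional signal, uniform binning into at most 26 letters, and a similarity score normalized by the longer string; the function names `quantize`, `lcss_length` and `similarity` are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def quantize(signal: Sequence[float], lo: float, hi: float, levels: int = 8) -> str:
    """Map each sample to a letter by uniform binning over [lo, hi].

    Uniform binning is an illustrative assumption, not the chapter's scheme.
    """
    if hi <= lo or not 1 <= levels <= 26:
        raise ValueError("need hi > lo and 1 <= levels <= 26")
    width = (hi - lo) / levels
    chars = []
    for x in signal:
        idx = int((x - lo) / width)
        idx = min(max(idx, 0), levels - 1)  # clamp out-of-range samples
        chars.append(chr(ord("a") + idx))
    return "".join(chars)


def lcss_length(a: str, b: str) -> int:
    """Length of the longest common subsequence (textbook dynamic program)."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity(template: str, window: str) -> float:
    """LCSS length normalized by the longer string (an assumed normalization)."""
    if not template or not window:
        return 0.0
    return lcss_length(template, window) / max(len(template), len(window))
```

In a spotting setting, one would compare a quantized template of a gesture class against a quantized window of the incoming stream and report a gesture when `similarity` exceeds a class-specific threshold. Because a subsequence match tolerates symbols that do not fit, a template built from an instance with imprecise start or end times can still match well, which is the intuition behind the robustness to annotation noise.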

Editors: Isabelle Guyon, Vassilis Athitsos and Sergio Escalera.

Notes

  1. The home page for AMT is http://www.mturk.com.

  2. The home page for Crowdflower is http://crowdflower.com.

  3. Skoda and Opportunity data sets can be downloaded from http://www.wearable.ethz.ch/resources/Dataset.

Acknowledgements

The authors would like to thank Dr. Daniel Roggen (University of Sussex) for his useful comments. This work has been supported by the Swiss Hasler Foundation project Smart-DAYS.

Corresponding author

Correspondence to Long-Van Nguyen-Dinh.

Copyright information

© 2017 Springer International Publishing AG

About this chapter

Cite this chapter

Nguyen-Dinh, LV., Calatroni, A., Tröster, G. (2017). Robust Online Gesture Recognition with Crowdsourced Annotations. In: Escalera, S., Guyon, I., Athitsos, V. (eds) Gesture Recognition. The Springer Series on Challenges in Machine Learning. Springer, Cham. https://doi.org/10.1007/978-3-319-57021-1_18

  • DOI: https://doi.org/10.1007/978-3-319-57021-1_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-57020-4

  • Online ISBN: 978-3-319-57021-1

  • eBook Packages: Computer Science (R0)
