Abstract
Crowdsourcing is a promising way to reduce the effort of collecting annotations for training gesture recognition systems. Crowdsourced annotations suffer from "noise" such as mislabeling or inaccurate identification of the start and end times of gesture instances. In this paper we present SegmentedLCSS and WarpingLCSS, two template-matching methods that remain robust when trained with noisy crowdsourced annotations to spot gestures from wearable motion sensors. The methods quantize signals into strings of characters and then apply variations of the longest common subsequence algorithm (LCSS) to spot gestures. We compare the noise robustness of our methods against baselines based on dynamic time warping (DTW) and support vector machines (SVM). The experiments are performed on data sets with various gesture classes (10–17 classes) recorded from accelerometers on the arms, with both real and synthetic crowdsourced annotations. WarpingLCSS performs similarly to or better than the baselines in the absence of noisy annotations. In the presence of 60% mislabeled instances, WarpingLCSS outperformed SVM by 22% F1-score and DTW-based methods by 36% F1-score on average. SegmentedLCSS yields similar performance to WarpingLCSS, but runs one order of magnitude slower. Additionally, we show how to use our methods to filter out the noise in crowdsourced annotations before training a traditional classifier. The filtering increases the performance of SVM by 20% F1-score and of DTW-based methods by 8% F1-score on average on the noisy real crowdsourced annotations.
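The core idea described in the abstract — quantizing a motion signal into a string and spotting gestures via the longest common subsequence — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the uniform amplitude binning in `quantize` and the fixed matching threshold are assumptions (the chapter's quantizer and thresholds are learned from training data).

```python
import numpy as np

def quantize(signal, levels):
    """Map each sample to a character by uniform amplitude binning.
    Hypothetical quantizer; the paper learns its codebook from data."""
    lo, hi = float(signal.min()), float(signal.max())
    edges = np.linspace(lo, hi, levels + 1)[1:-1]  # interior bin edges
    idx = np.digitize(signal, edges)               # bin index per sample
    return "".join(chr(ord("a") + i) for i in idx)

def lcss(s, t):
    """Classic dynamic-programming longest common subsequence length."""
    m, n = len(s), len(t)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s[i - 1] == t[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]

def matches_template(window, template, threshold=0.7):
    """Spot a gesture when the template-normalized LCSS score is high enough.
    The 0.7 threshold is illustrative only."""
    score = lcss(window, template) / len(template)
    return score >= threshold
```

Because LCSS counts only matching characters and skips insertions freely, a mislabeled or sloppily segmented training instance distorts the template score less than it distorts a distance-based measure like DTW, which is the intuition behind the robustness results reported above.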
Editors: Isabelle Guyon, Vassilis Athitsos and Sergio Escalera.
Notes
1. The home page for AMT is http://www.mturk.com.
2. The home page for Crowdflower is http://crowdflower.com.
3. The Skoda and Opportunity data sets can be downloaded from http://www.wearable.ethz.ch/resources/Dataset.
Acknowledgements
The authors would like to thank Dr. Daniel Roggen (University of Sussex) for his useful comments. This work has been supported by the Swiss Hasler Foundation project Smart-DAYS.
Copyright information
© 2017 Springer International Publishing AG
About this chapter
Cite this chapter
Nguyen-Dinh, LV., Calatroni, A., Tröster, G. (2017). Robust Online Gesture Recognition with Crowdsourced Annotations. In: Escalera, S., Guyon, I., Athitsos, V. (eds) Gesture Recognition. The Springer Series on Challenges in Machine Learning. Springer, Cham. https://doi.org/10.1007/978-3-319-57021-1_18
DOI: https://doi.org/10.1007/978-3-319-57021-1_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-57020-4
Online ISBN: 978-3-319-57021-1