Abstract
Gesture recognition is a technology often used in human-computer interaction applications. Dynamic time warping (DTW) is one of the techniques used in gesture recognition to find an optimal alignment between two sequences. The sequences often need pre-processing to remove variations caused by different camera or body orientations, or by different skeleton sizes, between the reference gesture sequences and the test gesture sequences. We discuss a set of pre-processing methods that make the gesture recognition mechanism robust to these variations. DTW computes a dissimilarity measure by time-warping the sequences sample by sample, using the distance between the current reference and test samples. However, not all body joints involved in a gesture are equally important in computing the distance between two sequence samples. We propose a weighted DTW method that weights joints by optimizing a discriminant ratio. Finally, we demonstrate the performance of our pre-processing and weighted DTW method, and compare our results with conventional DTW and state-of-the-art methods.
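The two ideas summarized above, normalizing skeleton frames and weighting joints inside the DTW local cost, can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the paper: the function names, the choice of a root joint and a reference joint for normalization, and the weight values are all hypothetical, and the weights are set by hand rather than obtained by optimizing a discriminant ratio.

```python
import math


def center_and_scale(frame, root=0, ref=1):
    """Translate a frame so the root joint is the origin and divide by the
    root-to-ref joint distance (an assumed proxy for skeleton size)."""
    rx, ry, rz = frame[root]
    size = math.dist(frame[root], frame[ref])
    if size == 0:
        raise ValueError("root and ref joints coincide; cannot scale")
    return [((x - rx) / size, (y - ry) / size, (z - rz) / size)
            for x, y, z in frame]


def weighted_frame_distance(ref_frame, test_frame, weights):
    """Weighted sum of per-joint Euclidean distances between two frames."""
    return sum(w * math.dist(p, q)
               for w, p, q in zip(weights, ref_frame, test_frame))


def weighted_dtw(ref_seq, test_seq, weights):
    """Standard O(N*M) dynamic-programming DTW with a joint-weighted cost."""
    n, m = len(ref_seq), len(test_seq)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = weighted_frame_distance(ref_seq[i - 1], test_seq[j - 1], weights)
            # Extend the cheapest of the three admissible predecessor cells.
            cost[i][j] = d + min(cost[i - 1][j],      # repeat test sample
                                 cost[i][j - 1],      # repeat reference sample
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


# Toy data: joint 0 is a hand moving along x, joint 1 is an irrelevant,
# jittery joint. The test gesture is a slower copy of the reference.
ref_seq = [[(0, 0, 0), (0, 0, 0)],
           [(1, 0, 0), (0, 0, 0)],
           [(2, 0, 0), (0, 0, 0)]]
test_seq = [[(0, 0, 0), (0, 0.5, 0)],
            [(0, 0, 0), (0, -0.5, 0)],
            [(1, 0, 0), (0, 0.5, 0)],
            [(2, 0, 0), (0, -0.5, 0)]]

uniform = weighted_dtw(ref_seq, test_seq, [1.0, 1.0])
hand_only = weighted_dtw(ref_seq, test_seq, [1.0, 0.0])
print(uniform, hand_only)
```

With uniform weights the jitter of the irrelevant joint inflates the dissimilarity, while down-weighting that joint lets the time-warped copy match the reference. The same frames can be passed through `center_and_scale` first when the two recordings differ in position or skeleton size.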








Acknowledgements
We would like to thank all the Sehir University students who participated in our gesture database recordings and patiently performed all the gestures to help with our experiments.
Cite this article
Arici, T., Celebi, S., Aydin, A.S. et al. Robust gesture recognition using feature pre-processing and weighted dynamic time warping. Multimed Tools Appl 72, 3045–3062 (2014). https://doi.org/10.1007/s11042-013-1591-9
DOI: https://doi.org/10.1007/s11042-013-1591-9