Multimedia content analysis on gesture event detection for a SMART TV Keyboard application

Togootogtokh, Enkhtogtokh; Shih, Timothy K.

doi:10.1007/s11042-016-3385-3

Published: 05 March 2016

Volume 76, pages 7341–7363, (2017)

Multimedia Tools and Applications

Enkhtogtokh Togootogtokh¹ &
Timothy K. Shih¹


Abstract

We propose an effective machine learning method for analyzing multimedia content that addresses gesture event detection and recognition. The method builds on well-studied techniques such as Procrustes Analysis, a Combination of Local and Global Representations, and a Linear Shape Model, and we apply it to a SMART TV Virtual Keyboard. In this paper, we address gesture event detection, specifically fingertip gesture detection, to enable smarter and more advanced use of the technology. Our vision-based keyboard could be a good next-generation replacement for the SMART TV remote control. It can also be more economical, since it needs no physical objects such as a traditional keyboard or remote control, nor energy sources such as batteries. More information and demonstrations of the proposed keyboard can be accessed at http://video.minelab.tw/MCAoGED/.
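The abstract names Procrustes Analysis as one building block but does not spell out the computation. As a rough illustration only, the following is a minimal sketch of ordinary Procrustes alignment of two 2D landmark sets (for example, points along a hand contour), assuming NumPy is available. The function name `procrustes_align`, the sample landmark coordinates, and the choice to exclude reflections are assumptions made for this sketch; it is not the authors' implementation.

```python
import numpy as np


def procrustes_align(reference, shape):
    """Align `shape` to `reference` by translation, uniform scale, and rotation.

    Both inputs are (n_points, 2) arrays of corresponding landmarks.
    Returns the aligned copy of `shape` and the residual sum of squares.
    (Hypothetical helper for illustration; not taken from the paper.)
    """
    ref = np.asarray(reference, dtype=float)
    shp = np.asarray(shape, dtype=float)
    if ref.shape != shp.shape:
        raise ValueError("landmark sets must have the same shape")

    # Remove translation by centring both shapes on their centroids.
    ref_mean = ref.mean(axis=0)
    ref_c = ref - ref_mean
    shp_c = shp - shp.mean(axis=0)

    norm_sq = np.sum(shp_c ** 2)
    if norm_sq == 0.0:
        raise ValueError("shape is degenerate (all landmarks coincide)")

    # Optimal rotation from the SVD of the 2x2 cross-covariance matrix.
    u, s, vt = np.linalg.svd(shp_c.T @ ref_c)
    # Assumption: allow proper rotations only, so flip the last axis
    # whenever the unconstrained solution would be a reflection.
    d = np.sign(np.linalg.det(u @ vt))
    correction = np.diag([1.0, d])
    rotation = u @ correction @ vt

    # Least-squares uniform scale for the chosen rotation.
    scale = np.sum(s * np.diag(correction)) / norm_sq

    aligned = scale * shp_c @ rotation + ref_mean
    residual = np.sum((aligned - ref) ** 2)
    return aligned, residual


# Made-up landmarks: a reference shape and a rotated, scaled, shifted copy.
theta = np.deg2rad(30.0)
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta), np.cos(theta)]])
reference = np.array([[0.0, 0.0], [1.0, 0.2], [2.0, 1.5], [1.0, 3.0], [0.0, 2.0]])
observed = 2.5 * reference @ rot + np.array([4.0, -1.0])

aligned, residual = procrustes_align(reference, observed)
```

Because `observed` differs from `reference` only by a similarity transform, the aligned landmarks should coincide with the reference up to floating-point error. On real landmarks the residual stays positive and can serve as a shape distance.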

Author information

Authors and Affiliations

MINE Lab, Department of Computer Science and Information Engineering, National Central University (NCU), No. 300, Jhongda Rd., Jhongli City, Taoyuan County, 32001, Taiwan, China
Enkhtogtokh Togootogtokh & Timothy K. Shih

Corresponding author

Correspondence to Enkhtogtokh Togootogtokh.

About this article

Cite this article

Togootogtokh, E., Shih, T.K. Multimedia content analysis on gesture event detection for a SMART TV Keyboard application. Multimed Tools Appl 76, 7341–7363 (2017). https://doi.org/10.1007/s11042-016-3385-3

Received: 09 March 2015
Revised: 28 December 2015
Accepted: 23 February 2016
Published: 05 March 2016
Issue Date: March 2017
DOI: https://doi.org/10.1007/s11042-016-3385-3
