
A visual attention-based method to address the Midas Touch problem existing in gesture-based interaction

  • Original Article
  • Published in The Visual Computer

Abstract

The “Midas Touch” problem, in which a system misinterprets unintentional hand movements as commands, has long been a difficult challenge in gesture-based interaction. This paper proposes a visual attention-based method that addresses the problem from the perspective of cognitive psychology. The paper makes three main contributions: (1) a visual attention-based parallel perception model that combines top-down and bottom-up attention, (2) a framework that performs dynamic gesture spotting and recognition simultaneously, and (3) a gesture toolkit that facilitates gesture design and development. Experimental results show that the proposed method performs well on both isolated and continuous gesture recognition tasks. Finally, we highlight the implications of this work for the design and development of gesture-based applications.
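To make the idea concrete, the sketch below shows one way an attention gate of this kind could sit in front of a recognizer: a bottom-up channel (stimulus-driven motion saliency) and a top-down channel (a task-driven prior over where the interacting hand is expected) run in parallel, and gesture spotting and recognition only proceed when the combined score indicates intentional engagement. This is a minimal illustration of the general approach, not the authors' implementation; the function names, the Gaussian prior, the equal channel weighting, and the fixed threshold are all assumptions made for the example.

```python
import numpy as np

def bottom_up_saliency(frame_motion):
    """Stimulus-driven channel: mean motion energy in the current frame."""
    return float(np.mean(np.abs(frame_motion)))

def top_down_weight(hand_pos, roi_center, roi_radius):
    """Task-driven channel: a Gaussian prior favoring hands near the
    region where the interface currently expects interaction."""
    d = np.linalg.norm(np.asarray(hand_pos) - np.asarray(roi_center))
    return float(np.exp(-((d / roi_radius) ** 2)))

def attention_score(frame_motion, hand_pos, roi_center, roi_radius, alpha=0.5):
    """Blend the two channels computed in parallel; alpha sets the balance
    between bottom-up and top-down attention."""
    return (alpha * bottom_up_saliency(frame_motion)
            + (1.0 - alpha) * top_down_weight(hand_pos, roi_center, roi_radius))

def dispatch_gesture(frame_motion, hand_pos, recognize,
                     roi_center=(0.5, 0.5), roi_radius=0.25, threshold=0.6):
    """Gate the recognizer with the attention score: gestures are spotted
    and classified only when the user appears intentionally engaged, so
    incidental hand movement never triggers a command (no Midas Touch)."""
    if attention_score(frame_motion, hand_pos, roi_center, roi_radius) >= threshold:
        return recognize(hand_pos)  # spotting and recognition run together
    return None                     # movement is treated as unintentional
```

Here `recognize` stands in for any gesture classifier (for example, an HMM-based spotter), and a real system would tune the threshold and channel weighting per application rather than use these placeholder values.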


Acknowledgments

We gratefully acknowledge financial support from the National Natural Science Foundation of China (No. 61202344); the Fundamental Research Funds for the Central Universities, Sun Yat-Sen University (No. 1209119); the Special Project on the Integration of Industry, Education and Research of Guangdong Province (No. 2012B091000062); and the Fundamental Research Funds for the Central Universities, Tongji University (Nos. 0600219052 and 0600219053). We would also like to express our appreciation to the editor and reviewers.

Author information

Correspondence to Huiyue Wu.


Cite this article

Wu, H., Wang, J. A visual attention-based method to address the Midas Touch problem existing in gesture-based interaction. Vis Comput 32, 123–136 (2016). https://doi.org/10.1007/s00371-014-1060-0
