Abstract
In this work, we propose a Multimodal Human-Computer Interface (MHCI) that uses body poses to command a drone in an easy and intuitive way. First, the user's pose is recovered from a video stream with the help of the open-source library OpenPose. Then, a Support Vector Classifier (SVC), trained to distinguish between different body poses, interprets eleven human poses as the most important high-level drone commands. The proposed strategy was successfully implemented so that a user can interact with a drone at a remote location using only a web interface. Real-time experiments were carried out with fourteen volunteers, selected to represent different segments of the population in terms of age, gender, experience with technology, socioeconomic class, etc., in order to evaluate the user experience with the help of a User Experience Questionnaire (UEQ), with satisfactory results. The study suggests that the proposed MHCI was well accepted by the participants, even those without previous drone experience, and it received excellent scores in attractiveness, stimulation, and novelty from most of the volunteers.
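To make the pipeline concrete, the following Python snippet is a minimal, illustrative sketch of the classification step described above, not the authors' implementation: it assumes OpenPose BODY_25 keypoints are already available as (x, y) coordinates, uses scikit-learn's SVC, and the command names, normalization, and training data are hypothetical placeholders.

```python
# Sketch: mapping OpenPose body keypoints to drone commands with an SVC.
# Command labels and preprocessing are illustrative assumptions only.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical command set (the paper maps eleven poses to commands).
COMMANDS = ["take_off", "land", "up", "down", "left", "right",
            "forward", "backward", "rotate_cw", "rotate_ccw", "hover"]

def normalize_keypoints(keypoints):
    """Flatten 25 (x, y) BODY_25 keypoints, translated so the neck joint
    (BODY_25 index 1) is the origin and scaled to a unit-sized skeleton,
    making the features invariant to the user's position in the image."""
    kp = np.asarray(keypoints, dtype=float).reshape(-1, 2)
    kp = kp - kp[1]                               # neck at the origin
    scale = np.linalg.norm(kp, axis=1).max() or 1.0
    return (kp / scale).ravel()

# X_train: one normalized keypoint vector per labeled frame;
# y_train: the command index. Random placeholders stand in for real data.
X_train = np.random.rand(550, 50)
y_train = np.random.randint(0, len(COMMANDS), 550)

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
clf.fit(X_train, y_train)

# At runtime, each incoming skeleton is mapped to a high-level command.
frame_keypoints = np.random.rand(25, 2)           # stand-in OpenPose output
command = COMMANDS[int(clf.predict([normalize_keypoints(frame_keypoints)])[0])]
print("Detected command:", command)
```

An SVC is a plausible fit for this task because the feature vectors are low-dimensional, the training set of labeled poses is small, and inference is fast enough for real-time command interpretation; the predicted command would then be forwarded to the drone, e.g. over a ROS topic bridged to the web interface.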
Availability of data and materials
Video available at https://youtu.be/aUho-uN1DzM.
Code availability
Not applicable.
Funding
This work was supported by the Mexican National Council of Science and Technology (CONACYT) and the FORDECyT project 296737 “Consorcio en Inteligencia Artificial”.
Author information
Authors and Affiliations
Contributions
All authors contributed to the study conception and design. The first and fourth authors were in charge of the system implementation under the supervision of the second and fifth authors. The third author developed the Support Vector Classifier to distinguish between the different body poses. The first and fourth authors performed the experiments with human users and evaluated the user experience. The paper was written mainly by the first, second, third, and fifth authors. Finally, the corresponding author was in charge of the overall project supervision. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Conflict of interest
Not applicable.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Yam-Viramontes, B., Cardona-Reyes, H., González-Trejo, J. et al. Commanding a drone through body poses, improving the user experience. J Multimodal User Interfaces 16, 357–369 (2022). https://doi.org/10.1007/s12193-022-00396-0