Commanding a drone through body poses, improving the user experience

  • Original Paper
  • Published in: Journal on Multimodal User Interfaces

Abstract

In this work, we propose the use of Multimodal Human-Computer Interfaces (MHCI) based on body poses to command a drone in an easy and intuitive way. First, the pose of the human user is recovered from a video stream with the help of the open-source library OpenPose. Then, a Support Vector Classifier (SVC), trained to distinguish between different body poses, interprets eleven human poses as the most important high-level commands to the drone. The proposed strategy was implemented to remotely control a drone through a web interface, allowing the user to interact with a drone at a remote location using only a web browser. Real-time experiments were carried out with fourteen volunteers, selected to represent different segments of the population in terms of age, gender, experience with technology and socioeconomic background, in order to evaluate the user experience with the help of the User Experience Questionnaire (UEQ), showing satisfactory results. The study suggests that the proposed MHCI was well accepted among the participants, even those without previous experience with drones, and received excellent scores in attractiveness, stimulation and novelty from most of the volunteers.
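The pipeline described above (pose estimation with OpenPose followed by an SVC that maps each detected pose to a high-level drone command) can be illustrated with a minimal sketch. The code below is an assumption-laden illustration, not the authors' implementation: the command vocabulary, the BODY_25 keypoint layout, the neck-relative normalization and all names are hypothetical, and the training data is replaced by random placeholders.

# Minimal sketch (assumed, not the authors' code): classify OpenPose body
# keypoints into high-level drone commands with a Support Vector Classifier.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical command vocabulary; the paper distinguishes eleven poses/commands.
COMMANDS = ["take_off", "land", "up", "down", "left", "right",
            "forward", "backward", "rotate_cw", "rotate_ccw", "hover"]

def keypoints_to_features(keypoints_xy):
    """Flatten a (25, 2) array of OpenPose BODY_25 keypoints (x, y) into a
    feature vector, expressed relative to the neck joint so the classifier
    is insensitive to where the user stands in the image."""
    kp = np.asarray(keypoints_xy, dtype=float)
    neck = kp[1]                      # index 1 is the neck in BODY_25
    return (kp - neck).ravel()

# Placeholder training set; in practice these would be labeled pose recordings.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(220, 50))            # 220 samples, 25 joints x 2 coords
y_train = rng.choice(COMMANDS, size=220)

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
clf.fit(X_train, y_train)

# At run time: take the keypoints of the current frame (stand-in shown here),
# convert them to features and send the predicted command to the drone.
current_pose = rng.normal(size=(25, 2))
command = clf.predict([keypoints_to_features(current_pose)])[0]
print("command to drone:", command)

Normalizing the keypoints with respect to a reference joint is one simple way to make such a classifier tolerant to the user's position in the frame; the actual features and kernel used in the paper may differ.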

Availability of data and materials

Video available at https://youtu.be/aUho-uN1DzM.

Code availability

Not applicable.

Funding

This work was supported by the Mexican National Council of Science and Technology (CONACYT) and the FORDECyT project 296737 “Consorcio en Inteligencia Artificial”.

Author information

Contributions

All authors contributed to the study conception and design. The first and fourth authors were in charge of the system implementation under the supervision of the second and fifth authors. The third author developed the Support Vector Classifier used to distinguish between the different body poses. The first and fourth authors performed the experiments with human users and evaluated the user experience. The paper was written mainly by the first, second, third and fifth authors. Finally, the corresponding author was in charge of the overall project supervision. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Diego Mercado-Ravell.

Ethics declarations

Conflict of interest

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Yam-Viramontes, B., Cardona-Reyes, H., González-Trejo, J. et al. Commanding a drone through body poses, improving the user experience. J Multimodal User Interfaces 16, 357–369 (2022). https://doi.org/10.1007/s12193-022-00396-0

