
Towards a Natural Human-Robot Interaction in an Industrial Environment

Part of the book series: Lecture Notes in Electrical Engineering ((LNEE,volume 704))

Abstract

Nowadays, modern industry has adopted robots as part of its processes. In many scenarios, such machines collaborate with humans to perform specific tasks in the same environment, or simply guide them in a natural, safe and efficient way. Our approach improves on previous work on a multi-modal human-robot interaction system with different audio acquisition and speech recognition modules for more natural communication. The semantic interpreter, with the aid of a knowledge manager, parses the resulting transcription and, using contextual information, selects the order that the operator has uttered and sends it to the robot to be executed. This setup is evaluated, both quantitatively and qualitatively, in a real manufacturing scenario in a laboratory environment with a large set of end users. The gathered results reveal that the system behaves robustly and that end users considered the assignment manageable, whilst the system overall was received with a high level of trust and usability.
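
The interpretation step described in the abstract (transcription in, robot order out, with context filling the gaps) can be illustrated with a minimal sketch. This is not the authors' implementation: the keyword tables, the `Context` class and the `interpret` function are all hypothetical names chosen for illustration, and simple keyword matching stands in for the semantic interpreter and knowledge manager.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical knowledge base (an assumption, not taken from the chapter):
# each robot order is mapped to the words that may trigger it.
KNOWN_ORDERS = {
    "pick": {"pick", "grab", "take"},
    "place": {"place", "put", "leave"},
    "stop": {"stop", "halt"},
}

# Hypothetical set of objects the robot can manipulate.
KNOWN_OBJECTS = {"screwdriver", "box", "part"}


@dataclass
class Context:
    """Hypothetical contextual information: the object last referred to."""
    last_object: Optional[str] = None


def interpret(transcription: str, context: Context) -> Optional[dict]:
    """Map a speech transcription to an order, or None if nothing matches."""
    # Lowercase and strip trailing punctuation from each token.
    tokens = [t.strip(".,!?") for t in transcription.lower().split()]
    token_set = set(tokens)

    # Select the first order whose trigger words appear in the utterance.
    action = next(
        (name for name, triggers in KNOWN_ORDERS.items() if triggers & token_set),
        None,
    )
    if action is None:
        return None

    # Resolve the target object, falling back on context ("put it there").
    obj = next((t for t in tokens if t in KNOWN_OBJECTS), None)
    if obj is None:
        obj = context.last_object
    else:
        context.last_object = obj

    return {"action": action, "object": obj}
```

In this sketch, an utterance such as "put it there" names no object, so the order inherits the object from the previous utterance stored in `Context`; a real system would send the resulting order to the robot rather than return a dictionary.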

Acknowledgements

This work was supported by the Department of Economic Development and Competitiveness of the Basque Government via the LangileOK project.

Author information

Corresponding author

Correspondence to Ander González-Docasal.

Copyright information

© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

González-Docasal, A., Aceta, C., Arzelus, H., Álvarez, A., Fernández, I., Kildal, J. (2021). Towards a Natural Human-Robot Interaction in an Industrial Environment. In: D'Haro, L.F., Callejas, Z., Nakamura, S. (eds) Conversational Dialogue Systems for the Next Decade. Lecture Notes in Electrical Engineering, vol 704. Springer, Singapore. https://doi.org/10.1007/978-981-15-8395-7_18

  • DOI: https://doi.org/10.1007/978-981-15-8395-7_18

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-8394-0

  • Online ISBN: 978-981-15-8395-7

  • eBook Packages: Engineering, Engineering (R0)
