Multimodal Object Analysis with Auditory and Tactile Sensing Using Recurrent Neural Networks

  • Conference paper
Cognitive Systems and Signal Processing (ICCSIP 2020)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1397)

Abstract

Robots are usually equipped with many different sensors that need to be integrated. While most research focuses on integrating vision with other senses, we successfully integrate tactile and auditory sensor data from a complex robotic system. We train and evaluate a neural network that classifies the contents of eight optically identical medicine containers. To investigate the relevance of the tactile modality for classification under realistic conditions, we apply different noise levels to the audio data. Our results show that the combined multimodal network is significantly more robust to acoustic noise than its unimodal, audio-based counterpart.

This research was funded by the German Research Foundation (DFG) and the National Science Foundation of China in project Crossmodal Learning, TRR-169. It is also partially financed by the H2020-MSCA-RISE Project ULTRACEPT. Manfred Eppe acknowledges support via the DFG-funded IDEAS (EP 143/2-1) and LeCAREbot (EP 143/4-1) projects.

Notes

  1. To enable control over the acoustic noise, we used a high-quality external microphone and added separately recorded noise of the robot to the signal during the evaluation.
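The procedure in this note, adding a separately recorded noise track to a clean recording at a controlled level, can be sketched as follows. This is a minimal illustration, not the authors' implementation: expressing the noise level as a signal-to-noise ratio in dB is an assumption, the function names `rms` and `mix_at_snr` are hypothetical, and the sine tone and Gaussian noise are synthetic stand-ins for the real microphone and robot recordings.

```python
import math
import random


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_at_snr(signal, noise, snr_db):
    """Add `noise` to `signal`, scaled so the mixture has the requested SNR (dB).

    Assumption: the noise level is specified as an SNR; the source only says
    that different noise levels were applied.
    """
    if len(noise) < len(signal):
        raise ValueError("noise recording must be at least as long as the signal")
    noise = noise[:len(signal)]
    noise_rms = rms(noise)
    if noise_rms == 0.0:
        return list(signal)  # silent noise track: nothing to add
    # SNR_dB = 20 * log10(rms_signal / rms_noise), solved for the noise RMS.
    target_noise_rms = rms(signal) / (10.0 ** (snr_db / 20.0))
    gain = target_noise_rms / noise_rms
    return [s + gain * n for s, n in zip(signal, noise)]


# Synthetic stand-ins (hypothetical data, one second at 16 kHz).
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
robot_noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

noisy = mix_at_snr(clean, robot_noise, snr_db=10.0)
```

Because the noise is recorded separately from the clean signal, the same clean sample can be reused across several noise levels, so any difference in classification accuracy can be attributed to the noise alone.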

Author information

Corresponding author

Correspondence to Yannick Jonetzko.

Copyright information

© 2021 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Jonetzko, Y., Fiedler, N., Eppe, M., Zhang, J. (2021). Multimodal Object Analysis with Auditory and Tactile Sensing Using Recurrent Neural Networks. In: Sun, F., Liu, H., Fang, B. (eds) Cognitive Systems and Signal Processing. ICCSIP 2020. Communications in Computer and Information Science, vol 1397. Springer, Singapore. https://doi.org/10.1007/978-981-16-2336-3_23

  • DOI: https://doi.org/10.1007/978-981-16-2336-3_23

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-16-2335-6

  • Online ISBN: 978-981-16-2336-3

  • eBook Packages: Computer Science (R0)