Deep Learning and Bayesian Networks for Labelling User Activity Context Through Acoustic Signals

  • Conference paper
Biomedical Applications Based on Natural and Artificial Computing (IWINAC 2017)

Abstract

Context awareness in autonomous robots is usually achieved by combining localization information, object identification, human interaction and time of day. We propose that gathering environmental sounds can improve context recognition. To that end, we have designed, developed and tested an Environment Recognition Component (ERC) that provides an extra input to our Context-Awareness Component (CAC) and increases the rate at which users’ activities are labeled correctly. The ERC uses convolutional neural networks to classify acoustic signals and passes this information to the CAC, which infers the user activity using a hierarchical Bayesian network. This paper evaluates the labeling process in two human-robot interaction (HRI) scenarios: one in which the robot and the user share a room, and one in which they are in different rooms. Labeling accuracy was higher when the acoustic information provided by the ERC was used.
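The idea of feeding a sound label into activity inference can be illustrated with a minimal sketch. It is not the hierarchical Bayesian network used in the paper: it assumes a flat, naive-Bayes-style model in which location and sound are conditionally independent given the activity. The activity names, location names, sound labels and every probability below are hypothetical values chosen for illustration only.

```python
# Minimal sketch: fuse a location cue and a sound label to label an activity.
# Assumption: a flat naive-Bayes model, NOT the paper's hierarchical network.
# All names and probabilities are hypothetical illustration values.

PRIOR = {"cooking": 0.3, "watching_tv": 0.4, "sleeping": 0.3}

# P(location | activity)
P_LOCATION = {
    "cooking": {"kitchen": 0.8, "living_room": 0.1, "bedroom": 0.1},
    "watching_tv": {"kitchen": 0.1, "living_room": 0.8, "bedroom": 0.1},
    "sleeping": {"kitchen": 0.05, "living_room": 0.15, "bedroom": 0.8},
}

# P(sound label | activity); the label would come from a sound classifier
P_SOUND = {
    "cooking": {"frying": 0.7, "speech": 0.2, "silence": 0.1},
    "watching_tv": {"frying": 0.05, "speech": 0.8, "silence": 0.15},
    "sleeping": {"frying": 0.02, "speech": 0.08, "silence": 0.9},
}


def posterior(location=None, sound=None):
    """Return P(activity | evidence); missing evidence is simply skipped."""
    scores = {}
    for activity, prior in PRIOR.items():
        score = prior
        if location is not None:
            score *= P_LOCATION[activity][location]
        if sound is not None:
            score *= P_SOUND[activity][sound]
        scores[activity] = score
    total = sum(scores.values())
    return {activity: score / total for activity, score in scores.items()}


def most_likely(location=None, sound=None):
    """Return the activity with the highest posterior probability."""
    post = posterior(location, sound)
    return max(post, key=post.get)
```

With no evidence, `most_likely()` falls back to the activity with the largest prior. Passing only `sound="frying"`, as might happen when the robot cannot observe the user's room, shifts the label to `"cooking"` under these made-up numbers.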

This work was partially supported by the Spanish Ministry of Economy and Competitiveness under grant TIN2016-76515-R.

Notes

  1. librosa: v0.3.1 library by B. McFee et al., doi:10.5281/zenodo.12714.

  2. https://www.aota.org/.

Author information

Corresponding author

Correspondence to Francisco J. Rodríguez Lera.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Rodríguez Lera, F.J., Rico, F.M., Matellán, V. (2017). Deep Learning and Bayesian Networks for Labelling User Activity Context Through Acoustic Signals. In: Ferrández Vicente, J., Álvarez-Sánchez, J., de la Paz López, F., Toledo Moreo, J., Adeli, H. (eds) Biomedical Applications Based on Natural and Artificial Computing. IWINAC 2017. Lecture Notes in Computer Science, vol 10338. Springer, Cham. https://doi.org/10.1007/978-3-319-59773-7_22

  • DOI: https://doi.org/10.1007/978-3-319-59773-7_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-59772-0

  • Online ISBN: 978-3-319-59773-7

  • eBook Packages: Computer Science (R0)
