Deep Learning and Bayesian Networks for Labelling User Activity Context Through Acoustic Signals

Rodríguez Lera, Francisco J.; Rico, Francisco Martín; Matellán, Vicente

doi:10.1007/978-3-319-59773-7_22

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10338)

Included in the following conference series:

International Work-Conference on the Interplay Between Natural and Artificial Computation

Abstract

Context awareness in autonomous robots is usually achieved by combining localization information, object identification, human interaction, and the time of day. We think that gathering environmental sounds can improve context recognition. To that end, we have designed, developed, and tested an Environment Recognition Component (ERC) that provides an extra input to our Context-Awareness Component (CAC) and increases the rate at which users' activities are labeled correctly. The ERC uses convolutional neural networks to classify acoustic signals and passes the result to the CAC, which infers the user activity using a hierarchical Bayesian network. This paper evaluates the results of the labeling process in two human-robot interaction (HRI) scenarios: when the robot and the user share a room, and when the human and the robot are in different rooms. The results showed better accuracy when the acoustic signals classified by the ERC were used.
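
To make the role of the acoustic input concrete, the following is a minimal sketch of how a sound label could shift an activity estimate. It is not the authors' implementation: it replaces the hierarchical Bayesian network with a flat naive Bayes model, and every activity name, location, sound class, and probability in it is a hypothetical value chosen for illustration. The sound label stands in for the class a CNN-based classifier would output.

```python
# Illustrative sketch only. All labels and probabilities are invented
# (hypothetical); none of them come from the paper. A flat naive Bayes model
# is used here in place of the paper's hierarchical Bayesian network.

# Hypothetical prior over user activities, P(activity).
PRIOR = {"cooking": 0.3, "watching_tv": 0.4, "sleeping": 0.3}

# Hypothetical likelihoods, P(evidence value | activity), per evidence source.
LIKELIHOOD = {
    "location": {
        "kitchen": {"cooking": 0.80, "watching_tv": 0.10, "sleeping": 0.05},
        "living_room": {"cooking": 0.15, "watching_tv": 0.80, "sleeping": 0.15},
        "bedroom": {"cooking": 0.05, "watching_tv": 0.10, "sleeping": 0.80},
    },
    # The sound class plays the role of the acoustic classifier's output.
    "sound": {
        "frying": {"cooking": 0.70, "watching_tv": 0.05, "sleeping": 0.02},
        "tv_audio": {"cooking": 0.20, "watching_tv": 0.85, "sleeping": 0.08},
        "silence": {"cooking": 0.10, "watching_tv": 0.10, "sleeping": 0.90},
    },
}


def posterior(evidence):
    """Return P(activity | evidence), assuming evidence sources are
    conditionally independent given the activity (naive Bayes)."""
    scores = {}
    for activity, prior in PRIOR.items():
        score = prior
        for source, value in evidence.items():
            score *= LIKELIHOOD[source][value][activity]
        scores[activity] = score
    total = sum(scores.values())
    return {activity: score / total for activity, score in scores.items()}


def most_likely(evidence):
    """Return the activity label with the highest posterior probability."""
    probs = posterior(evidence)
    return max(probs, key=probs.get)


if __name__ == "__main__":
    # Location evidence alone versus location plus a sound label.
    print(most_likely({"location": "living_room"}))
    print(most_likely({"location": "living_room", "sound": "frying"}))
```

With these invented numbers, location evidence alone favors one activity, while adding the sound label moves the estimate to another. This is the kind of effect the abstract attributes to the extra acoustic input, shown here only qualitatively.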

This work was partially supported by the Spanish Ministry of Economy and Competitiveness under grant TIN2016-76515-R.

Notes

1. librosa: v0.3.1 library by B. McFee et al., doi:10.5281/zenodo.12714.
2. https://www.aota.org/.

Author information

Authors and Affiliations

AI Robolab, University of Luxembourg, Luxembourg, Luxembourg
Francisco J. Rodríguez Lera
Universidad Rey Juan Carlos, Madrid, Spain
Francisco Martín Rico
Robotics Group, Universidad de León, León, Spain
Vicente Matellán

Corresponding author

Correspondence to Francisco J. Rodríguez Lera.

Editor information

Editors and Affiliations

Departamento de Electrónica, Tecnología de Computadoras y Proyectos, Universidad Politécnica de Cartagena, Cartagena, Spain
José Manuel Ferrández Vicente
Departamento de Inteligencia Artificial, Universidad Nacional de Educación a Distancia, Madrid, Spain
José Ramón Álvarez-Sánchez
Departamento de Inteligencia Artificial, Universidad Nacional de Educación a Distancia, Madrid, Spain
Félix de la Paz López
Departamento de Electrónica, Tecnología de Computadoras y Proyectos, Universidad Politécnica de Cartagena, Cartagena, Spain
Javier Toledo Moreo
The Ohio State University, Columbus, Ohio, USA
Hojjat Adeli

About this paper

Cite this paper

Rodríguez Lera, F.J., Rico, F.M., Matellán, V. (2017). Deep Learning and Bayesian Networks for Labelling User Activity Context Through Acoustic Signals. In: Ferrández Vicente, J., Álvarez-Sánchez, J., de la Paz López, F., Toledo Moreo, J., Adeli, H. (eds) Biomedical Applications Based on Natural and Artificial Computing. IWINAC 2017. Lecture Notes in Computer Science, vol 10338. Springer, Cham. https://doi.org/10.1007/978-3-319-59773-7_22

DOI: https://doi.org/10.1007/978-3-319-59773-7_22
Published: 27 May 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-59772-0
Online ISBN: 978-3-319-59773-7
eBook Packages: Computer Science, Computer Science (R0)
