Abstract
In this work, we propose an architecture for predicting plausible person-object interactions based on the objects visible in an image and on room recognition. First, the system detects objects in the video using the popular YOLO (You Only Look Once) framework and associates each object with its possible interactions. Then, using a convolutional neural network, our algorithm recognizes the room that appears in the image and filters the candidate interactions down to those that are plausible in that context. The main purpose of this project is to help people with memory impairments perform daily activities. Many people have difficulty carrying out actions that come naturally to others. To assist them, we are interested in developing methods that remind them of the actions they may have forgotten.
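The filtering step described above can be sketched as follows. This is a minimal, hypothetical illustration only: the object-to-interaction and room-to-interaction mappings, function names, and labels are all assumptions for the sake of the example, not the paper's actual data or API. It assumes object labels come from a detector such as YOLO and the room label from a scene-recognition CNN.

```python
# Illustrative sketch of context-aware interaction filtering.
# All mappings and names below are hypothetical examples.

# Candidate interactions associated with each detected object class.
OBJECT_INTERACTIONS = {
    "cup": ["drink from", "wash", "fill"],
    "bed": ["sleep in", "make"],
    "oven": ["cook with", "turn off"],
}

# Interactions considered plausible in each recognized room.
ROOM_INTERACTIONS = {
    "kitchen": {"drink from", "wash", "fill", "cook with", "turn off"},
    "bedroom": {"sleep in", "make", "drink from"},
}

def plausible_interactions(detected_objects, room):
    """Map detected objects to interactions, keeping only those
    allowed in the recognized room."""
    allowed = ROOM_INTERACTIONS.get(room, set())
    result = {}
    for obj in detected_objects:
        actions = [a for a in OBJECT_INTERACTIONS.get(obj, []) if a in allowed]
        if actions:
            result[obj] = actions
    return result

# Example: a cup and a bed detected in an image recognized as a kitchen;
# bedroom-only interactions are filtered out.
print(plausible_interactions(["cup", "bed"], "kitchen"))
```

In a full system, these lookup tables would be learned or curated from interaction datasets, but the room-conditioned filtering logic is the same.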
References
Fernández Montenegro, J.M., Villarini, B., Angelopoulou, A., Kapetanios, E., Garcia-Rodriguez, J., Argyriou, V.: A survey of Alzheimer's disease early diagnosis methods for cognitive assessment. Sensors 20(24), 7292 (2020)
Garcia-Rodriguez, J., et al.: COMBAHO: a deep learning system for integrating brain injury patients in society. Pattern Recognit. Lett. 137, 80–90 (2020)
Puig, X., et al.: Watch-and-Help: a challenge for social perception and human-AI collaboration. arXiv preprint arXiv:2010.09890 (2020)
Mo, K., Guibas, L., Mukadam, M., Gupta, A., Tulsiani, S.: Where2act: From pixels to actions for articulated 3d objects. arXiv preprint arXiv:2101.02692 (2021)
Sun, Z., Liu, J., Ke, Q., Rahmani, H., Bennamoun, M., Wang, G.: Human action recognition from various data modalities: A review. arXiv preprint arXiv:2012.11866 (2020)
Redmon, J., Farhadi, A.: Yolov3: An incremental improvement. arXiv preprint arXiv:1804.02767 (2018)
Chollet, F., et al.: Keras. https://keras.io (2015)
Lin, T.Y., et al.: Microsoft COCO: common objects in context (2015)
Le, D.T., Uijlings, J., Bernardi, R.: TUHOI: Trento universal human object interaction dataset. In: Proceedings of the Third Workshop on Vision and Language, Dublin City University and the Association for Computational Linguistics, pp. 17–24, Dublin, Ireland (August 2014)
Dai, R., et al.: Toyota Smarthome Untrimmed: real-world untrimmed videos for activity detection. arXiv preprint arXiv:2010.14982 (2020)
Zhou, B., Lapedriza, A., Khosla, A., Oliva, A., Torralba, A.: Places: a 10 million image database for scene recognition. IEEE Trans. Pattern Anal. Mach. Intell. 40(6), 1452–1464 (2018)
Damen, D., et al.: Scaling egocentric vision: the EPIC-KITCHENS dataset. In: European Conference on Computer Vision (ECCV) (2018)
Acknowledgement
This work has been funded by the Spanish Government grant PID2019-104818RB-I00 for the MoDeaAS project, supported with FEDER funds. This work has also been supported by the Spanish national grants for PhD studies FPU17/00166, ACIF/2018/197 and UAFPU2019-13. Experiments were made possible by a generous hardware donation from NVIDIA.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Fernández, I.S.M., Oprea, S., Castro-Vargas, J.A., Martinez-Gonzalez, P., Garcia-Rodriguez, J. (2022). Estimating Context Aware Human-Object Interaction Using Deep Learning-Based Object Recognition Architectures. In: Sanjurjo González, H., Pastor López, I., García Bringas, P., Quintián, H., Corchado, E. (eds) 16th International Conference on Soft Computing Models in Industrial and Environmental Applications (SOCO 2021). SOCO 2021. Advances in Intelligent Systems and Computing, vol 1401. Springer, Cham. https://doi.org/10.1007/978-3-030-87869-6_41
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87868-9
Online ISBN: 978-3-030-87869-6
eBook Packages: Intelligent Technologies and Robotics (R0)