Estimating Context Aware Human-Object Interaction Using Deep Learning-Based Object Recognition Architectures

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1401)

Abstract

In this work, we propose an architecture for predicting plausible person-object interactions based on the objects visible in an image and on recognition of the room it shows. First, the system detects objects in the video using a popular framework named “YOLO” (You Only Look Once) and associates each object with its possible interactions. Then, using a convolutional neural network, our algorithm recognizes the room that appears in the image and uses it to filter the candidate interactions, keeping only the context-aware human-object interactions that are plausible in that room. The main purpose of this project is to help people with memory impairments perform daily activities. Many people have difficulty carrying out actions that come naturally to others. To assist them, we are interested in developing methods that remind them of actions they may have forgotten.
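
The filtering step described above can be illustrated with a minimal sketch. It assumes the object detector and the room classifier have already produced their labels, and it only shows how candidate interactions might be narrowed by room context. The lookup tables, the labels in them, and the function name `plausible_interactions` are hypothetical; the abstract does not list the actual object classes, rooms, or interactions used by the authors.

```python
from typing import Dict, List, Set

# Hypothetical table: interactions each detected object could afford.
OBJECT_INTERACTIONS: Dict[str, Set[str]] = {
    "cup": {"drink from", "wash"},
    "toothbrush": {"brush teeth"},
    "knife": {"cut food", "wash"},
}

# Hypothetical table: interactions considered plausible in each room.
ROOM_INTERACTIONS: Dict[str, Set[str]] = {
    "kitchen": {"drink from", "wash", "cut food"},
    "bathroom": {"brush teeth", "wash"},
}


def plausible_interactions(
    detected_objects: List[str], room: str
) -> Dict[str, List[str]]:
    """Keep, for each detected object, only the interactions allowed in `room`."""
    allowed = ROOM_INTERACTIONS.get(room, set())
    result: Dict[str, List[str]] = {}
    for obj in detected_objects:
        candidates = OBJECT_INTERACTIONS.get(obj, set())
        kept = sorted(candidates & allowed)  # set intersection, stable order
        if kept:  # drop objects with no plausible interaction in this room
            result[obj] = kept
    return result


if __name__ == "__main__":
    # Labels here stand in for the outputs of the detector and room classifier.
    print(plausible_interactions(["cup", "toothbrush"], "kitchen"))
```

In this sketch an object whose interactions do not fit the recognized room is simply omitted, and an unknown room yields no suggestions. A real system might instead rank interactions by detector and classifier confidence.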

Acknowledgement

This work has been funded by the Spanish Government PID2019-104818RB-I00 grant for the MoDeaAS project, supported with FEDER funds. This work has also been supported by Spanish national grants for PhD studies FPU17/00166, ACIF/2018/197 and UAFPU2019-13. Experiments were made possible by a generous hardware donation from NVIDIA.

Author information

Corresponding author

Correspondence to Jose Garcia-Rodriguez.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Fernández, I.S.M., Oprea, S., Castro-Vargas, J.A., Martinez-Gonzalez, P., Garcia-Rodriguez, J. (2022). Estimating Context Aware Human-Object Interaction Using Deep Learning-Based Object Recognition Architectures. In: Sanjurjo González, H., Pastor López, I., García Bringas, P., Quintián, H., Corchado, E. (eds) 16th International Conference on Soft Computing Models in Industrial and Environmental Applications (SOCO 2021). SOCO 2021. Advances in Intelligent Systems and Computing, vol 1401. Springer, Cham. https://doi.org/10.1007/978-3-030-87869-6_41
