
DL-DARE: Deep learning-based different activity recognition for the human–robot interaction environment

Original Article, published in Neural Computing and Applications

Abstract

This paper proposes a deep learning-based activity recognition system for the human–robot interaction (HRI) environment. Observations of the object state are acquired from a vision sensor in a real-time scenario. The activity recognition system examined in this paper comprises four activity classes: pour, rotate, drop object, and open bottle. An image processing unit processes the incoming images and predicts the performed activity using deep learning methods, so that the robot can execute the corresponding actions (sub-actions).
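
The pipeline the abstract describes is: camera frames in, one of four activity labels out, robot sub-actions triggered by the label. The article itself provides no code, so the following is a minimal sketch in Python/Keras of such a frame classifier, assuming an ImageNet-pretrained ResNet50 backbone with a small classification head. The backbone choice, input size, head architecture, and every identifier (CLASSES, build_model, predict_activity) are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a four-class activity classifier for HRI camera frames.
# All names and architectural choices here are assumptions, not from the paper.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.applications.resnet50 import preprocess_input

# Activity classes taken from the abstract.
CLASSES = ["pour", "rotate", "drop_object", "open_bottle"]

def build_model(num_classes: int = len(CLASSES)) -> tf.keras.Model:
    """ImageNet-pretrained ResNet50 backbone plus a small softmax head."""
    backbone = ResNet50(weights="imagenet", include_top=False,
                        input_shape=(224, 224, 3))
    backbone.trainable = False  # freeze for feature extraction; fine-tune later
    x = layers.GlobalAveragePooling2D()(backbone.output)
    x = layers.Dense(256, activation="relu")(x)
    out = layers.Dense(num_classes, activation="softmax")(x)
    return models.Model(backbone.input, out)

def predict_activity(model: tf.keras.Model, frame: np.ndarray) -> str:
    """Map one RGB camera frame (H, W, 3, uint8) to a predicted activity label."""
    img = tf.image.resize(frame, (224, 224)).numpy()
    batch = preprocess_input(img[np.newaxis].astype("float32"))
    probs = model.predict(batch, verbose=0)[0]
    return CLASSES[int(np.argmax(probs))]

if __name__ == "__main__":
    model = build_model()
    dummy_frame = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
    print(predict_activity(model, dummy_frame))  # untrained, so output is arbitrary
```

In an actual deployment, the classifier head would first be trained on labeled frames (for example, from the MIME dataset noted under Data availability), and the predicted label would then be dispatched to the robot's corresponding sub-action routine.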


Data availability

The datasets analyzed during the current study are available in the MIME Dataset repository: https://sites.google.com/view/mimedataset/dataset?authuser=0.


Acknowledgements

We gratefully acknowledge the support of the Department of Computer Science and Engineering, Thapar Institute of Engineering and Technology, Patiala, Punjab, for providing the facilities.

Author information


Corresponding author

Correspondence to Sachin Kansal.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Kansal, S., Jha, S. & Samal, P. DL-DARE: Deep learning-based different activity recognition for the human–robot interaction environment. Neural Comput & Applic 35, 12029–12037 (2023). https://doi.org/10.1007/s00521-023-08337-y

