Abstract
This paper proposes a deep learning-based activity recognition system for the human–robot interaction (HRI) environment. Observations of the object state are acquired from a vision sensor in real time. The activity recognition system examined in this paper comprises four activity classes: pour, rotate, drop object, and open bottle. An image processing unit processes the captured images and predicts the performed activity using deep learning methods, so that the robot can execute the actions (sub-actions) corresponding to the predicted activity.
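To make the pipeline concrete, the following is a minimal sketch of such a four-class activity classifier, assuming a pretrained ResNet50 backbone in Keras. The class labels come from the abstract; the choice of ResNet50, the input resolution, the classification head, and the `predict_activity` helper are illustrative assumptions rather than the authors' exact configuration.

```python
# Minimal sketch of the four-class activity classifier described in the
# abstract, assuming a Keras ResNet50 backbone. The class names come from
# the paper; input size, head layers, and helper names are illustrative.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.applications.resnet50 import preprocess_input

CLASSES = ["pour", "rotate", "drop_object", "open_bottle"]

# Pretrained ImageNet backbone, frozen so only the new head is trained.
backbone = ResNet50(weights="imagenet", include_top=False,
                    input_shape=(224, 224, 3), pooling="avg")
backbone.trainable = False

model = models.Sequential([
    backbone,
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(len(CLASSES), activation="softmax"),
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

def predict_activity(frame):
    """Classify one RGB camera frame (H x W x 3 uint8 array)."""
    img = tf.image.resize(frame, (224, 224)).numpy()
    x = preprocess_input(np.expand_dims(img, axis=0))
    probs = model.predict(x, verbose=0)[0]
    return CLASSES[int(np.argmax(probs))], float(probs.max())
```

Freezing the backbone and training only a small softmax head is a standard transfer-learning setup when the number of labeled activity frames is modest; the trained model can then be queried frame by frame, and the robot dispatches the sub-actions associated with the predicted class.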
Data availability
The datasets analyzed during the current study are available in the MIME Dataset repository: https://sites.google.com/view/mimedataset/dataset?authuser=0.
Acknowledgements
We gratefully acknowledge the support of the Computer Science and Engineering Department, Thapar Institute of Engineering and Technology, Patiala, Punjab, for providing the facilities used in this work.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Kansal, S., Jha, S. & Samal, P. DL-DARE: Deep learning-based different activity recognition for the human–robot interaction environment. Neural Comput & Applic 35, 12029–12037 (2023). https://doi.org/10.1007/s00521-023-08337-y