DOI: 10.1145/3591156.3591157
research-article

Human Activity Role Identification using Feature Vector and Encoding Techniques on Natural Language Sentences

Published: 16 June 2023

Abstract

Role identification can enhance activity recognition applications because it adds more information. Work on activity recognition and role identification is dominated by models that take images and videos as input, and the existing human activity datasets do not support role identification. This work therefore attempts to develop a novel Human Activity Role Identification Dataset and a novel computational recurrent model that takes textual data as input. Several feature vector generation methods (N-gram extraction, unique word extraction, and Word2Vec) encode the input data into feature vectors that describe the relationships between sequences of words. To determine the fundamental roles, three types of recurrent neural network (RNN, LSTM, and GRU) are trained on these feature vectors and evaluated with metrics such as precision, recall, and F1 score. The combination of LSTM with unique word extraction performs best, with an F1 score of 0.44, a precision of 0.36, and a recall of 0.58. This role identification work may help to bind roles to entities and objects in human activity recognition.
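The encoding steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the example sentences, the function names, the whitespace tokenizer, and the use of id 0 for both padding and unknown words are all assumptions made for illustration. The last helper applies the standard F1 formula to the reported precision and recall, which reproduces the reported F1 score after rounding.

```python
def tokenize(sentence):
    # Assumed tokenizer: lowercase, split on whitespace, strip simple punctuation.
    words = (w.strip(".,;:!?").lower() for w in sentence.split())
    return [w for w in words if w]


def ngrams(tokens, n):
    # N-gram extraction: every contiguous window of n tokens, in order.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_vocab(sentences):
    # Unique word extraction: map each distinct word to an integer id.
    # Id 0 is reserved here for padding and unknown words (an assumption).
    vocab = {}
    for sentence in sentences:
        for word in tokenize(sentence):
            vocab.setdefault(word, len(vocab) + 1)
    return vocab


def encode(sentence, vocab, max_len):
    # Fixed-length integer vector suitable as input to an RNN, LSTM, or GRU.
    ids = [vocab.get(w, 0) for w in tokenize(sentence)][:max_len]
    return ids + [0] * (max_len - len(ids))


def f1_score(precision, recall):
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical sentences, not taken from the paper's dataset.
sentences = ["The boy kicks the ball", "The girl catches the ball"]
vocab = build_vocab(sentences)

print(ngrams(tokenize(sentences[0]), 2))
print(encode(sentences[1], vocab, max_len=6))   # [1, 5, 6, 1, 4, 0]
print(round(f1_score(0.36, 0.58), 2))           # 0.44
```

Word2Vec, the third method listed, would replace the integer ids with dense vectors learned from a corpus; that step is omitted here because it needs a trained embedding model.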

      Published In

      IVSP '23: Proceedings of the 2023 5th International Conference on Image, Video and Signal Processing
      March 2023
      207 pages
      ISBN:9781450398381
      DOI:10.1145/3591156

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. Gated Recurrent Units
      2. Long Short Term Memory
      3. Named Entity Recognition
      4. Reciprocal Activities
      5. Recurrent Neural Networks
      6. Role Identification
      7. Word Embedding
      8. Word2Vec

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Conference

      IVSP 2023
