An individualized system of skeletal data-based CNN classifiers for action recognition in manufacturing assembly

Journal of Intelligent Manufacturing

Abstract

Real-time Action Recognition (ActRgn) of assembly workers can help manufacturers correct human mistakes in a timely manner and improve task performance. Yet, reliably recognizing worker actions in assembly is challenging because such actions are complex and fine-grained, and workers are heterogeneous. This paper proposes creating an individualized system of Convolutional Neural Networks (CNNs) for action recognition using human skeletal data. The system comprises six 1-channel CNN classifiers, each built with one unique posture-related feature vector extracted from the time-series skeletal data. The six classifiers are then adapted to any new worker through transfer learning and iterative boosting. After that, an individualized fusion method named Weighted Average of Selected Classifiers (WASC) integrates the adapted classifiers into an ActRgn system that outperforms its constituent classifiers. An algorithm for stream-data analysis further differentiates assembly actions from the background and corrects misclassifications based on the temporal relationship of actions in assembly. Compared to the CNN classifier built directly with the skeletal data, the proposed system improves the accuracy of action recognition by 28%, reaching 94% accuracy on the tested group of new workers. The study also lays a foundation for immediate extensions that adapt the ActRgn system to current workers performing new tasks and, subsequently, to new workers performing new tasks.
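
To make the fusion step concrete, the NumPy sketch below illustrates one plausible reading of a weighted average over a selected subset of per-feature classifiers. The abstract does not specify WASC's selection rule or weighting scheme, so the accuracy threshold, the accuracy-proportional weights, and all function and variable names here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def wasc_fuse(probs, val_acc, select_threshold=0.5):
    """Hypothetical sketch of Weighted Average of Selected Classifiers (WASC).

    probs   : array of shape (n_classifiers, n_samples, n_classes),
              softmax outputs of the six per-feature CNN classifiers.
    val_acc : array of shape (n_classifiers,), each classifier's validation
              accuracy on the new worker's adaptation data (assumed criterion).
    select_threshold : classifiers whose validation accuracy falls below this
              value are excluded from the fusion (assumed selection rule).
    """
    probs = np.asarray(probs, dtype=float)
    val_acc = np.asarray(val_acc, dtype=float)

    # Select the subset of classifiers considered reliable for this worker.
    selected = val_acc >= select_threshold
    if not selected.any():
        # Fall back to the single best classifier if none passes the threshold.
        selected = val_acc == val_acc.max()

    # Weight each selected classifier in proportion to its validation accuracy.
    weights = val_acc[selected] / val_acc[selected].sum()

    # Weighted average of the selected classifiers' class probabilities.
    fused = np.tensordot(weights, probs[selected], axes=1)  # (n_samples, n_classes)
    return fused.argmax(axis=1), fused


# Example: 6 classifiers, 4 frames, 10 action classes (random stand-in data).
rng = np.random.default_rng(0)
p = rng.dirichlet(np.ones(10), size=(6, 4))
labels, fused = wasc_fuse(p, val_acc=[0.91, 0.62, 0.88, 0.43, 0.79, 0.55])
print(labels)
```

In this reading, selection is individualized: classifiers that adapt poorly to a given worker are excluded, and the remaining ones contribute in proportion to how well they adapted to that worker.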



Acknowledgements

The authors received financial support from the National Science Foundation through Award CMMI-1646162. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

Author information

Corresponding author

Correspondence to Ruwen Qin.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Al-Amin, M., Qin, R., Moniruzzaman, M. et al. An individualized system of skeletal data-based CNN classifiers for action recognition in manufacturing assembly. J Intell Manuf 34, 633–649 (2023). https://doi.org/10.1007/s10845-021-01815-x

