Abstract
An aging society gives rise to the need for fall detection for the elderly. Interference from environmental noise and the loss of motion information keep fall detection challenging. In this work, we present a novel two-stream network, called Flow-pose Net (FP-Net), which integrates optical flow and human pose information to achieve robust and accurate fall detection in videos. Specifically, we use a human pose estimation model to detect the joints of the human body and design a GCN-based network to learn body appearance features from the human pose. For motion feature extraction, we estimate optical flow from raw videos and utilize a CNN-based network to learn rich motion features. Finally, the appearance and motion features are concatenated and fed into a classifier to perform fall classification. To the best of our knowledge, we are the first to combine optical flow and human pose to simultaneously extract motion and appearance features for fall detection. Extensive experiments on two popular datasets, URFD and Le2i, show that our FP-Net achieves state-of-the-art performance and high robustness.
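The two-stream design described above can be sketched in a few lines. The following is a minimal, hypothetical illustration only: the joint count, feature dimensions, identity adjacency, and all variable names are assumptions for exposition, not the paper's actual architecture or learned weights. Random arrays stand in for a pose estimator's joint coordinates and a flow CNN's feature maps.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Pose branch: one GCN layer over J body joints (illustrative sizes) ---
J, C_in, C_out = 17, 2, 32                 # joints, (x, y) coordinates, feature dim
A = np.eye(J)                              # adjacency with self-loops (skeleton edges omitted here)
D_inv_sqrt = np.diag(1.0 / np.sqrt(A.sum(axis=1)))
A_hat = D_inv_sqrt @ A @ D_inv_sqrt        # symmetrically normalized adjacency
X = rng.standard_normal((J, C_in))         # stands in for estimated joint coordinates
W_gcn = rng.standard_normal((C_in, C_out))
pose_feat = np.maximum(A_hat @ X @ W_gcn, 0).mean(axis=0)   # ReLU + joint pooling -> (C_out,)

# --- Flow branch: global pooling of CNN-style optical-flow features ---
H, W, C_flow = 8, 8, 64                    # spatial grid and channels after a flow CNN
flow_map = rng.standard_normal((H, W, C_flow))              # stands in for learned flow features
flow_feat = flow_map.mean(axis=(0, 1))                      # global average pooling -> (C_flow,)

# --- Fusion and binary classification (fall / no fall) ---
fused = np.concatenate([pose_feat, flow_feat])              # (C_out + C_flow,)
W_cls = rng.standard_normal((fused.size, 2))
logits = fused @ W_cls
prob_fall = np.exp(logits)[1] / np.exp(logits).sum()        # softmax probability of "fall"
```

The sketch mirrors the fusion strategy of the abstract: each branch produces a fixed-length feature vector, and the classifier operates on their concatenation rather than on either modality alone.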








Acknowledgements
This work was supported by the National Natural Science Foundation of China under Grant 62106177 and by the Central University Basic Research Fund of China (No. 2042020KF0016). The numerical calculations were supported by the supercomputing system of the Supercomputing Center of Wuhan University.
Cite this article
Fei, K., Wang, C., Zhang, J. et al. Flow-pose Net: an effective two-stream network for fall detection. Vis Comput 39, 2305–2320 (2023). https://doi.org/10.1007/s00371-022-02416-2