
Lightweight network for small target fall detection based on feature fusion and dynamic convolution

  • Research
  • Published in: Journal of Real-Time Image Processing

Abstract

Accurate and prompt detection of falls in the elderly is essential to building an artificial-intelligence-based fall detection system. However, current approaches have notable limitations, including poor performance in low-light conditions, missed detections of small targets, excessive parameter counts, and slow detection speed. This paper combines feature fusion, dynamic convolution, and the SCYLLA-IoU (SIoU) loss function to overcome these challenges. First, FasterNet is employed to balance lightweight design against accuracy. Second, a bi-directional cascaded feature pyramid network is introduced, incorporating a module that enhances feature representation and improves the perception of targets in dark images. Third, attention-based dynamic convolution is applied to improve perception and localization accuracy on small-object detection tasks. Finally, the SIoU loss function is adopted to accelerate convergence and improve target localization accuracy. Experimental results show that the improved model outperforms the original YOLOv5s, achieving a 6.6% increase in precision and a 15.3% gain in detection speed while reducing the parameter count by 24%. It also outperforms other networks, including Faster R-CNN, SSD, YOLOXs, and YOLOv7.
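The dynamic-convolution component follows the attention-over-kernels idea: several parallel convolution kernels are blended per input sample by softmax attention computed from globally pooled features. Below is a minimal PyTorch sketch of that idea; the class name DynamicConv2d and the hyperparameters (4 candidate kernels, reduction ratio 4) are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicConv2d(nn.Module):
    # Illustrative sketch only: K parallel kernels are blended per input
    # sample by softmax attention over globally pooled features.
    # num_kernels=4 and reduction=4 are assumed hyperparameters.
    def __init__(self, in_ch, out_ch, kernel_size=3, num_kernels=4, reduction=4):
        super().__init__()
        self.in_ch, self.out_ch, self.k = in_ch, out_ch, kernel_size
        # K candidate kernels and biases, mixed at run time
        self.weight = nn.Parameter(
            torch.randn(num_kernels, out_ch, in_ch, kernel_size, kernel_size) * 0.02)
        self.bias = nn.Parameter(torch.zeros(num_kernels, out_ch))
        # Squeeze-and-excitation-style attention over the K kernels
        hidden = max(in_ch // reduction, 4)
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(in_ch, hidden), nn.ReLU(inplace=True),
            nn.Linear(hidden, num_kernels))

    def forward(self, x):
        b, _, h, w = x.shape
        pi = F.softmax(self.attn(x), dim=1)                       # (B, K) mixing weights
        w_mix = torch.einsum('bk,koiyx->boiyx', pi, self.weight)  # per-sample kernels
        b_mix = torch.einsum('bk,ko->bo', pi, self.bias)          # per-sample biases
        # Grouped-convolution trick: fold the batch into groups so each
        # sample is convolved with its own mixed kernel in one conv2d call.
        x = x.reshape(1, b * self.in_ch, h, w)
        w_mix = w_mix.reshape(b * self.out_ch, self.in_ch, self.k, self.k)
        out = F.conv2d(x, w_mix, bias=b_mix.reshape(-1),
                       padding=self.k // 2, groups=b)
        return out.reshape(b, self.out_ch, h, w)

# Example: a 2-image batch of 64-channel 40x40 feature maps -> (2, 128, 40, 40)
layer = DynamicConv2d(64, 128)
print(layer(torch.randn(2, 64, 40, 40)).shape)

The grouped-convolution reshape applies a different mixed kernel to each sample in a single conv2d call, so the extra cost of dynamic mixing stays small relative to a static convolution of the same size.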




Data availability

The data that support the findings of this work are available from the corresponding author upon reasonable request.


Acknowledgements

This work was supported by the Open Project of the Key Laboratory of Healthy Freshwater Aquaculture, Ministry of Agriculture and Rural Affairs, Key Laboratory of Fish Health and Nutrition of Zhejiang Province (ZJK202204); the Six Talent Peaks high-level talent plan of Jiangsu Province (XYDXX-115); and Project 333 of Jiangsu Province.

Author information


Contributions

XB and QZ designed the research. QZ conducted the numerical and experimental validation and prepared the manuscript; SS verified the correctness of the experiments; SS and FL polished and revised the language of the paper. All authors took part in discussing the results. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Xu Bao.

Ethics declarations

Conflict of interest

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Zhang, Q., Bao, X., Sun, S. et al. Lightweight network for small target fall detection based on feature fusion and dynamic convolution. J Real-Time Image Proc 21, 17 (2024). https://doi.org/10.1007/s11554-023-01397-2

