Lightweight network for small target fall detection based on feature fusion and dynamic convolution

Zhang, Qihao; Bao, Xu; Sun, Shantong; Lin, Feng

doi:10.1007/s11554-023-01397-2

Research
Published: 05 January 2024

Journal of Real-Time Image Processing, Volume 21, article number 17 (2024)

Qihao Zhang¹, Xu Bao¹,², Shantong Sun¹ & Feng Lin³

Abstract

Accurate and prompt detection of falls among the elderly is central to building a fall detection system based on artificial intelligence. However, existing methods have several limitations: poor performance in low-light conditions, missed detections of small targets, excessive parameter counts, and slow detection speed. This paper combines feature fusion, dynamic convolution, and the SCYLLA-IoU (SIoU) loss function to address these challenges. First, FasterNet is employed to balance lightweight design and accuracy. Second, a bi-directional cascaded feature pyramid network is introduced, incorporating a module that enhances feature representation and improves the perception of targets in dark images. Third, attention-based dynamic convolution is applied to improve perception and localization accuracy for small objects. Finally, the SIoU loss function is introduced to speed up convergence and improve target localization accuracy. Experimental results show that the improved model outperforms the original YOLOv5s model, achieving a 6.6% increase in precision and a 15.3% increase in detection speed while reducing the parameter count by 24%. It also outperforms other networks, including Faster R-CNN, SSD, YOLOXs, and YOLOv7.
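
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea behind attention-based dynamic convolution, not the authors' model. Several candidate kernels are mixed with input-dependent softmax weights, and the mixed kernel is then applied as an ordinary convolution. The function names, the single-channel setting, and the one-layer scoring function (a stand-in for a small attention branch) are all assumptions made for illustration.

```python
import math


def softmax(scores, temperature=1.0):
    """Numerically stable softmax; a higher temperature flattens the weights."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def attention_scores(image, weights, biases):
    """Hypothetical scoring branch: global average pooling, then one linear map.

    Real attention branches are usually small multi-layer networks; this
    single layer is a simplification for illustration only.
    """
    pixels = [p for row in image for p in row]
    pooled = sum(pixels) / len(pixels)
    return [w * pooled + b for w, b in zip(weights, biases)]


def aggregate_kernels(kernels, attention):
    """Weighted sum of K same-shaped 2-D kernels, one weight per kernel."""
    rows, cols = len(kernels[0]), len(kernels[0][0])
    return [
        [sum(a * k[r][c] for a, k in zip(attention, kernels)) for c in range(cols)]
        for r in range(rows)
    ]


def conv2d_valid(image, kernel):
    """Single-channel, stride-1 cross-correlation without padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + u][j + v] * kernel[u][v] for u in range(kh) for v in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def dynamic_conv2d(image, kernels, weights, biases, temperature=1.0):
    """Mix the candidate kernels per input, then convolve with the mixed kernel."""
    scores = attention_scores(image, weights, biases)
    attention = softmax(scores, temperature)
    mixed = aggregate_kernels(kernels, attention)
    return conv2d_valid(image, mixed), attention


if __name__ == "__main__":
    image = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
    kernels = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]]
    output, attention = dynamic_conv2d(image, kernels, weights=[0.0, 0.0], biases=[0.0, 0.0])
    print(attention)
    print(output)
```

Because the attention weights sum to one, the mixed kernel stays on the same scale as the candidate kernels. The extra cost over a static convolution is the scoring branch and the kernel mixing, which is small compared with the convolution itself.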

Data availability

The data that support the findings of this work are available from the corresponding author upon reasonable request.

Acknowledgements

This work was supported by the Open Project of the Key Laboratory of Healthy Freshwater Aquaculture, Ministry of Agriculture and Rural Affairs, Key Laboratory of Fish Health and Nutrition of Zhejiang Province (ZJK202204), the Six Talent Peak High-Level Talent Plan Projects of Jiangsu Province (XYDXX-115), and Project 333 of Jiangsu Province.

Author information

Authors and Affiliations

Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, 212013, Jiangsu, China
Qihao Zhang, Xu Bao & Shantong Sun
Jiangsu Key Laboratory of Security Technology for Industrial Cyberspace, Zhenjiang, 212013, Jiangsu, China
Xu Bao
Zhejiang Institute of Freshwater Fisheries, Huzhou, 313001, China
Feng Lin

Contributions

XB and QZ designed the research. QZ conducted the numerical and experimental validation and prepared the manuscript; SS verified the correctness of the experiments; and SS and FL polished and revised the language of the paper. All authors took part in discussing the results, and all authors read and approved the final manuscript.

Corresponding author

Correspondence to Xu Bao.

Ethics declarations

Conflict of interest

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhang, Q., Bao, X., Sun, S. et al. Lightweight network for small target fall detection based on feature fusion and dynamic convolution. J Real-Time Image Proc 21, 17 (2024). https://doi.org/10.1007/s11554-023-01397-2

Received: 24 July 2023
Accepted: 06 December 2023
Published: 05 January 2024
DOI: https://doi.org/10.1007/s11554-023-01397-2
