Abstract
Accurate and prompt detection of falls among the elderly is essential to any fall detection system based on artificial intelligence. However, existing methods have several limitations, including poor performance in low-light conditions, missed detections of small targets, excessive parameter counts, and slow detection speed. This paper combines feature fusion, dynamic convolution, and the SCYLLA-IoU (SIoU) loss function to overcome these challenges. First, FasterNet is employed to balance a lightweight design against accuracy. Second, a bi-directional cascaded feature pyramid network is introduced, incorporating a module that enhances feature representation and improves the perception of targets in dark images. Furthermore, attention-based dynamic convolution is adopted to improve perception and localization accuracy for small objects. Finally, the SIoU loss function is introduced to accelerate convergence and improve target localization accuracy. Experimental results demonstrate that the improved model outperforms the original YOLOv5s model, achieving a 6.6% increase in precision and a 15.3% increase in detection speed while reducing the parameter count by 24%. It also outperforms other networks, including Faster R-CNN, SSD, YOLOXs, and YOLOv7.
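As a rough illustration of the attention-based dynamic convolution mentioned above, the following minimal sketch aggregates several candidate kernels with input-dependent softmax weights and then applies a single convolution, in the spirit of the kernel-attention formulation of Chen et al. (2020). It is not the implementation used in this paper: the function names, the single-channel setting, and the one-layer attention head with parameters `attn_weights` and `attn_bias` are assumptions made only for illustration.

```python
import math

def softmax(scores, temperature=1.0):
    """Numerically stable softmax with a temperature."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate_kernels(kernels, attention):
    """Weighted sum of K same-shaped 2D kernels: W = sum_k pi_k * W_k."""
    rows, cols = len(kernels[0]), len(kernels[0][0])
    return [[sum(a * k[r][c] for a, k in zip(attention, kernels))
             for c in range(cols)] for r in range(rows)]

def conv2d_valid(image, kernel):
    """Single-channel 'valid' cross-correlation with stride 1."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[i + u][j + v] * kernel[u][v]
                 for u in range(kh) for v in range(kw))
             for j in range(out_w)] for i in range(out_h)]

def dynamic_conv2d(image, kernels, attn_weights, attn_bias, temperature=1.0):
    """Input-dependent convolution: attention over kernels, then one conv."""
    # Squeeze step: global average pooling reduces the image to one scalar.
    pooled = sum(sum(row) for row in image) / (len(image) * len(image[0]))
    # Hypothetical one-layer attention head producing one score per kernel
    # (a simplification assumed here, not taken from the paper).
    scores = [w * pooled + b for w, b in zip(attn_weights, attn_bias)]
    attention = softmax(scores, temperature)
    return conv2d_valid(image, aggregate_kernels(kernels, attention)), attention
```

Because the kernels are merged before the convolution is applied, the cost of the convolution itself does not grow with the number of candidate kernels; only the small attention computation and the kernel aggregation are added.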
Data availability
The data that support the findings of this work are available from the corresponding author upon reasonable request.
Acknowledgements
This work is supported by the Open Project of the Key Laboratory of Healthy Freshwater Aquaculture, Ministry of Agriculture and Rural Affairs, Key Laboratory of Fish Health and Nutrition of Zhejiang Province (ZJK202204), the Six Talent Peak High-Level Talent Plan Projects of Jiangsu Province (XYDXX-115), and Project 333 of Jiangsu Province.
Author information
Contributions
XB and QZ designed the research. QZ conducted the numerical and experimental validations and prepared the manuscript. SS verified the correctness of the experiments, and SS and FL polished and revised the language of the paper. All authors took part in discussing the results, and all authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no competing interests.
About this article
Cite this article
Zhang, Q., Bao, X., Sun, S. et al. Lightweight network for small target fall detection based on feature fusion and dynamic convolution. J Real-Time Image Proc 21, 17 (2024). https://doi.org/10.1007/s11554-023-01397-2
DOI: https://doi.org/10.1007/s11554-023-01397-2