
Real-time pedestrian recognition model on edge device using infrared vision system

  • Research
  • Published in: Journal of Real-Time Image Processing

Abstract

To meet the lightweight and real-time requirements of infrared pedestrian detection on edge devices, this paper proposes an improved infrared pedestrian recognition model based on You Only Look Once v5 nano (YOLOv5n). The model reduces its parameter count through a lightweight design that introduces Ghost Convolution and SlimNeck modules. To further improve real-time performance, the Performance-aware Approximation of Global Channel Pruning (PAGCP) strategy is applied, which removes additional parameters and increases inference speed. Experimental analysis shows that the model strikes a balance between speed and accuracy: compared with the original model, the number of parameters decreases by 1.52 M, running speed increases by 21%, and mAP@0.50 drops by only 0.4%. Inference on a Jetson Nano takes 0.12 s per image. The proposed model maintains high detection accuracy under lightweight and real-time constraints, providing technical support for deploying such models on edge devices in scenarios such as autonomous driving.
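For readers unfamiliar with the Ghost Convolution block mentioned above, the following is a minimal PyTorch sketch of the idea from GhostNet [5]: an ordinary convolution produces only a subset of the output channels, and a cheap depthwise convolution derives the remaining "ghost" channels from them. The class and parameter names are illustrative assumptions and do not reproduce the authors' implementation.

# Hypothetical sketch of a Ghost Convolution block (after Han et al. [5]).
# Illustrative only; module and parameter names are assumptions.
import math
import torch
import torch.nn as nn


class GhostConv(nn.Module):
    """Produce part of the output channels with a cheap depthwise convolution."""

    def __init__(self, in_channels, out_channels, kernel_size=1,
                 ratio=2, dw_size=3, stride=1):
        super().__init__()
        self.out_channels = out_channels
        init_channels = math.ceil(out_channels / ratio)   # "intrinsic" features
        cheap_channels = init_channels * (ratio - 1)       # "ghost" features

        # Primary convolution: ordinary conv producing the intrinsic features.
        self.primary = nn.Sequential(
            nn.Conv2d(in_channels, init_channels, kernel_size, stride,
                      kernel_size // 2, bias=False),
            nn.BatchNorm2d(init_channels),
            nn.SiLU(inplace=True),
        )
        # Cheap operation: depthwise conv that derives the ghost features
        # from the intrinsic ones at a fraction of the cost.
        self.cheap = nn.Sequential(
            nn.Conv2d(init_channels, cheap_channels, dw_size, 1,
                      dw_size // 2, groups=init_channels, bias=False),
            nn.BatchNorm2d(cheap_channels),
            nn.SiLU(inplace=True),
        )

    def forward(self, x):
        y1 = self.primary(x)
        y2 = self.cheap(y1)
        # Concatenate intrinsic and ghost features, trim to the requested width.
        return torch.cat([y1, y2], dim=1)[:, :self.out_channels]


if __name__ == "__main__":
    x = torch.randn(1, 64, 80, 80)              # e.g. one infrared feature map
    block = GhostConv(64, 128, kernel_size=1)
    print(block(x).shape)                       # torch.Size([1, 128, 80, 80])

Replacing standard convolutions with blocks of this kind is one way such a design reduces parameter count; the PAGCP pruning step [19] then removes redundant channels from the trained network.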

Data Availability

The dataset used in the experiments is publicly available via the original publication cited as reference [8].

References

  1. Arthur, D., Vassilvitskii, S.: k-means++: the advantages of careful seeding. In: ACM-SIAM Symposium on Discrete Algorithms (2007). https://doi.org/10.1145/1283383.1283494

  2. Bin, Z., Chunping, W., Qiang, F., Yichao, C.: Multi-scale infrared pedestrian detection based on deep attention mechanism. Acta Opt. Sin. 40(05), 47–58 (2020). https://doi.org/10.3788/AOS202040.0504001

  3. Chen, K., Wang, J., Pang, J., Cao, Y., Xiong, Y., Li, X., Sun, S., Feng, W., Liu, Z., Xu, J., Zhang, Z., Cheng, D., Zhu, C., Cheng, T., Zhao, Q., Li, B., Lu, X., Zhu, R., Wu, Y., Dai, J., Wang, J., Shi, J., Ouyang, W., Loy, C.C., Lin, D.: Mmdetection: open mmlab detection toolbox and benchmark. CoRR abs/1906.07155 (2019). https://doi.org/10.48550/arXiv.1906.07155

  4. Girshick, R.: Fast r-cnn. In: 2015 IEEE International Conference on Computer Vision (ICCV), pp. 1440–1448 (2015). https://doi.org/10.1109/ICCV.2015.169

  5. Han, K., Wang, Y., Tian, Q., Guo, J., Xu, C., Xu, C.: Ghostnet: more features from cheap operations. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1580–1589 (2020). https://doi.org/10.1109/CVPR42600.2020.00165

  6. Hao, S., Gao, S., Ma, X., An, B., He, T.: Anchor-free infrared pedestrian detection based on cross-scale feature fusion and hierarchical attention mechanism. Infrared Phys. Technol. 131, 104660 (2023). https://doi.org/10.1016/j.infrared.2023.104660

  7. He, K., Zhang, X., Ren, S., Sun, J.: Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans. Pattern Anal. Mach. Intell. 37(9), 1904–1916 (2015). https://doi.org/10.1109/TPAMI.2015.2389824

  8. Jia, X., Zhu, C., Li, M., Tang, W., Zhou, W.: Llvip: a visible-infrared paired dataset for low-light vision. In: 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), pp. 3489–3497 (2021). https://doi.org/10.1109/ICCVW54120.2021.00389

  9. Li, H., Li, J., Wei, H., Liu, Z., Zhan, Z., Ren, Q.: Slim neck by gsconv: a lightweight-design for real-time detector architectures. J. Real-Time Image Proc. 21(3), 62 (2024). https://doi.org/10.1007/s11554-024-01436-6

  10. Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.Y., Berg, A.C.: Ssd: single shot multibox detector. In: Computer Vision-ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11–14, 2016, Proceedings, Part I 14. Springer, pp. 21–37 (2016)

  11. Liu, S., Qi, L., Qin, H., Shi, J., Jia, J.: Path aggregation network for instance segmentation (2018). https://doi.org/10.48550/arXiv.1803.01534

  12. Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779–788 (2016). https://doi.org/10.1109/cvpr.2016.91

  13. Ren, S., He, K., Girshick, R., Sun, J.: Faster r-cnn: towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 39(6), 1137–1149 (2017). https://doi.org/10.1109/TPAMI.2016.2577031

  14. Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: Grad-cam: visual explanations from deep networks via gradient-based localization. In: 2017 IEEE International Conference on Computer Vision (ICCV), pp. 618–626 (2017). https://doi.org/10.1109/ICCV.2017.74

  15. Tian, Z., Chu, X., Wang, X., Wei, X., Shen, C.: Fully convolutional one-stage 3d object detection on lidar range images. Adv. Neural Inf. Process. Syst. 35, 34899–34911 (2022). https://doi.org/10.48550/arXiv.2205.13764

  16. Wang, C.Y., Bochkovskiy, A., Liao, H.Y.M.: Yolov7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 7464–7475 (2023). https://doi.org/10.48550/arXiv.2207.02696

  17. Wang, A., Chen, H., Liu, L., Chen, K., Lin, Z., Han, J., Ding, G.: Yolov10: real-time end-to-end object detection (2024). https://doi.org/10.48550/arXiv.2405.14458

  18. Wang, C.Y., Yeh, I.H., Liao, H.Y.M.: Yolov9: learning what you want to learn using programmable gradient information (2024). https://doi.org/10.48550/arXiv.2402.13616

  19. Ye, H., Zhang, B., Chen, T., Fan, J., Wang, B.: Performance-aware approximation of global channel pruning for multitask cnns (2023). https://doi.org/10.48550/arXiv.2303.11923

  20. Yinhui, Z., Kai, J., Zifen, H., Guangchen, C.: Attention guided multi-scale infrared real-time detection of pedestrian and vehicle. Infrared Laser Eng. 53(05), 237–247 (2024)

  21. Zhuang, M., Yong, Z., Ruimin, C., Weihua, L.: Method for fast detection of infrared targets based on key points. Acta Opt. Sin. 40(23), 136–144 (2020). https://doi.org/10.3788/AOS202040.2312006

  22. Zifen, H., Guangchen, C., Junsong, C., Yinhui, Z.: Multiscale feature fusion lightweight real-time infrared pedestrian detection at night. Chin. J. Lasers 49(17), 130–139 (2022). https://doi.org/10.3788/CL202249.1709002

Acknowledgements

This work was supported by the Hebei Province Graduate Student Innovation Ability Training Funding Project (Grant CXZZSS2024163) and the Key Research and Development Projects in Hebei Province (Grant 20310103D). The authors thank Ms. Zhang for her help in borrowing the laboratory equipment.

Author information

Contributions

Li Liu: Writing - Review & Editing, Conceptualization, Project Administration, Funding Acquisition, Resources, Supervision. Kaiye Huang: Writing - Original Draft, Writing - Review & Editing, Methodology, Conceptualization. Yujian Li: Software, Validation, Visualization, Data Curation, Methodology, Writing - Original Draft. Chuxia Zhang: Data Curation, Writing - Review & Editing. Shuo Zhang: Data Curation, Writing - Review & Editing. Zhengyu Hu: Software.

Corresponding author

Correspondence to Kaiye Huang.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liu, L., Huang, K., Li, Y. et al. Real-time pedestrian recognition model on edge device using infrared vision system. J Real-Time Image Proc 22, 27 (2025). https://doi.org/10.1007/s11554-024-01608-4
