TPS-YOLO: The Efficient Tiny Person Detection Network Based on Improved YOLOv8 and Model Pruning

Yao, Li; Huang, Qianni; Wan, Yan

doi:10.1007/978-981-96-2071-5_18

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15523)

Included in the following conference series:

International Conference on Multimedia Modeling

Abstract

Tiny person detection in long-range scenes is a popular and challenging task. Current person detectors have two major issues. First, they perform poorly on tiny and heavily occluded persons. Second, they are computationally intensive and have large model sizes, which makes them difficult to deploy on resource-limited devices. To address these issues, we propose TPS-YOLO. Based on YOLOv8, we reconstruct the network structure by introducing the shallow P2 features into the feature fusion layers, which helps retain more of the spatial information that is important for tiny person detection. We design a fine-grained feature extraction module, SPDCA, to replace the standard convolution layer in the backbone network and enhance the network’s feature representation. In the feature fusion network, we fuse multi-scale features with a weighted fusion method that introduces learnable weights to learn the importance of different input features. We propose a lightweight module named C2f_Efficient, which integrates Depthwise Separable Convolution (DSC) to reduce the number of model parameters. Furthermore, we apply a model pruning method to further reduce the model’s computational complexity. Experiments on the Tinypersonv2 and VisDrone-person datasets show that TPS-YOLO achieves satisfactory performance in terms of both efficiency and accuracy and is advantageous in terms of model size.
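
As a rough illustration of two ideas named in the abstract, the sketch below counts the weights of a standard convolution against a depthwise separable one, and fuses feature vectors with normalized non-negative weights. It is not the authors' implementation: the function names are hypothetical, the fusion rule is assumed to follow the fast normalized fusion of EfficientDet (the abstract only says the weights are learnable), and the weights here are fixed numbers rather than trained parameters.

```python
# Illustrative sketch only; names and the fusion rule are assumptions,
# not taken from the TPS-YOLO paper.

def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def dsc_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise separable convolution (bias omitted):
    one k x k filter per input channel, then a 1 x 1 pointwise convolution."""
    return c_in * k * k + c_in * c_out


def weighted_fusion(features, weights, eps=1e-4):
    """Fuse equally sized feature vectors with normalized weights.

    Assumption: fast normalized fusion as described for EfficientDet.
    In a real network the weights would be learnable parameters.
    """
    clipped = [max(w, 0.0) for w in weights]  # keep weights non-negative
    total = sum(clipped) + eps                # eps avoids division by zero
    return [
        sum(w * f[i] for w, f in zip(clipped, features)) / total
        for i in range(len(features[0]))
    ]


if __name__ == "__main__":
    print(conv_params(128, 128, 3))  # 147456
    print(dsc_params(128, 128, 3))   # 17536
    print(weighted_fusion([[1.0, 2.0], [3.0, 4.0]], [1.0, 1.0]))
```

For a 3 x 3 layer with 128 input and 128 output channels, the separable form needs roughly one eighth of the weights, which is the kind of saving a DSC-based module such as C2f_Efficient would rely on.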

Author information

Authors and Affiliations

School of Computer Science and Technology, Donghua University, Shanghai, 201620, China
Li Yao, Qianni Huang & Yan Wan

Corresponding author

Correspondence to Qianni Huang.

Editor information

Editors and Affiliations

Nagoya University, Nagoya, Japan
Ichiro Ide
Centre of Research and Technology, Thermi, Greece
Ioannis Kompatsiaris
Chinese Academy of Sciences, Beijing, China
Changsheng Xu
The University of Electro-Communications, Tokyo, Japan
Keiji Yanai
National Cheng Kung University, Tainan City, Taiwan
Wei-Ta Chu
Mukogawa Women’s University, Nishinomiya, Japan
Naoko Nitta
Simula, Oslo, Norway
Michael Riegler
The University of Tokyo, Tokyo, Japan
Toshihiko Yamasaki

About this paper

Cite this paper

Yao, L., Huang, Q., Wan, Y. (2025). TPS-YOLO: The Efficient Tiny Person Detection Network Based on Improved YOLOv8 and Model Pruning. In: Ide, I., et al. MultiMedia Modeling. MMM 2025. Lecture Notes in Computer Science, vol 15523. Springer, Singapore. https://doi.org/10.1007/978-981-96-2071-5_18

DOI: https://doi.org/10.1007/978-981-96-2071-5_18
Published: 02 January 2025
Publisher Name: Springer, Singapore
Print ISBN: 978-981-96-2070-8
Online ISBN: 978-981-96-2071-5
eBook Packages: Computer Science, Computer Science (R0)
