
Tiny FCOS: a Lightweight Anchor-Free Object Detection Algorithm for Mobile Scenarios

Published in: Mobile Networks and Applications

Abstract

Many mobile vision applications require real-time object detection, for example real-world road-condition detection. Real-time detection demands a lightweight model that can process detections efficiently. Recent studies have shown the potential of the anchor-free FCOS detector to increase detection capability. However, several severe issues prevent FCOS from being directly employed for real-time object detection: the complexity of the model, its large number of parameters and computations, and its high memory usage. To address these issues, we design a lightweight FCOS-based model for real-time object detection, named Tiny FCOS, with three distinguishing characteristics: (1) a lightweight backbone network, which efficiently reduces the model's weight; (2) a standardized dilated convolution group, which forms the structure of the FPN and efficiently reduces the gridding effect; (3) an FCN prediction branch that, while conceptually simple, detects objects with a single convolution block rather than a stack of four 3 × 3 convolutions. Experiments on two datasets demonstrate that Tiny FCOS reaches 62.4% mAP on PASCAL VOC, a 4% improvement over Tiny YOLOv3, while on KITTI its parameter count and computation fall to only 27.8% and 20.3% of FCOS, respectively, and its speed is 6 times that of FCOS. Furthermore, Tiny FCOS significantly outperforms existing methods and provides a new solution to real-time object detection problems.
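The gridding effect mentioned in the abstract can be illustrated with a small 1-D sketch (our own illustration, not code from the paper). A kernel-size-3 convolution with dilation d samples input offsets {-d, 0, d}; stacking layers sums one offset per layer. With a uniform dilation rate the stack can only reach a sparse grid of input positions, while hybrid-dilated-convolution-style increasing rates cover every position:

```python
from itertools import product

def reachable_offsets(dilations):
    """Return the sorted set of 1-D input offsets reachable by a stack of
    kernel-size-3 dilated convolutions with the given dilation rates.
    Each layer contributes one offset from {-d, 0, d}."""
    offsets = set()
    for combo in product(*[(-d, 0, d) for d in dilations]):
        offsets.add(sum(combo))
    return sorted(offsets)

# Uniform dilation (2, 2, 2): only even offsets are reachable -> gridding.
print(reachable_offsets([2, 2, 2]))  # [-6, -4, -2, 0, 2, 4, 6]

# HDC-style increasing rates (1, 2, 3): every offset in [-6, 6] is covered,
# with the same maximum receptive field and no gaps.
print(reachable_offsets([1, 2, 3]))
```

The specific dilation rates here are illustrative; the point is that a "standardized" group of varying rates fills in the holes that repeated identical rates leave behind.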



Acknowledgments

This work was supported by the National Natural Science Foundation of China under Grant 62072255. We would also like to thank the reviewers, whose comments helped us improve the quality of this paper.

Author information

Corresponding author

Correspondence to Honghao Gao.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Xu, X., Liang, W., Zhao, J. et al. Tiny FCOS: a Lightweight Anchor-Free Object Detection Algorithm for Mobile Scenarios. Mobile Netw Appl 26, 2219–2229 (2021). https://doi.org/10.1007/s11036-021-01845-y

