Abstract
Many mobile vision applications, such as real-world road condition detection, require objects to be detected in real time. Real-time object detection demands a lightweight model, that is, a model able to process detections efficiently in real time. Recent studies have shown that the anchor-free detector FCOS has the potential to increase detection capacity. However, several severe issues prevent FCOS from being applied directly to real-time object detection, including the complexity of the model, its large number of parameters and computations, and its high memory usage. To address these issues, we design a lightweight FCOS-based model for real-time object detection, named Tiny FCOS, with three distinctive characteristics: (1) a lightweight backbone network, which efficiently reduces the size of the model; (2) a standardized dilated convolution group, which is used to construct the FPN structure and efficiently reduces the impact of the gridding effect; and (3) an FCN prediction branch that, while conceptually simple, detects objects with only one convolution block rather than a stack of four convolutions with 3 × 3 kernels. Experiments on two datasets demonstrate that the mAP of Tiny FCOS reaches 62.4% on the PASCAL VOC dataset, an improvement of 4% over Tiny YOLOv3, while the number of parameters and the amount of computation fall to only 27.8% and 20.3% of those of FCOS, respectively, and the speed is 6 times that of FCOS on the KITTI dataset. Furthermore, Tiny FCOS significantly outperforms existing methods and provides a new solution to real-time object detection problems.
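The gridding effect mentioned in characteristic (2) can be illustrated with a small one-dimensional sketch. The abstract does not state which dilation rates Tiny FCOS uses, so the rates below (2, 2, 2 versus 1, 2, 5) and the helper names `covered_offsets` and `has_gaps` are assumptions chosen only for illustration. The code tracks which input positions can influence a single output position after stacking dilated convolutions with a 3-tap kernel.

```python
def covered_offsets(dilations, kernel_size=3):
    """Return the input offsets that influence one output position
    after stacking 1-D convolutions with the given dilation rates."""
    half = kernel_size // 2
    offsets = {0}
    for d in dilations:
        # Each layer samples its input at -d, 0, +d (for a 3-tap kernel).
        taps = [k * d for k in range(-half, half + 1)]
        offsets = {o + t for o in offsets for t in taps}
    return offsets


def has_gaps(offsets):
    """True if some position inside the receptive field is never sampled."""
    lo, hi = min(offsets), max(offsets)
    return len(offsets) < hi - lo + 1


# Hypothetical rate choices, not taken from the paper.
same_rate = covered_offsets([2, 2, 2])   # only even offsets are reached
mixed_rate = covered_offsets([1, 2, 5])  # every offset in the field is reached

print(sorted(same_rate), has_gaps(same_rate))
print(sorted(mixed_rate), has_gaps(mixed_rate))
```

With a repeated rate the stack samples a sparse grid and skips neighbouring positions, which is the gridding effect; mixing rates fills the receptive field without holes. Whether the paper's standardized group follows this exact pattern is not stated in the abstract.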
Acknowledgments
This work was supported by the National Natural Science Foundation of China under Grant 62072255. We would also like to thank the reviewers for their comments, which helped us improve the quality of this paper.
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Xu, X., Liang, W., Zhao, J. et al. Tiny FCOS: a Lightweight Anchor-Free Object Detection Algorithm for Mobile Scenarios. Mobile Netw Appl 26, 2219–2229 (2021). https://doi.org/10.1007/s11036-021-01845-y
DOI: https://doi.org/10.1007/s11036-021-01845-y