ABSTRACT
Detecting small objects is a significant challenge in computer vision because such objects have low resolution and fuzzy feature representations. Although one-stage detection techniques alleviate the problem caused by scale differences to some extent, they also retain redundant features, which wastes resources and slows processing. This research first examines the SSD algorithm and then proposes an improved framework, Lite FPN-SSD (Lite Single Shot Multibox Detector with Adapting Feature Pyramid Network), to address its weaknesses. Lite FPN-SSD builds on the popular FPN and SSD architectures to create a learnable fusion scheme that controls the feature information that deep layers deliver to shallow layers. Its lightweight design, with only a minimal increase in parameters, keeps it efficient enough for real-time applications. Extensive experiments on the VOC, VEDAI and SOHAS datasets demonstrate strong results for the proposed models in comparison with the original SSD and its variants. In particular, with an addition of only 2 million parameters, the proposed model achieves a mean average precision (mAP) of 78.36% on the VOC dataset, which is close to another architecture that achieved a 78.40% mAP but requires more than 2.6.
Index Terms
- Lite FPN_SSD: A Reconfiguration SSD with Adapting Feature Pyramid Network Scheme for Small Object Detection