Object Detection Using Clustering Algorithm Adaptive Searching Regions in Aerial Images

Wang, Yi; Yang, Youlong; Zhao, Xi

doi:10.1007/978-3-030-66823-5_39

Object Detection Using Clustering Algorithm Adaptive Searching Regions in Aerial Images

Yi Wang¹⁰,
Youlong Yang¹⁰ &
Xi Zhao¹⁰

Conference paper
First Online: 03 January 2021

3631 Accesses
22 Citations

Part of the book series: Lecture Notes in Computer Science ((LNIP,volume 12538))

Abstract

Aerial images are increasingly used for critical tasks, such as traffic monitoring, pedestrian tracking, and infrastructure inspection. However, aerial images have the following main challenges: 1) small objects with non-uniform distribution; 2) the large difference in object size. In this paper, we propose a new network architecture, Cluster Region Estimation Network (CRENet), to solve these challenges. CRENet uses a clustering algorithm to search cluster regions containing dense objects, which makes the detector focus on these regions to reduce background interference and improve detection efficiency. However, not every cluster region can bring precision gain, so each cluster region difficulty score is calculated to mine the difficult region and eliminate the simple cluster region, which can speed up the detection. Then, a Gaussian scaling function(GSF) is used to scale the difficult cluster region to reduce the difference of object size. Our experiments show that CRENet achieves better performance than previous approaches on the VisDrone dataset. Our best model achieved 4.3\(\%\) improvement on the VisDrone dataset.

This is a preview of subscription content, log in via an institution.

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 84.99; Price excludes VAT (USA)

Softcover Book: USD 109.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

Notes

1.
The ROI here is different from Faster RCNN [29]. The ROI of this paper contains not just one object but multiple objects of interest, and it is used to represent the region with dense objects.

References

Bochkovskiy, A., Wang, C.Y., Liao, H.Y.M.: YOLOv4: optimal speed and accuracy of object detection (2020)
Google Scholar
Bodla, N., Singh, B., Chellappa, R., Davis, L.S.: Soft-NMS - improving object detection with one line of code. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), October 2017
Google Scholar
Dai, J., et al.: Deformable convolutional networks. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), October 2017
Google Scholar
Deng, J., Dong, W., Socher, R., Li, L., Kai Li, Li Fei-Fei: ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255 (2009)
Google Scholar
Ding, J., Xue, N., Long, Y., Xia, G.S., Lu, Q.: Learning RoI transformer for detecting oriented objects in aerial images (2018)
Google Scholar
Everingham, M., Eslami, S.M.A., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A.: The Pascal visual object classes challenge: a retrospective. Int. J. Comput. Vis. 111(1), 98–136 (2015)
Article Google Scholar
Fu, C.Y., Liu, W., Ranga, A., Tyagi, A., Berg, A.C.: DSSD : deconvolutional single shot detector (2017)
Google Scholar
Gao, M., Yu, R., Li, A., Morariu, V.I., Davis, L.S.: Dynamic zoom-in network for fast object detection in large images. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018
Google Scholar
Girshick, R.: Fast R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), December 2015
Google Scholar
Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2014
Google Scholar
He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), October 2017
Google Scholar
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2016
Google Scholar
LaLonde, R., Zhang, D., Shah, M.: ClusterNet: detecting small objects in large scenes by exploiting spatio-temporal information. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018
Google Scholar
Law, H., Deng, J.: CornerNet: detecting objects as paired keypoints. In: The European Conference on Computer Vision (ECCV), September 2018
Google Scholar
Li, C., Yang, T., Zhu, S., Chen, C., Guan, S.: Density map guided object detection in aerial images. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, June 2020
Google Scholar
Li, Y., Huang, Q., Pei, X., Jiao, L., Shang, R.: RADet: refine feature pyramid network and multi-layer attention network for arbitrary-oriented object detection of remote sensing images. Remote Sens. 12(3) (2020). https://doi.org/10.3390/rs12030389
Lin, T.Y., Goyal, P., Girshick, R., He, K., Dollar, P.: Focal loss for dense object detection. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), October 2017
Google Scholar
Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48
Chapter Google Scholar
Liu, W., et al.: SSD: single shot MultiBox detector. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9905, pp. 21–37. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46448-0_2
Chapter Google Scholar
Lu, Y., Javidi, T., Lazebnik, S.: Adaptive object detection using adjacency and zoom prediction. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2016
Google Scholar
Ma, J., Shao, W., Ye, H., Wang, L., Wang, H., Zheng, Y., Xue, X.: Arbitrary-oriented scene text detection via rotation proposals. IEEE Trans. Multimed. 20(11), 3111–3122 (2018)
Article Google Scholar
Neubeck, A., Van Gool, L.: Efficient non-maximum suppression. In: 18th International Conference on Pattern Recognition (ICPR 2006), vol. 3, pp. 850–855 (2006)
Google Scholar
Newell, A., Yang, K., Deng, J.: Stacked Hourglass Networks for Human Pose Estimation. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9912, pp. 483–499. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46484-8_29
Chapter Google Scholar
Unel, F.O., Ozkalayci, B.O., Cigla, C.: The power of tiling for small object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, June 2019
Google Scholar
Perreault, H., Bilodeau, G., Saunier, N., Héritier, M.: SpotNet: self-attention multi-task network for object detection. In: 2020 17th Conference on Computer and Robot Vision (CRV), pp. 230–237 (2020)
Google Scholar
Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2016
Google Scholar
Redmon, J., Farhadi, A.: YOLO9000: better, faster, stronger. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017
Google Scholar
Redmon, J., Farhadi, A.: YOLOV3: an incremental improvement (2018)
Google Scholar
Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: Cortes, C., Lawrence, N.D., Lee, D.D., Sugiyama, M., Garnett, R. (eds.) Advances in Neural Information Processing Systems 28, pp. 91–99. Curran Associates, Inc. (2015). http://papers.nips.cc/paper/5638-faster-r-cnn-towards-real-time-object-detection-with-region-proposal-networks.pdf
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition (2014)
Google Scholar
Tang, Z., Liu, X., Shen, G., Yang, B.: PENet: object detection using points estimation in aerial images (2020)
Google Scholar
Uzkent, B., Ermon, S.: Learning when and where to zoom with deep reinforcement learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2020
Google Scholar
Uzkent, B., Yeh, C., Ermon, S.: Efficient object detection in large images using deep reinforcement learning. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), March 2020
Google Scholar
Wang, H., et al.: Spatial attention for multi-scale feature refinement for object detection. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, October 2019
Google Scholar
Wu, Z., Suresh, K., Narayanan, P., Xu, H., Kwon, H., Wang, Z.: Delving into robust object detection from unmanned aerial vehicles: a deep nuisance disentanglement approach. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2019
Google Scholar
Xie, S., Girshick, R., Dollar, P., Tu, Z., He, K.: Aggregated residual transformations for deep neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017
Google Scholar
Yang, F., Fan, H., Chu, P., Blasch, E., Ling, H.: Clustered object detection in aerial images. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2019
Google Scholar
Yang, X., Liu, Q., Yan, J., Li, A., Zhang, Z., Yu, G.: R3Det: refined single-stage detector with feature refinement for rotating object (2019)
Google Scholar
Cheng, Y.: Mean shift, mode seeking, and clustering. IEEE Trans. Pattern Anal. Mach. Intell. 17(8), 790–799 (1995)
Article Google Scholar
Yu, F., Wang, D., Shelhamer, E., Darrell, T.: Deep layer aggregation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018
Google Scholar
Yu, X., Gong, Y., Jiang, N., Ye, Q., Han, Z.: Scale match for tiny person detection. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), March 2020
Google Scholar
Zhang, J., Huang, J., Chen, X., Zhang, D.: How to fully exploit the abilities of aerial image detectors. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, October 2019
Google Scholar
Zhou, X., Wang, D., Krähenbühl, P.: Objects as points (2019)
Google Scholar
Zhou, X., Zhuo, J., Krahenbuhl, P.: Bottom-up object detection by grouping extreme and center points. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2019
Google Scholar
Zhu, P., Wen, L., Bian, X., Ling, H., Hu, Q.: Vision meets drones: a challenge (2018)
Google Scholar

Download references

Acknowledgements

This work was supported by National Natural Science Foundation of China grant 61573266.

Author information

Authors and Affiliations

School of Mathematics and Statistics, Xidian University, Xi’an, China
Yi Wang, Youlong Yang & Xi Zhao

Authors

Yi Wang
View author publications
You can also search for this author in PubMed Google Scholar
Youlong Yang
View author publications
You can also search for this author in PubMed Google Scholar
Xi Zhao
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Youlong Yang .

Editor information

Editors and Affiliations

University of Clermont Auvergne, Clermont Ferrand, France
Adrien Bartoli
Università degli Studi di Udine, Udine, Italy
Andrea Fusiello

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Wang, Y., Yang, Y., Zhao, X. (2020). Object Detection Using Clustering Algorithm Adaptive Searching Regions in Aerial Images. In: Bartoli, A., Fusiello, A. (eds) Computer Vision – ECCV 2020 Workshops. ECCV 2020. Lecture Notes in Computer Science(), vol 12538. Springer, Cham. https://doi.org/10.1007/978-3-030-66823-5_39

Download citation

DOI: https://doi.org/10.1007/978-3-030-66823-5_39
Published: 03 January 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-66822-8
Online ISBN: 978-3-030-66823-5
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics