Abstract
In this paper, we investigate how to improve object detection on very high resolution orthomosaics. To this end, we present a new detection model, ResnetYolo, with a Resnet50 backbone and selectable detection heads. Furthermore, we propose two novel techniques to post-process the object detection results: a neighbour-based patch NMS algorithm and an IoA-based filtering technique. Finally, we fuse color and depth data to further improve the results of our deep learning model. We test these improvements on two distinct, challenging use cases: solar panel detection and swimming pool detection. The images are very high resolution color and elevation orthomosaics, derived from photographs taken by plane. Our final models reach an average precision (AP) of 78.5% and 44.4% on these two use cases, respectively, outperforming the baseline models by over 15% AP.
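The abstract names an IoA-based filtering step without defining it. As a rough illustration only, the sketch below assumes that IoA means the intersection area divided by the area of the candidate box itself, and that a box is discarded when a higher-scoring box covers most of it. The function names, the box layout `(x1, y1, x2, y2, score)` and the threshold of 0.8 are hypothetical choices made for this example, not details taken from the paper.

```python
from typing import List, Tuple

# Hypothetical box layout for this sketch: (x1, y1, x2, y2, score)
Box = Tuple[float, float, float, float, float]


def ioa(inner: Box, outer: Box) -> float:
    """Intersection area divided by the area of `inner` (assumed definition)."""
    ix1, iy1 = max(inner[0], outer[0]), max(inner[1], outer[1])
    ix2, iy2 = min(inner[2], outer[2]), min(inner[3], outer[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = (inner[2] - inner[0]) * (inner[3] - inner[1])
    return inter / area if area > 0 else 0.0


def ioa_filter(boxes: List[Box], threshold: float = 0.8) -> List[Box]:
    """Drop a box when a higher-scoring kept box covers more than `threshold` of it."""
    ordered = sorted(boxes, key=lambda b: b[4], reverse=True)
    kept: List[Box] = []
    for box in ordered:
        # Keep the box only if no already-kept box covers it beyond the threshold.
        if all(ioa(box, k) <= threshold for k in kept):
            kept.append(box)
    return kept


detections = [
    (0.0, 0.0, 10.0, 10.0, 0.9),    # large, confident detection
    (2.0, 2.0, 4.0, 4.0, 0.5),      # small box fully inside the large one
    (20.0, 20.0, 30.0, 30.0, 0.7),  # separate object
]
filtered = ioa_filter(detections)
```

In this toy input the small nested box has an IoU of only 0.04 with the large box, so a typical IoU-based NMS threshold would leave it in place, while the assumed IoA criterion removes it. That difference is one plausible motivation for such a filter.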
Acknowledgements
This project was funded by VLAIO. We would like to thank Vansteelandt BV for preparing and providing the data used in this project.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Ophoff, T., Van Beeck, K., Goedemé, T. (2023). Improving Object Detection in VHR Aerial Orthomosaics. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds) Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol 13807. Springer, Cham. https://doi.org/10.1007/978-3-031-25082-8_18
DOI: https://doi.org/10.1007/978-3-031-25082-8_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25081-1
Online ISBN: 978-3-031-25082-8
eBook Packages: Computer Science, Computer Science (R0)