Abstract
Compared with traditional natural images, remote sensing images (RSIs) typically have high resolution. The objects in them are densely distributed, with heterogeneous orientations and large scale variation, even among objects of the same class. In recent years, object detection algorithms have made great strides on general images, but they still struggle with the challenges posed by RSIs. We therefore propose a foreground feature embedding network (FFE-Net) for object detection in RSIs. To better capture object features in RSIs, we design a foreground feature embedding module (FFEM) that learns the foreground features of objects. It does so by introducing an additional semantic segmentation branch and embedding that branch's features into the classification and regression branches. In addition, we propose a modified Gaussian function with focal loss (MGFFL) to eliminate the extra background noise introduced by soft labels, making the learned foreground features more robust. Experimental results on two publicly available remote sensing image datasets, DOTA-v1.0 and HRSC2016, validate the effectiveness of FFE-Net.
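The abstract does not give the exact form of MGFFL, so the following is only a minimal sketch of one plausible reading, not the authors' formulation. It assumes that the soft label for the segmentation branch is a rotated 2D Gaussian centred on each oriented box, that the "modification" truncates the Gaussian tail outside the box so that background pixels receive a label of exactly zero, and that the loss is a focal-style weighted cross-entropy for soft targets (similar in form to the quality focal loss). The function names, the sigma ratio, and the `gamma` default are all hypothetical.

```python
import numpy as np


def gaussian_soft_label(h, w, cx, cy, bw, bh, theta, truncate=True):
    """Soft foreground label for one oriented box (hypothetical helper).

    (cx, cy) is the box centre, (bw, bh) its width and height,
    theta its rotation in radians. Returns an (h, w) array in [0, 1].
    """
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    dx, dy = xs - cx, ys - cy
    # Rotate pixel offsets into the box's own coordinate frame.
    u = np.cos(theta) * dx + np.sin(theta) * dy
    v = -np.sin(theta) * dx + np.cos(theta) * dy
    # Assumption: sigma is a fixed fraction of each side length.
    su, sv = bw / 4.0, bh / 4.0
    label = np.exp(-0.5 * ((u / su) ** 2 + (v / sv) ** 2))
    if truncate:
        # Assumed "modification": drop the Gaussian tail that leaks
        # outside the box, so background pixels get a label of 0.
        inside = (np.abs(u) <= bw / 2.0) & (np.abs(v) <= bh / 2.0)
        label = np.where(inside, label, 0.0)
    return label


def soft_focal_loss(prob, target, gamma=2.0, eps=1e-7):
    """Focal-style loss for soft targets in [0, 1] (assumed form).

    Cross-entropy is weighted by |target - prob|**gamma, so pixels
    that are already well predicted contribute little.
    """
    prob = np.clip(prob, eps, 1.0 - eps)
    bce = -(target * np.log(prob) + (1.0 - target) * np.log(1.0 - prob))
    weight = np.abs(target - prob) ** gamma
    return float(np.mean(weight * bce))


# Usage: a 12x6 axis-aligned box centred in a 32x32 map.
label = gaussian_soft_label(32, 32, cx=16, cy=16, bw=12, bh=6, theta=0.0)
plain = gaussian_soft_label(32, 32, 16, 16, 12, 6, 0.0, truncate=False)
```

Under these assumptions, `plain` still assigns small positive labels to pixels just outside the box, while `label` does not. That difference is the kind of background noise from soft labels that the abstract says MGFFL is meant to remove.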
Acknowledgements
This work is partially supported by National Science Foundation of China (61972187), Natural Science Foundation of Fujian Province (2020J01828, 2020J02024, 2022J011112, 2020J01826), Fuzhou Science and Technology Major Project (2022FZZD0112), Fuzhou Technology Planning Program (2021-ZD-284), and the Open Program of The Key Laboratory of Cognitive Computing and Intelligent Information Processing of Fujian Education Institutions, Wuyi University (KLCCIIP2020202).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Wu, J., Cai, Y., Wang, T., Luo, Z., Shan, S., Li, Z. (2024). A Foreground Feature Embedding Network for Object Detection in Remote Sensing Images. In: Sun, Y., Lu, T., Wang, T., Fan, H., Liu, D., Du, B. (eds) Computer Supported Cooperative Work and Social Computing. ChineseCSCW 2023. Communications in Computer and Information Science, vol 2012. Springer, Singapore. https://doi.org/10.1007/978-981-99-9637-7_34
DOI: https://doi.org/10.1007/978-981-99-9637-7_34
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-9636-0
Online ISBN: 978-981-99-9637-7
eBook Packages: Computer Science, Computer Science (R0)