Abstract
Aerial oriented object detection is a vital task in computer vision and has received significant attention for its role in remote sensing image understanding. However, most Convolutional Neural Network (CNN) methods overlook the frequency domain because they focus only on spatial and channel interaction. To address this limitation, we propose a novel approach called Cross Frequency-domain Interaction Learning (CFIL) for aerial oriented object detection. Our method consists of two modules: the Extraction of Frequency-domain Features (EFF) module and the Interaction of Frequency-domain Features (IFF) module. The EFF module extracts frequency-domain information from the feature maps, enriching the feature information across different frequencies. The IFF module facilitates efficient interaction and fusion, across channels, of the frequency-domain feature maps obtained from the EFF module. Finally, the resulting frequency-domain weights are combined with the time-domain feature maps. By enabling full and efficient cross-channel interaction and fusion of the EFF feature weights, the IFF module ensures that the frequency-domain information is used effectively. Extensive experiments on the DOTA V1.0, DOTA V1.5, and HRSC2016 datasets demonstrate the competitive performance of the proposed CFIL in aerial oriented object detection. Our code and models will be publicly released.
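The abstract does not give the internals of EFF or IFF, so the following is only a rough sketch of the general idea (extract per-channel frequency descriptors, let them interact across channels, then reweight the original feature maps). Every design choice here is an assumption made for illustration and is not taken from the paper: the use of 2D DCT bases, the 1D convolution across channels, the sigmoid gating, and all function names (`dct_basis`, `extract_frequency_descriptors`, `interact_across_channels`, `apply_weights`) are hypothetical. What the abstract calls the time-domain feature maps corresponds to the plain input `x` below.

```python
import math


def dct_basis(h, w, u, v):
    """2D DCT-II basis function of frequency (u, v) on an h x w grid."""
    return [
        [
            math.cos(math.pi * u * (i + 0.5) / h)
            * math.cos(math.pi * v * (j + 0.5) / w)
            for j in range(w)
        ]
        for i in range(h)
    ]


def extract_frequency_descriptors(x, freqs):
    """EFF-like step (assumed): project every channel onto a few DCT bases.

    x is a list of C channels, each an h x w nested list.
    Returns a C x len(freqs) list of frequency descriptors.
    """
    h, w = len(x[0]), len(x[0][0])
    bases = [dct_basis(h, w, u, v) for u, v in freqs]
    return [
        [
            sum(ch[i][j] * b[i][j] for i in range(h) for j in range(w)) / (h * w)
            for b in bases
        ]
        for ch in x
    ]


def interact_across_channels(desc, kernel):
    """IFF-like step (assumed): mix descriptors of neighbouring channels.

    Each channel's descriptors are summed, a zero-padded 1D convolution is
    run along the channel axis, and a sigmoid maps the result to (0, 1).
    """
    c = len(desc)
    pooled = [sum(d) for d in desc]
    pad = len(kernel) // 2
    weights = []
    for n in range(c):
        acc = 0.0
        for t, k in enumerate(kernel):
            m = n + t - pad
            if 0 <= m < c:
                acc += k * pooled[m]
        weights.append(1.0 / (1.0 + math.exp(-acc)))
    return weights


def apply_weights(x, weights):
    """Scale each channel of the original feature map by its weight."""
    return [[[val * wgt for val in row] for row in ch] for ch, wgt in zip(x, weights)]
```

Calling `extract_frequency_descriptors(x, [(0, 0), (0, 1)])`, then `interact_across_channels(desc, [0.25, 0.5, 0.25])`, then `apply_weights(x, weights)` returns a feature map of the same shape as `x`. In a trained detector the convolution kernel would be learned, and the choice of frequencies would be a design decision; both are fixed by hand here purely to keep the sketch short.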
This work was supported by the Natural Science Foundation of Fujian Province of China under Grant 2022J011271, the Foundation of Educational and Scientific Research Projects for Young and Middle-aged Teachers of Fujian Province under Grant JAT200471, as well as the High-level Talent Project of Xiamen University of Technology under Grant YKJ20013R.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Weng, W., Lin, W., Lin, F., Ren, J., Shen, F. (2024). A Novel Cross Frequency-Domain Interaction Learning for Aerial Oriented Object Detection. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14428. Springer, Singapore. https://doi.org/10.1007/978-981-99-8462-6_24
DOI: https://doi.org/10.1007/978-981-99-8462-6_24
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8461-9
Online ISBN: 978-981-99-8462-6
eBook Packages: Computer Science, Computer Science (R0)