A Novel Cross Frequency-Domain Interaction Learning for Aerial Oriented Object Detection

  • Conference paper
Pattern Recognition and Computer Vision (PRCV 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14428)

Abstract

Aerial oriented object detection is a vital task in computer vision and has received significant attention for its role in remote sensing image understanding. However, most Convolutional Neural Network (CNN) methods overlook the frequency domain because they focus only on spatial and channel interactions. To address this limitation, we propose a novel approach called Cross Frequency-domain Interaction Learning (CFIL) for aerial oriented object detection. Our method consists of two modules: the Extraction of Frequency-domain Features (EFF) module and the Interaction of Frequency-domain Features (IFF) module. The EFF module extracts frequency-domain information from the feature maps, enriching the feature information across different frequencies. The IFF module enables efficient cross-channel interaction and fusion of the frequency-domain feature maps obtained from the EFF module, which ensures that the frequency-domain information is used effectively. Finally, the resulting frequency-domain weights are combined with the time-domain feature maps. Extensive experiments on the DOTA V1.0, DOTA V1.5, and HRSC2016 datasets demonstrate the competitive performance of the proposed CFIL in aerial oriented object detection. Our code and models will be publicly released.
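
The abstract gives no implementation details, so the following is only a rough sketch of the general idea (extract per-channel frequency descriptors, let them interact across channels, then reweight the original feature maps). It is not the authors' method. Everything specific here is an assumption made for illustration: the use of 2D DCT bases as the frequency transform, the fixed three-tap kernel that mixes neighbouring channels, the sigmoid gating, and the hypothetical function names `extract_frequency_features`, `interact_across_channels`, and `apply_weights`.

```python
import math

def dct_basis(h, w, u, v):
    """2D DCT-II basis of frequency (u, v) on an h x w grid (assumed transform)."""
    return [[math.cos(math.pi * u * (i + 0.5) / h) *
             math.cos(math.pi * v * (j + 0.5) / w)
             for j in range(w)] for i in range(h)]

def extract_frequency_features(fmap, freqs):
    """EFF-like step (assumption): project each channel onto a few DCT bases.

    fmap is a nested list of shape C x H x W; the result has shape C x len(freqs).
    """
    h, w = len(fmap[0]), len(fmap[0][0])
    bases = [dct_basis(h, w, u, v) for u, v in freqs]
    return [[sum(ch[i][j] * b[i][j] for i in range(h) for j in range(w)) / (h * w)
             for b in bases]
            for ch in fmap]

def interact_across_channels(feats, kernel=(0.25, 0.5, 0.25)):
    """IFF-like step (assumption): mix each channel's mean descriptor with its
    neighbouring channels using a fixed 1D kernel, then squash with a sigmoid."""
    c, pad = len(feats), len(kernel) // 2
    means = [sum(f) / len(f) for f in feats]
    weights = []
    for ch in range(c):
        mixed = sum(k * means[ch + t - pad]
                    for t, k in enumerate(kernel)
                    if 0 <= ch + t - pad < c)
        weights.append(1.0 / (1.0 + math.exp(-mixed)))
    return weights

def apply_weights(fmap, weights):
    """Scale every value of each channel by that channel's frequency-derived weight."""
    return [[[x * wt for x in row] for row in ch] for ch, wt in zip(fmap, weights)]

# Toy input: 3 channels of size 4 x 4 (values chosen arbitrarily).
fmap = [[[float(c + i + j) for j in range(4)] for i in range(4)] for c in range(3)]
feats = extract_frequency_features(fmap, freqs=[(0, 0), (0, 1), (1, 0)])
weights = interact_across_channels(feats)
out = apply_weights(fmap, weights)
```

With the (0, 0) basis the descriptor reduces to the channel mean, so this sketch degenerates to ordinary global-average-pooling channel attention when only that frequency is used; adding higher frequencies is what brings in extra frequency-domain information. In a trained network the mixing kernel would be learned rather than fixed.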

This work was supported by the Natural Science Foundation of Fujian Province of China under Grant 2022J011271, the Foundation of Educational and Scientific Research Projects for Young and Middle-aged Teachers of Fujian Province under Grant JAT200471, as well as the High-level Talent Project of Xiamen University of Technology under Grant YKJ20013R.

Author information

Corresponding author

Correspondence to Weiming Lin.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Weng, W., Lin, W., Lin, F., Ren, J., Shen, F. (2024). A Novel Cross Frequency-Domain Interaction Learning for Aerial Oriented Object Detection. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14428. Springer, Singapore. https://doi.org/10.1007/978-981-99-8462-6_24

  • DOI: https://doi.org/10.1007/978-981-99-8462-6_24

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8461-9

  • Online ISBN: 978-981-99-8462-6

  • eBook Packages: Computer Science, Computer Science (R0)
