RTMDet-R2: An Improved Real-Time Rotated Object Detector

Xiang, Haifeng; Jing, Naifeng; Jiang, Jianfei; Guo, Hongbo; Sheng, Weiguang; Mao, Zhigang; Wang, Qin

doi:10.1007/978-981-99-8555-5_28

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14436)

Included in the following conference series:

Chinese Conference on Pattern Recognition and Computer Vision (PRCV)

Abstract

Object detection in remote sensing images is challenging due to the absence of visible features and variations in object orientation. Efficient detection of objects in such images can be achieved using rotated object detectors that utilize oriented bounding boxes. However, existing rotated object detectors often struggle to maintain high accuracy while processing high-resolution remote sensing images in real time. In this paper, we present RTMDet-R2, an improved real-time rotated object detector. RTMDet-R2 incorporates an enhanced path PAFPN to effectively fuse multi-level features and employs a task-interaction decoupled head to alleviate the imbalance between the regression and classification tasks. To further enhance performance, we propose a ProbIoU-aware dynamic label assignment strategy, which enables efficient and accurate label assignment during training. As a result, RTMDet-R2-m and RTMDet-R2-l achieve 79.10% and 79.46% mAP, respectively, on the DOTA 1.0 dataset using single-scale training and testing, outperforming the majority of other rotated object detectors. Moreover, RTMDet-R2-s and RTMDet-R2-t achieve 78.43% and 77.27% mAP, respectively, at inference frame rates of 175 and 181 FPS at a resolution of 1024 × 1024 on an RTX 3090 GPU with TensorRT and FP16 precision. Furthermore, RTMDet-R2-t achieves 90.63%/97.44% mAP on the HRSC2016 dataset. The code and models are available at https://github.com/Zeba-Xie/RTMDet-R2.
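The abstract names ProbIoU as the similarity measure behind the label assignment strategy. As a rough illustration only, the sketch below computes ProbIoU between two rotated boxes as it is commonly defined: each box is mapped to a 2-D Gaussian, and the score is one minus the Hellinger distance derived from the Bhattacharyya distance between the two Gaussians. The function names `obb_to_gaussian` and `prob_iou`, the `(cx, cy, w, h, theta)` box layout, and the use of radians are assumptions made for this example; the code is not taken from the RTMDet-R2 repository and does not reproduce the paper's assignment strategy.

```python
import math


def obb_to_gaussian(cx, cy, w, h, theta):
    """Map a rotated box (center, size, angle in radians) to a 2-D Gaussian.

    Returns the mean (cx, cy) and the entries (a, b, c) of the covariance
    [[a, c], [c, b]]. The variances w^2/12 and h^2/12 along the box axes are
    those of a uniform distribution over the box. Assumes w > 0 and h > 0.
    """
    var_w, var_h = w * w / 12.0, h * h / 12.0
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    a = var_w * cos_t ** 2 + var_h * sin_t ** 2
    b = var_w * sin_t ** 2 + var_h * cos_t ** 2
    c = (var_w - var_h) * cos_t * sin_t
    return (cx, cy), (a, b, c)


def prob_iou(box1, box2):
    """ProbIoU in [0, 1] between two boxes given as (cx, cy, w, h, theta)."""
    (x1, y1), (a1, b1, c1) = obb_to_gaussian(*box1)
    (x2, y2), (a2, b2, c2) = obb_to_gaussian(*box2)

    # Average covariance of the two Gaussians and the determinants involved.
    a, b, c = (a1 + a2) / 2.0, (b1 + b2) / 2.0, (c1 + c2) / 2.0
    det = a * b - c * c
    det1 = a1 * b1 - c1 * c1
    det2 = a2 * b2 - c2 * c2

    # Mahalanobis term d^T S^{-1} d with S^{-1} = [[b, -c], [-c, a]] / det.
    dx, dy = x1 - x2, y1 - y2
    maha = (b * dx * dx - 2.0 * c * dx * dy + a * dy * dy) / det

    # Bhattacharyya distance, clamped at 0 to absorb rounding error.
    bd = maha / 8.0 + 0.5 * math.log(det / math.sqrt(det1 * det2))
    bd = max(bd, 0.0)

    # Bhattacharyya coefficient -> Hellinger distance -> ProbIoU.
    bc = math.exp(-bd)
    hellinger = math.sqrt(max(1.0 - bc, 0.0))
    return 1.0 - hellinger


if __name__ == "__main__":
    ship = (0.0, 0.0, 4.0, 2.0, 0.3)
    print(prob_iou(ship, ship))                        # identical boxes
    print(prob_iou(ship, (1.0, 0.0, 4.0, 2.0, 0.3)))   # shifted by one unit
    print(prob_iou(ship, (100.0, 0.0, 4.0, 2.0, 0.3))) # far apart
```

Identical boxes score 1 up to floating-point precision, and the score decays smoothly toward 0 as the boxes move apart, which is what makes a measure of this kind usable as a differentiable matching cost. Under this Gaussian mapping a square box has the same covariance at every angle, so the score carries no orientation information for squares.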

Notes

1. We follow the latest metrics from the DOTA evaluation server; the original VOC-format mAP is now reported as mAP50.

Author information

Authors and Affiliations

Shanghai Jiao Tong University, Shanghai, China
Haifeng Xiang, Naifeng Jing, Jianfei Jiang, Hongbo Guo, Weiguang Sheng, Zhigang Mao & Qin Wang

Corresponding author

Correspondence to Qin Wang.

Editor information

Editors and Affiliations

Nanjing University of Information Science and Technology, Nanjing, China
Qingshan Liu
Xiamen University, Xiamen, China
Hanzi Wang
Beijing University of Posts and Telecommunications, Beijing, China
Zhanyu Ma
Sun Yat-sen University, Guangzhou, China
Weishi Zheng
Peking University, Beijing, China
Hongbin Zha
Chinese Academy of Sciences, Beijing, China
Xilin Chen
Chinese Academy of Sciences, Beijing, China
Liang Wang
Xiamen University, Xiamen, China
Rongrong Ji

About this paper

Cite this paper

Xiang, H. et al. (2024). RTMDet-R2: An Improved Real-Time Rotated Object Detector. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14436. Springer, Singapore. https://doi.org/10.1007/978-981-99-8555-5_28

DOI: https://doi.org/10.1007/978-981-99-8555-5_28
Published: 28 December 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8554-8
Online ISBN: 978-981-99-8555-5
eBook Packages: Computer Science, Computer Science (R0)
