ABSTRACT
Existing two-stage detectors usually generate oriented proposals from heuristically defined anchors with different scales, angles, and aspect ratios. This scheme usually suffers from heavy memory consumption and redundant computation. In addition, a conventional Region Proposal Network (RPN) introduces misalignment between rotated proposals and horizontally aligned convolutional features, which leads to inconsistency between classification confidence and localization accuracy. To tackle these problems, we propose a Rotated Cascade Region Proposal Network (RCRPN), which reduces memory usage and improves proposal quality through multi-stage refinement. Specifically, instead of using multiple anchors with predefined scales and aspect ratios, the first stage of RCRPN adopts a single anchor per location and generates coarse proposals with horizontal convolution. This stage uses the gliding vertex method to adapt to rotated bounding boxes. In the second stage, taking the coarse proposals and the image feature map as input, adaptive align-convolution is applied to learn the sampled rotated features guided by the coarse proposals, finally generating high-quality proposals for downstream tasks. Extensive experiments demonstrate that our method outperforms the baseline algorithm, Oriented R-CNN, on two commonly used oriented object detection datasets, DOTA and HRSC2016.
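To make the gliding vertex representation mentioned above more concrete, the following is a minimal sketch of how a horizontal box plus four gliding ratios can be decoded into the vertices of an oriented quadrilateral. It follows the general idea of the gliding vertex method (each vertex slides along one side of the horizontal box) and is not the RCRPN implementation; the function name `glide_to_vertices`, the vertex ordering, and the ratio convention are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def glide_to_vertices(
    xmin: float,
    ymin: float,
    xmax: float,
    ymax: float,
    alphas: Sequence[float],
) -> List[Point]:
    """Decode a horizontal box and four gliding ratios into four vertices.

    Assumed convention (illustrative only): each ratio lies in [0, 1] and
    moves one vertex along one side of the horizontal box, in the order
    top, right, bottom, left (image coordinates, y pointing down).
    """
    if len(alphas) != 4:
        raise ValueError("expected exactly four gliding ratios")
    if not all(0.0 <= a <= 1.0 for a in alphas):
        raise ValueError("gliding ratios must lie in [0, 1]")

    a1, a2, a3, a4 = alphas
    w = xmax - xmin
    h = ymax - ymin

    return [
        (xmin + a1 * w, ymin),  # vertex on the top side, offset from the left
        (xmax, ymin + a2 * h),  # vertex on the right side, offset from the top
        (xmax - a3 * w, ymax),  # vertex on the bottom side, offset from the right
        (xmin, ymax - a4 * h),  # vertex on the left side, offset from the bottom
    ]


if __name__ == "__main__":
    # All ratios 0 reproduce the corners of the horizontal box itself.
    print(glide_to_vertices(0, 0, 4, 2, (0.0, 0.0, 0.0, 0.0)))
    # All ratios 0.5 place each vertex at the midpoint of its side.
    print(glide_to_vertices(0, 0, 4, 2, (0.5, 0.5, 0.5, 0.5)))
```

Under this assumed convention, a network only needs to regress a horizontal box and four bounded ratios per location, which is one reason a single horizontal anchor can still yield rotated proposals.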
Index Terms
- Learning High-Quality Bounding Box for Rotated Object Detection via Rotated Cascade Region Proposal Network