Abstract
Dominant instance segmentation methods first detect each object with an axis-aligned box and then predict a foreground mask within each proposal. In aerial images, however, axis-aligned boxes are a poor fit because objects appear at arbitrary orientations. Moreover, the RoI pooling step in these systems warps and resizes features, which loses spatial detail and degrades segmentation quality, especially for large, elongated objects. In this paper, we propose an accurate oriented instance segmentation method named Rotated Blend Mask R-CNN. It predicts masks inside oriented bounding boxes and produces the final mask by combining instance-level information with lower-level, fine-grained information. We evaluate the proposed method on the iSAID dataset, where the results show that our model achieves state-of-the-art performance. Code will be made available at https://github.com/ZZR8066/RotatedBlendMaskRCNN
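To illustrate why axis-aligned boxes fit elongated, rotated objects poorly, the following minimal sketch compares the area of an oriented box with the area of the axis-aligned box that encloses it. It is not taken from the paper's code: the function names and the (cx, cy, w, h, angle) box parameterisation are assumptions made for this example.

```python
import math


def oriented_box_corners(cx, cy, w, h, angle_deg):
    """Corners of a w x h box centred at (cx, cy), rotated by angle_deg.

    The (cx, cy, w, h, angle) parameterisation is an assumption for this
    sketch, not necessarily the one used by Rotated Blend Mask R-CNN.
    """
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    half = ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2))
    # Rotate each corner offset, then translate it to the box centre.
    return [(cx + dx * cos_a - dy * sin_a, cy + dx * sin_a + dy * cos_a)
            for dx, dy in half]


def enclosing_axis_aligned_box(corners):
    """Smallest axis-aligned box (x0, y0, x1, y1) containing all corners."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return min(xs), min(ys), max(xs), max(ys)


def fill_ratio(w, h, angle_deg):
    """Fraction of the enclosing axis-aligned box covered by the oriented box."""
    corners = oriented_box_corners(0.0, 0.0, w, h, angle_deg)
    x0, y0, x1, y1 = enclosing_axis_aligned_box(corners)
    return (w * h) / ((x1 - x0) * (y1 - y0))


if __name__ == "__main__":
    # A hypothetical 100 x 10 object (e.g. a ship) at several orientations.
    for angle in (0, 15, 45):
        print(angle, round(fill_ratio(100, 10, angle), 3))
```

For an unrotated box the ratio is 1, while for the hypothetical 100 x 10 object at 45 degrees only about a sixth of the enclosing axis-aligned box belongs to the object, so most of the pooled region would be background.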
Acknowledgement
This work was supported by the Youtu Lab of Tencent.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, Z., Du, J. (2021). Accurate Oriented Instance Segmentation in Aerial Images. In: Peng, Y., Hu, SM., Gabbouj, M., Zhou, K., Elad, M., Xu, K. (eds) Image and Graphics. ICIG 2021. Lecture Notes in Computer Science, vol 12888. Springer, Cham. https://doi.org/10.1007/978-3-030-87355-4_14
DOI: https://doi.org/10.1007/978-3-030-87355-4_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87354-7
Online ISBN: 978-3-030-87355-4
eBook Packages: Computer Science, Computer Science (R0)