DOI: 10.1145/3651671.3651695
research-article

A Study on Semantic Segmentation for Small Objects in High-resolution Aerial Images based on Mask R-CNN and HRNet

Published: 07 June 2024

Abstract

Because aerial images have high resolution, small objects in them occupy very few pixels. In addition, some types of small objects have very similar surface features, which makes them difficult to distinguish. These factors make semantic segmentation of small objects in aerial images a challenging task on which existing methods perform poorly. In this paper, we replace the ResNet backbone in Mask R-CNN with HRNet, which preserves resolution better. The proposed method achieves an mIoU of 68.06 on the iSAID dataset, an improvement of 13.02% and 10.94% over using ResNet-50 and ResNet-101 as the backbone network, respectively. Moreover, compared with other advanced semantic segmentation algorithms of comparable parameter count and computational complexity, our method achieves higher accuracy. We expect this method to help improve the accuracy of semantic segmentation of small objects in aerial images and to assist in better identification and localization of these objects.
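For readers unfamiliar with the metric reported above, the following is a minimal sketch of how mean Intersection over Union (mIoU) is commonly computed from a confusion matrix. It is not the authors' evaluation code: the function names `confusion_matrix` and `mean_iou`, the toy label maps, and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration. The sketch returns a fraction in [0, 1], whereas the abstract reports mIoU on a percentage scale.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels; rows index the ground-truth class, columns the prediction."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_iou(cm: List[List[int]]) -> float:
    """Average per-class IoU = TP / (TP + FP + FN).

    Assumption: classes that appear in neither the ground truth nor the
    prediction are skipped rather than counted as IoU 0.
    """
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                         # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp    # predicted c, true class differs
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical flattened label maps: 0 = background, 1 and 2 = small-object classes.
gt = [0, 0, 0, 0, 1, 1, 2, 0]
pred = [0, 0, 0, 1, 1, 0, 2, 0]

cm = confusion_matrix(gt, pred, num_classes=3)
score = mean_iou(cm)
print(f"mIoU = {score:.4f}")
```

Because each class contributes equally to the average regardless of how many pixels it covers, a class occupying only a handful of pixels weighs as much as the background, which is one reason small objects have a large effect on this metric.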

Published In

ICMLC '24: Proceedings of the 2024 16th International Conference on Machine Learning and Computing
February 2024
757 pages
ISBN: 9798400709234
DOI: 10.1145/3651671

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. HRNet
2. aerial images
3. semantic segmentation
4. small objects

Qualifiers

• Research-article
• Research
• Refereed limited

Conference

ICMLC 2024
