DOI: 10.1145/3383972.3384032
research-article

Multi-task Learning for Bi-temporal Remote Sensing Scene Parsing via Patch-pixel Representation

Published: 26 May 2020

Abstract

Bi-temporal high-resolution (BHR) remote sensing images carry abundant information and rich ground detail. Recently, scene parsing based on bi-temporal images has played an important role in many fields. For bi-temporal image understanding, however, conventional semantic segmentation neither exploits feature relevance adequately nor addresses slow inference. In this paper, we propose a confused matrix unit (CMU) for scene parsing in BHR images. The unit represents scene transformation conditions at the patch level. Based on it, a Siamese fully convolutional network is developed for supervised scene parsing in BHR images. The network works as follows. First, a Siamese architecture extracts feature representations from the BHR images. Then, a feature aggregation protocol produces a fused feature map. Next, the CMU is computed to reflect patch-level scene content. Subsequently, guided by the CMU results, a semantic segmentation decoder performs binary pixel-level change detection. Finally, a post-processing operation assigns different dense labels to different scenes. The proposed method is evaluated on both aerial and satellite image data sets. The test results show that the proposed algorithm outperforms other existing methods, especially in accuracy and inference time.
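
The abstract does not define how the CMU is computed. As a rough illustration only, the sketch below assumes one plausible reading of the name: a confusion-matrix-style table that counts, inside each image patch, how many pixels move from class i at the first date to class j at the second date. The function names `patch_transition_matrices` and `changed_patches`, the non-overlapping patch grid, and the "any off-diagonal count" rule are all hypothetical and are not taken from the paper.

```python
import numpy as np


def patch_transition_matrices(labels_t1, labels_t2, num_classes, patch):
    """Count class transitions (t1 -> t2) inside each non-overlapping patch.

    Hypothetical helper: an assumed reading of a patch-level confusion
    matrix, not the paper's actual CMU.
    """
    assert labels_t1.shape == labels_t2.shape
    h, w = labels_t1.shape
    assert h % patch == 0 and w % patch == 0
    rows, cols = h // patch, w // patch

    # Encode each pixel's (class at t1, class at t2) pair as one integer.
    pair = labels_t1 * num_classes + labels_t2

    out = np.zeros((rows, cols, num_classes, num_classes), dtype=np.int64)
    for r in range(rows):
        for c in range(cols):
            block = pair[r * patch:(r + 1) * patch, c * patch:(c + 1) * patch]
            counts = np.bincount(block.ravel(), minlength=num_classes ** 2)
            out[r, c] = counts.reshape(num_classes, num_classes)
    return out


def changed_patches(matrices):
    """Flag a patch as changed when any off-diagonal count is non-zero."""
    total = matrices.sum(axis=(2, 3))
    unchanged = np.trace(matrices, axis1=2, axis2=3)
    return total != unchanged


# Toy 4x4 label maps with two classes; a single pixel changes class.
t1 = np.zeros((4, 4), dtype=np.int64)
t2 = t1.copy()
t2[0, 0] = 1

mats = patch_transition_matrices(t1, t2, num_classes=2, patch=2)
flags = changed_patches(mats)
print(mats[0, 0])  # transition counts for the top-left patch
print(flags)       # which patches contain any change
```

Under this assumed reading, only patches flagged as changed would need to be passed to a pixel-level decoder, which is one way a patch-level stage could reduce inference time.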


    Published In

    ICMLC '20: Proceedings of the 2020 12th International Conference on Machine Learning and Computing
    February 2020
    607 pages
    ISBN:9781450376426
    DOI:10.1145/3383972

    In-Cooperation

    • Shenzhen University

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. BHR remote sensing images
    2. Scene parsing
    3. confuse matrix
    4. pixel-level detection
    5. post-processing

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ICMLC 2020
