DOI: 10.1145/3447548.3467301
research-article
Public Access

Weakly Supervised Spatial Deep Learning based on Imperfect Vector Labels with Registration Errors

Published: 14 August 2021

Abstract

This paper studies weakly supervised learning on spatial raster data based on imperfect vector training labels. Given raster feature imagery and imperfect (weak) vector labels with location registration errors, our goal is to learn a deep learning model for pixel classification and refine the vector labels simultaneously. The problem is important in many geoscience applications, such as streamline delineation and road mapping from earth imagery, where annotating imperfect coarse vector labels is far more efficient than drawing precise labels. However, the problem is challenging because the vector labels are misaligned with the raster feature pixels and the true vector label locations must be inferred while the neural network parameters are being learned. Existing works on weakly supervised learning often focus on noise and errors in label semantics, assuming label locations to be either correct or irrelevant (e.g., independent and identically distributed). A few works address label registration errors, but these methods often focus on label misalignment on object segment boundaries at the pixel level without guaranteeing vector continuity. To fill the gap, this paper proposes a spatial learning framework based on Expectation-Maximization that iteratively updates deep neural network parameters while inferring true vector label locations. Specifically, inference of true vector locations is based on both the current pixel class predictions and the geometric properties of vectors. Evaluations on real-world high-resolution remote sensing datasets for National Hydrography Dataset (NHD) refinement show that the proposed framework outperforms baseline methods in classification accuracy and refined vector quality.
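
The alternation described in the abstract can be illustrated with a toy sketch. This is not the paper's algorithm or code: the deep network is replaced by a one-feature logistic regression, the vector label is reduced to one column index per raster row, and the geometric term is a simple continuity penalty between neighboring rows. All function names, parameters, and the synthetic data are hypothetical and chosen only for illustration.

```python
import math
import random


def make_toy_raster(rows=20, cols=15, true_col=7, seed=0):
    """Synthetic single-band raster: a bright vertical line at true_col plus noise."""
    rng = random.Random(seed)
    return [[(1.0 if c == true_col else 0.0) + rng.gauss(0.0, 0.1)
             for c in range(cols)] for _ in range(rows)]


def fit_pixel_classifier(raster, label_cols, epochs=500, lr=1.0):
    """Parameter update (stand-in for network training): logistic regression
    on pixel intensity, trained against the current label location per row."""
    w, b = 0.0, 0.0
    n = len(raster) * len(raster[0])
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for r, row in enumerate(raster):
            for c, x in enumerate(row):
                y = 1.0 if c == label_cols[r] else 0.0
                p = 1.0 / (1.0 + math.exp(-(w * x + b)))
                grad_w += (p - y) * x
                grad_b += p - y
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def infer_label_cols(raster, weak_cols, w, b, window=2, smooth=0.5):
    """Location inference: for each row, search near the weak label and pick the
    column with the best log-probability minus a continuity penalty relative to
    the previous row's refined location (an assumed, simplified geometric term)."""
    cols = len(raster[0])
    refined = []
    for r, row in enumerate(raster):
        best_c, best_score = None, -math.inf
        lo = max(0, weak_cols[r] - window)
        hi = min(cols - 1, weak_cols[r] + window)
        for c in range(lo, hi + 1):
            log_p = -math.log1p(math.exp(-(w * row[c] + b)))  # log sigmoid
            penalty = smooth * abs(c - refined[-1]) if refined else 0.0
            score = log_p - penalty
            if score > best_score:
                best_c, best_score = c, score
        refined.append(best_c)
    return refined


def em_refine(raster, weak_cols, iterations=3):
    """Alternate classifier fitting and label-location inference."""
    label_cols = list(weak_cols)
    for _ in range(iterations):
        w, b = fit_pixel_classifier(raster, label_cols)
        label_cols = infer_label_cols(raster, weak_cols, w, b)
    return label_cols, (w, b)


if __name__ == "__main__":
    raster = make_toy_raster()
    # Hypothetical weak labels: the true column 7 shifted by up to two pixels.
    offsets = [0, 1, 0, -1, 0, 2, 0, -2, 0, 1] * 2
    weak_cols = [7 + d for d in offsets]
    refined, params = em_refine(raster, weak_cols)
    print("weak:   ", weak_cols)
    print("refined:", refined)
```

In this toy setting, half of the weak labels already sit on the true line, which is what lets the first classifier fit learn that bright pixels are more likely to be positive. The search window bounds how far a label may move from its weak position, and the continuity penalty discourages jumps between adjacent rows, a crude proxy for the vector continuity the abstract says pixel-level methods do not guarantee.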

Supplementary Material

MP4 File (weakly_supervised_spatial_deep_learning-zhe_jiang-wenchong_he-38957875-i5gB.mp4)
Presentation video for Weakly Supervised Spatial Deep Learning based on Imperfect Vector Labels with Registration Errors in KDD 2021.

Cited By

  • (2024) PolygonGNN: Representation Learning for Polygonal Geometries with Heterogeneous Visibility Graph. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 4012-4022. DOI: 10.1145/3637528.3671738. Online publication date: 25-Aug-2024.
  • (2022) Quantifying and Reducing Registration Uncertainty of Spatial Vector Labels on Earth Imagery. Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 554-564. DOI: 10.1145/3534678.3539410. Online publication date: 14-Aug-2022.

Published In

KDD '21: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining
August 2021
4259 pages
ISBN: 9781450383325
DOI: 10.1145/3447548
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. imperfect vector labels
  2. limited training labels
  3. registration errors
  4. remote sensing
  5. weakly supervised spatial deep learning

Qualifiers

  • Research-article

Conference

KDD '21

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
