A fast instance selection method for support vector machines in building extraction

https://doi.org/10.1016/j.asoc.2020.106716
Under a Creative Commons license
open access

Highlights

  • The linear time complexity of DR.LSH makes it suitable for handling big datasets.

  • DR.LSH is competitive with other state-of-the-art methods in building extraction.

  • DR.LSH can significantly reduce the number of instances and execution time.

Abstract

Training support vector machines (SVMs) for pixel-based feature extraction from aerial images requires selecting representative pixels (instances) as a training dataset. In this research, locality-sensitive hashing (LSH) is adopted to develop a new instance selection method, referred to as DR.LSH. DR.LSH rests on rapidly finding similar and redundant training samples and excluding them from the original dataset. The simplicity of the method, together with its linear computational complexity, makes it fast at coping with massive training data (millions of pixels). DR.LSH is benchmarked against two recently proposed methods on a building extraction dataset with 23,750,000 samples obtained from the fusion of aerial images and point clouds. The results reveal that DR.LSH outperforms both in terms of preservation rate and of maintaining generalization ability (classification loss). The source code of DR.LSH is available at https://github.com/mohaslani/DR.LSH.
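The idea of hashing similar samples together and discarding the redundant ones can be illustrated with a minimal sketch. This is not the authors' DR.LSH implementation (that is in the repository linked in the abstract). It assumes a generic quantized random-projection hash family, and the function name `lsh_reduce` and the parameters `n_hashes`, `width`, and `seed` are hypothetical choices made for illustration only. Each instance is hashed once, so the pass is linear in the number of instances.

```python
import math
import random


def lsh_reduce(X, y, n_hashes=4, width=1.0, seed=0):
    """Return sorted indices of instances kept after LSH-based reduction.

    Illustrative only: keeps the first instance seen for every
    (hash bucket, class label) pair and drops the rest as redundant.
    """
    rng = random.Random(seed)
    dim = len(X[0])
    # Assumed hash family: h(x) = floor((a . x + b) / width),
    # with a drawn from a standard normal and b uniform in [0, width).
    hashes = [
        ([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, width))
        for _ in range(n_hashes)
    ]
    kept = {}
    for i, (x, label) in enumerate(zip(X, y)):
        # Concatenate the individual hash values into one bucket key.
        key = tuple(
            math.floor((sum(a_j * x_j for a_j, x_j in zip(a, x)) + b) / width)
            for a, b in hashes
        )
        # Only the first instance per (bucket, label) is retained.
        kept.setdefault((key, label), i)
    return sorted(kept.values())


if __name__ == "__main__":
    # Toy data: five identical pixels per class stand in for redundant samples.
    X = [[0.0, 0.0]] * 5 + [[10.0, 10.0]] * 5
    y = [0] * 5 + [1] * 5
    print(lsh_reduce(X, y))
```

In this sketch a larger `width` or fewer hash functions makes buckets coarser and removes more instances, while the opposite settings keep more of the original data. The reduced index list would then be used to subset the training set before fitting an SVM.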

Keywords

Support vector machines
Data reduction
Instance selection
Big data
Building extraction
