The linear time complexity of DR.LSH makes it suitable for handling big datasets.
•
DR.LSH is competitive with other state-of-the-art methods in building extraction.
•
DR.LSH can significantly reduce the number of instances and execution time.
Abstract
Training support vector machines (SVMs) for pixel-based feature extraction from aerial images requires selecting representative pixels (instances) as a training dataset. In this research, locality-sensitive hashing (LSH) is adopted to develop a new instance selection method, referred to as DR.LSH. The intuition behind DR.LSH is to rapidly find similar and redundant training samples and exclude them from the original dataset. The simplicity of the method, together with its linear computational complexity, makes it efficient at handling massive training data (millions of pixels). DR.LSH is benchmarked against two recently proposed methods on a building extraction dataset with 23,750,000 samples obtained from the fusion of aerial images and point clouds. The results reveal that DR.LSH outperforms both methods in terms of preservation rate and in maintaining generalization ability (classification loss). The source code of DR.LSH is available at https://github.com/mohaslani/DR.LSH.
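The general idea of removing redundant samples by hashing can be illustrated with a minimal sketch. This is not the authors' released implementation; it assumes a generic random-projection (p-stable) LSH scheme, and the function name `lsh_instance_selection` and the parameters `n_hashes` and `bucket_width` are hypothetical choices made for illustration. Samples of the same class that fall into the same hash bucket are treated as redundant, and only the first one encountered is kept.

```python
import numpy as np


def lsh_instance_selection(X, y, n_hashes=4, bucket_width=1.0, seed=0):
    """Keep one sample per (class, LSH bucket) pair.

    Illustrative sketch only: a generic p-stable LSH, not the DR.LSH code.
    Returns the indices of the retained samples in their original order.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape

    # Hash family h(x) = floor((a . x + b) / w), one column of A per hash.
    A = rng.normal(size=(n_features, n_hashes))
    b = rng.uniform(0.0, bucket_width, size=n_hashes)
    codes = np.floor((X @ A + b) / bucket_width).astype(np.int64)

    seen = set()
    keep = []
    # Single pass over the data: cost grows linearly with n_samples.
    for i in range(n_samples):
        key = (y[i].item(), codes[i].tobytes())
        if key not in seen:
            seen.add(key)
            keep.append(i)
    return np.array(keep, dtype=np.int64)


if __name__ == "__main__":
    # Toy data: two tight clusters with many near-duplicate points.
    rng = np.random.default_rng(1)
    X0 = rng.normal(loc=0.0, scale=0.05, size=(500, 3))
    X1 = rng.normal(loc=5.0, scale=0.05, size=(500, 3))
    X = np.vstack([X0, X1])
    y = np.array([0] * 500 + [1] * 500)

    idx = lsh_instance_selection(X, y, n_hashes=3, bucket_width=0.5)
    print(f"kept {idx.size} of {X.shape[0]} samples")
```

The reduced set `X[idx], y[idx]` would then be passed to the SVM trainer in place of the full dataset. In this sketch, a larger `bucket_width` or fewer hash functions merges more samples into each bucket and therefore removes more instances, at the risk of discarding samples that are informative for the decision boundary.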