Abstract:
The continuous generation of huge amounts of remote sensing (RS) data poses a challenge for researchers because of the 4 Vs that characterize this type of data (volume, variety, velocity, and veracity). Many platforms have been proposed to deal with big data in the RS field. This paper focuses on the comparison of two well-known platforms for big RS data, namely Hadoop and Spark. We start by describing the two platforms. The first, Hadoop, is designed for processing enormous volumes of unstructured data in a distributed computing environment. It is composed of two basic elements: 1) the Hadoop Distributed File System (HDFS) for storage, and 2) MapReduce and YARN for parallel processing, job scheduling, and analysis of big RS data. The second platform, Spark, is composed of a set of libraries and uses the resilient distributed dataset (RDD) to overcome the computational complexity. The last part of this paper is devoted to a comparison between the two platforms.
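As an illustration of the map/reduce processing model mentioned in the abstract, the following is a minimal single-machine sketch in plain Python. It is not taken from the paper and does not use the Hadoop or Spark APIs; the tile data, the land-cover labels, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `count_labels`) are all assumptions made for the example. It only shows the general pattern of emitting key/value pairs, grouping them by key, and aggregating each group.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical input: each "tile" is a list of land-cover labels, one per pixel.
tiles = [
    ["water", "forest", "forest"],
    ["urban", "water"],
    ["forest", "urban", "urban"],
]

def map_phase(tile):
    """Emit a (label, 1) pair for every pixel in one tile."""
    return [(label, 1) for label in tile]

def shuffle(pairs):
    """Group emitted values by key, as a framework would do between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Sum the values collected for each key."""
    return {key: reduce(lambda a, b: a + b, values) for key, values in groups.items()}

def count_labels(tiles):
    """Run map, shuffle, and reduce in sequence over all tiles."""
    pairs = [pair for tile in tiles for pair in map_phase(tile)]
    return reduce_phase(shuffle(pairs))

counts = count_labels(tiles)
print(counts)
```

In a distributed setting, the map step would run in parallel on the nodes holding each block of data and the grouped values would be moved across the network before reduction; here everything runs in one process purely to show the data flow.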
Published in: 2018 4th International Conference on Advanced Technologies for Signal and Image Processing (ATSIP)
Date of Conference: 21-24 March 2018
Date Added to IEEE Xplore: 24 May 2018