Abstract:
Nowadays the amounts of data available to us have the ever larger growth trend. On the other hand such data often contain noise. We call them noisy Big Data. There is an ...Show MoreMetadata
Abstract:
Nowadays the amounts of data available to us have the ever larger growth trend. On the other hand such data often contain noise. We call them noisy Big Data. There is an increasing need for learning methods that can handle such noisy Big Data for classification tasks. In this paper we propose a highly random fuzzy forest algorithm for learning an ensemble of fuzzy decision trees from a big data set contaminated with attribute noise. We also present the distributed version of the proposed learning algorithm implemented in the MapReduce framework. Experiment results have demonstrated that the proposed algorithm is faster and more accurate than the state-of-the-art approach particularly in the presence of attribute noise.
Published in: 2017 13th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD)
Date of Conference: 29-31 July 2017
Date Added to IEEE Xplore: 25 June 2018
ISBN Information: