Abstract:
Recent big data analysis usually involves datasets with features collected from various sources, where each feature may have different importance and the training dataset may not be uniformly sampled. To improve prediction quality on real-world learning problems, we propose a local radial basis function network (RBFN) that handles both nonuniform sampling density and heterogeneous features. Nonuniform sampling is resolved by estimating the local sampling density and adjusting the width of the Gaussian kernels accordingly, and heterogeneous features are handled by scaling each dimension of the feature space asymmetrically. To make the learner aware of inter-feature relationships, we propose a feature importance optimization technique based on the L-BFGS-B algorithm, using the leave-one-out cross-validation mean squared error as the objective function. Leave-one-out cross-validation is normally very time consuming, but the optimization is made practical by the fast cross-validation capability of the local RBFN. Our experiments show that when both nonuniform sampling density and inter-feature relationships are properly handled, a simple RBFN can outperform more complex kernel-based learning models such as the support vector regressor in both mean squared error and training speed.
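
To illustrate the ideas summarized above, the following is a minimal sketch, not the authors' implementation: it combines per-feature asymmetric scaling, density-adaptive Gaussian widths, and L-BFGS-B optimization of the feature scales against a leave-one-out CV error. The k-nearest-neighbour bandwidth rule, the Nadaraya-Watson-style local averaging, and names such as loo_mse and fit_feature_importances are assumptions made for this example, not details taken from the paper.

```python
# Illustrative sketch (assumed design, not the paper's exact model):
# a local RBF predictor whose Gaussian widths adapt to local sampling
# density and whose per-feature scales are tuned by L-BFGS-B using the
# leave-one-out cross-validation MSE as the objective.
import numpy as np
from scipy.optimize import minimize
from scipy.spatial.distance import cdist


def loo_mse(log_scales, X, y, k=10):
    """Leave-one-out MSE of a Nadaraya-Watson style local RBF predictor.

    Excluding each point from its own prediction gives LOOCV almost for
    free, which is what makes it usable inside the optimizer's objective.
    """
    scales = np.exp(log_scales)            # positive per-feature importances
    Xs = X * scales                        # asymmetric feature-space scaling
    d2 = cdist(Xs, Xs, "sqeuclidean")

    # Density-adaptive widths: distance to the k-th nearest neighbour in the
    # scaled space (an assumed heuristic for "local sampling density").
    k_eff = min(k, X.shape[0] - 1)
    widths = np.sqrt(np.sort(d2, axis=1))[:, k_eff] + 1e-12

    K = np.exp(-d2 / (2.0 * widths[None, :] ** 2))  # kernel of each center j
    np.fill_diagonal(K, 0.0)               # leave-one-out: drop self-weight
    pred = (K @ y) / (K.sum(axis=1) + 1e-12)
    return np.mean((pred - y) ** 2)


def fit_feature_importances(X, y, k=10):
    """Optimize log feature scales with L-BFGS-B against the LOOCV MSE."""
    x0 = np.zeros(X.shape[1])              # start from uniform importance
    res = minimize(loo_mse, x0, args=(X, y, k), method="L-BFGS-B")
    return np.exp(res.x)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)  # only feature 0 matters
    print(fit_feature_importances(X, y))   # feature 0 should get the largest scale
```

In this toy run, the learned scale for the informative feature should dominate the scales of the noise features, which is the intended effect of the feature importance optimization described in the abstract.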
Date of Conference: 12-17 July 2015
Date Added to IEEE Xplore: 01 October 2015