Rule based regression and feature selection for biological data


Abstract:

Regression is widely used in biological problems involving continuous outcomes. Methods for building regression models range from linear models to more complex nonlinear ones. Linear regression techniques can identify linear correlations between input and output, but in many practical applications the relations are nonlinear. Nonlinear regression techniques can model these relations effectively; however, many of the resulting models have limited interpretability, which is crucial in many biological problems. We propose a rule based regression algorithm that uses 1-norm regularized random forests. The proposed approach simultaneously extracts a small number of rules from the generated random forests and eliminates unimportant features, and hence provides a simple interpretation. We tested the approach on a seacoast chemical sensors dataset, a Stockori flowering time dataset, and three datasets from the UCI repository. The proposed approach constructs a significantly smaller set of regression rules using a subset of attributes while achieving prediction performance comparable to that of conventional random forests regression. It demonstrates high potential, in terms of both prediction performance and ease of interpretation, for studying nonlinear relationships in the subjects under study.
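
The idea of pruning a random forest down to a few rules with a 1-norm penalty can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn, treats every tree node as one candidate rule (the conjunction of split conditions on the path from the root), and uses an ordinary Lasso fit over the rule indicators as a stand-in for the 1-norm regularization described in the abstract. The function names (`path_features`, `fit_rule_model`, `predict_rule_model`, `kept_rules_and_features`) and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso


def path_features(tree):
    """Map each node id to the set of features tested on the path from the root."""
    t = tree.tree_
    feats = {0: frozenset()}
    stack = [0]
    while stack:
        node = stack.pop()
        left, right = int(t.children_left[node]), int(t.children_right[node])
        if left != -1:  # -1 marks a leaf in scikit-learn trees
            path = feats[node] | {int(t.feature[node])}
            feats[left] = feats[right] = path
            stack.extend([left, right])
    return feats


def fit_rule_model(X, y, n_trees=50, max_depth=3, alpha=0.05, seed=0):
    """Grow a forest, then fit a 1-norm penalized linear model over its node rules."""
    forest = RandomForestRegressor(
        n_estimators=n_trees, max_depth=max_depth, random_state=seed
    ).fit(X, y)
    # One 0/1 column per tree node: 1 if the sample satisfies that node's rule.
    rules, _ = forest.decision_path(X)
    lasso = Lasso(alpha=alpha, max_iter=10000).fit(rules.toarray(), y)
    return forest, lasso


def predict_rule_model(forest, lasso, X):
    """Predict from the weighted sum of the rules each sample satisfies."""
    rules, _ = forest.decision_path(X)
    return lasso.predict(rules.toarray())


def kept_rules_and_features(forest, lasso):
    """Return indices of rules with nonzero weight and the features they test."""
    feats = []
    for est in forest.estimators_:
        pf = path_features(est)
        feats.extend(pf[n] for n in range(est.tree_.node_count))
    kept = np.flatnonzero(lasso.coef_)
    used = set().union(*(feats[j] for j in kept))
    return kept, used
```

In this sketch, rules whose Lasso coefficient is exactly zero are discarded, and a feature counts as eliminated when no surviving rule tests it. Raising the assumed `alpha` trades prediction accuracy for a smaller rule set.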
Date of Conference: 18-21 December 2013
Date Added to IEEE Xplore: 06 February 2014
Electronic ISBN: 978-1-4799-1309-1
Conference Location: Shanghai, China
