Abstract
In this paper we propose a novel support-vector-based soft computing technique for regression problems. The proposed hybrid outperforms previously reported techniques in terms of prediction accuracy and training time. We also present a comparative study of quantile regression, differential evolution trained wavelet neural networks (DEWNN), and quantile regression random forest ensemble models on regression problems. We further identify the intervals of random forest parameter values over which the performance of the Quantile Regression Random Forest (QRFF) is statistically stable. The effectiveness of QRFF relative to quantile regression and DEWNN is evaluated on the Auto MPG, Body Fat, Boston Housing, Forest Fires, and Pollution datasets using 10-fold cross-validation.
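As a rough illustration of the evaluation protocol named in the abstract, the sketch below combines the standard check (pinball) loss of quantile regression with a plain 10-fold split. It is not the authors' code: the abstract does not state which error measure was used, and the function names, the constant-quantile baseline, and the toy targets are all assumptions made for this example.

```python
import math
from typing import List, Sequence, Tuple


def pinball_loss(y_true: Sequence[float], y_pred: Sequence[float], tau: float) -> float:
    """Mean check (pinball) loss at quantile level tau, with 0 < tau < 1."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction is weighted by tau, over-prediction by (1 - tau).
        total += max(tau * diff, (tau - 1.0) * diff)
    return total / len(y_true)


def k_fold_indices(n: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split range(n) into k contiguous folds (no shuffling).

    Returns a list of (train_indices, test_indices) pairs.
    """
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= n")
    base, extra = divmod(n, k)
    folds = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # first `extra` folds get one more item
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        folds.append((train, test))
        start += size
    return folds


def empirical_quantile(values: Sequence[float], tau: float) -> float:
    """Empirical tau-quantile taken as an order statistic (hypothetical baseline)."""
    ordered = sorted(values)
    idx = min(len(ordered) - 1, max(0, math.ceil(tau * len(ordered)) - 1))
    return ordered[idx]


def cross_validated_pinball(y: Sequence[float], tau: float = 0.5, k: int = 10) -> float:
    """Average test-fold pinball loss of a constant-quantile predictor."""
    losses = []
    for train, test in k_fold_indices(len(y), k):
        q = empirical_quantile([y[i] for i in train], tau)
        losses.append(pinball_loss([y[i] for i in test], [q] * len(test), tau))
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Hypothetical toy targets; the paper's datasets are not reproduced here.
    targets = [float((3 * i) % 11) for i in range(25)]
    print(cross_validated_pinball(targets, tau=0.5, k=10))
```

In a real comparison, the constant-quantile baseline would be replaced by the fitted model under study, and the rows would normally be shuffled before the folds are formed.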
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Vadlamani, R., Sharma, A. (2014). Support Vector–Quantile Regression Random Forest Hybrid for Regression Problems. In: Murty, M.N., He, X., Chillarige, R.R., Weng, P. (eds) Multi-disciplinary Trends in Artificial Intelligence. MIWAI 2014. Lecture Notes in Computer Science, vol 8875. Springer, Cham. https://doi.org/10.1007/978-3-319-13365-2_14
DOI: https://doi.org/10.1007/978-3-319-13365-2_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13364-5
Online ISBN: 978-3-319-13365-2
eBook Packages: Computer Science, Computer Science (R0)