Abstract
We introduce “Polyhedral Support Vector Regression” (PSVR), a regression model for data represented by arbitrary convex polyhedral sets. PSVR is derived as a generalization of support vector regression, in which the data is represented by individual points along input variables \(X_1\), \(X_2\), \(\ldots \), \(X_p\) and output variable Y, and extends a support vector classification model previously introduced for polyhedral data. PSVR is in essence a robust-optimization model, which defines prediction error as the largest deviation, calculated along Y, between an interpolating hyperplane and all points within a convex polyhedron; the model relies on the affine Farkas’ lemma to make this definition computationally tractable within the formulation. As an application, we consider the problem of regression with missing data, where we use convex polyhedra to model the multivariate uncertainty involving the unobserved values in a data set. For this purpose, we discuss a novel technique that builds on multiple imputation and principal component analysis to estimate convex polyhedra from missing data, and on a geometric characterization of such polyhedra to define observation-specific hyper-parameters in the PSVR model. We show that an appropriate calibration of such hyper-parameters can have a significantly beneficial impact on the model’s performance. Experiments on both synthetic and real-world data illustrate how PSVR performs competitively with, or better than, other benchmark methods, especially on data sets with a high degree of missingness.
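The worst-case error described above can be made concrete with a small sketch. The following code (an illustration in generic notation, not the paper’s exact formulation) computes the largest deviation along Y between a hyperplane \(y = w^{\top}x + c\) and a convex polyhedron \(\{z : Az \le b\}\) in joint \((x, y)\) space, by solving one linear program per sign of the deviation:

```python
# Illustrative sketch (assumed notation, not the paper's formulation):
# the worst-case deviation max_{z in P} |y - w.x - c| over a polyhedron
# P = {z = (x, y) : A z <= b} decomposes into two linear programs,
# one for each sign of the deviation.
import numpy as np
from scipy.optimize import linprog

def worst_case_deviation(A, b, w, c):
    """Largest |y - (w.x + c)| over the polyhedron {z : A z <= b}."""
    # Coefficients of the linear function y - w.x in the variable z = (x, y).
    obj = np.concatenate([-np.asarray(w, dtype=float), [1.0]])
    devs = []
    for sign in (1.0, -1.0):
        # linprog minimizes, so maximize sign*(y - w.x) by negating it;
        # variables are free, hence the explicit unbounded bounds.
        res = linprog(-sign * obj, A_ub=A, b_ub=b,
                      bounds=[(None, None)] * len(obj))
        assert res.success
        devs.append(-res.fun - sign * c)  # max of sign*(y - w.x - c)
    return max(devs)

# Unit box in (x, y): -1 <= x <= 1, -1 <= y <= 1; hyperplane y = 0.5*x.
A = np.array([[1., 0.], [-1., 0.], [0., 1.], [0., -1.]])
b = np.ones(4)
print(worst_case_deviation(A, b, w=[0.5], c=0.0))  # largest deviation over the box
```

For the unit box and hyperplane above, the worst case is attained at the corner \((x, y) = (-1, 1)\), giving a deviation of 1.5.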
References
Abdi, H., & Williams, L. J. (2010). Principal component analysis. Wiley Interdisciplinary Reviews: Computational Statistics, 2(4), 433–459.
Ben-Tal, A., & Nemirovski, A. (2002). Robust optimization-methodology and applications. Mathematical Programming, 92(3), 453–480.
Bertsimas, D., Brown, D. B., & Caramanis, C. (2011). Theory and applications of robust optimization. SIAM Review, 53(3), 464–501.
Boyd, S., & Vandenberghe, L. (2004). Convex optimization. Cambridge: Cambridge University Press.
Breiman, L., & Friedman, J. H. (1985). Estimating optimal transformations for multiple regression and correlation. Journal of the American Statistical Association, 80(391), 580–598.
Buuren, S. V., & Groothuis-Oudshoorn, K. (2010). Mice: Multivariate imputation by chained equations in \(R\). Journal of Statistical Software, 45(3), 1–68.
Carrizosa, E., & Gordillo, J. (2008). Kernel support vector regression with imprecise output. Tech. Rep., Dept. MOSI, Vrije Univ. Brussel, Belgium.
Carrizosa, E., Gordillo, J., & Plastria, F. (2007). Support vector regression for imprecise data. Tech. Rep., Dept. MOSI, Vrije Univ. Brussel, Belgium.
Chang, C. C., & Lin, C. J. (2002). Training nu-support vector regression: Theory and algorithms. Neural Computation, 14(8), 1959–1978.
Chen-Chia, C., Shun-Feng, S., Jin-Tsong, J., & Chih-Ching, H. (2002). Robust support vector regression networks for function approximation with outliers. IEEE Transactions on Neural Networks, 13(6), 1322–1330.
Dimitrov, D., Knauer, C., Kriegel, K., & Rote, G. (2006). On the bounding boxes obtained by principal component analysis. In 22nd European workshop on computational geometry (pp. 193–196).
Drucker, H., Burges, C. J. C., Kaufman, L., Smola, A. J., & Vapnik, V. N. (1997). Support vector regression machines. Advances in Neural Information Processing Systems, 9, 155–161.
Dua, D., & Graff, C. (2020). UCI machine learning repository. Irvine: School of Information and Computer Sciences, University of California. http://archive.ics.uci.edu/ml.
Fan, N., Sadeghi, E., & Pardalos, P. M. (2014). Robust support vector machines with polyhedral uncertainty of the input data. In P. Pardalos, M. Resende, C. Vogiatzis, & J. Walteros (Eds.), Learning and intelligent optimization. LION 2014. Lecture notes in computer science (Vol. 8426, pp. 291–305). Cham: Springer.
Golub, G. H., & Van Loan, C. F. (2012). Matrix computations. Baltimore, London: JHU Press.
Harrison, D., Jr., & Rubinfeld, D. L. (1978). Hedonic housing prices and the demand for clean air. Journal of Environmental Economics and Management, 5(1), 81–102.
Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning. Data mining, inference, and prediction. New York: Springer.
Hong, D. H., & Hwang, C. (2003). Support vector fuzzy regression machines. Fuzzy Sets and Systems, 138(2), 271–281.
Hong, D. H., & Hwang, C. (2004). Extended fuzzy regression models using regularization method. Information Sciences, 164(1–4), 31–46.
Hotelling, H. (1933). Analysis of a complex of statistical variables into principal components. Journal of Educational Psychology, 24, 417–441.
Huang, G., Song, S., Wu, C., & You, K. (2012). Robust support vector regression for uncertain input and output data. IEEE Transactions on Neural Networks and Learning Systems, 23(11), 1690–1700.
Hwang, S., Kim, D., Jeong, M. K., & Yum, B.-J. (2015). Robust kernel-based regression with bounded influence for outliers. Journal of the Operational Research Society, 66(8), 1385–1398.
Jolliffe, I. T. (2002). Principal component analysis (2nd ed.). New York: Springer.
Kim, D., Lee, C., Hwang, S., & Jeong, M. K. (2016). A robust support vector regression with a linear-log concave loss function. Journal of the Operational Research Society, 67(5), 735–742.
Kim, J.-H. (2009). Estimating classification error rate: Repeated cross-validation, repeated hold-out and bootstrap. Computational Statistics and Data Analysis, 53(11), 3735–3745.
Kohavi, R. (1995). A study of cross-validation and bootstrap for accuracy estimation and model selection. In International joint conference on artificial intelligence (Vol. 2, pp. 1137–1145).
Lee, K., Kim, N., & Jeong, M. K. (2014). The sparse signomial classification and regression model. Annals of Operations Research, 216(1), 257–286.
Lima, C. A. M., Coelho, A. L. V., & Von Zuben, F. J. (2002). Ensembles of support vector machines for regression problems. In Proceedings of the 2002 international joint conference on neural networks IJCNN’02 (Vol. 3, pp. 2381–2386).
Little, R. (1988). Missing-data adjustments in large surveys. Journal of Business and Economic Statistics, 6(3), 287–296.
Little, R. J. A., & Rubin, D. B. (2014). Statistical analysis with missing data. New York: Wiley.
Mangasarian, O., Shavlik, J., & Wild, E. (2004). Knowledge-based kernel approximation. The Journal of Machine Learning Research, 5, 1127–1141.
Mangasarian, O., & Wild, E. (2007). Nonlinear knowledge in kernel approximation. IEEE Transactions on Neural Networks, 18(1), 300–306.
Martín-Guerrero, J. D., Camps-Valls, G., Soria-Olivas, E., Serrano-López, A. J., Pérez-Ruixo, J. J., & Jiménez-Torres, N. V. (2003). Dosage individualization of erythropoietin using a profile-dependent support vector regression. IEEE Transactions on Biomedical Engineering, 50(10), 1136–1142.
Myasnikova, E., Samsonova, A., Samsonova, M., & Reinitz, J. (2002). Support vector regression applied to the determination of the developmental age of a Drosophila embryo from its segmentation gene expression patterns. Bioinformatics, 18(s1), 87–95.
Panagopoulos, O. P., Xanthopoulos, P., Razzaghi, T., & Şeref, O. (2018). Relaxed support vector regression. Annals of Operations Research, 276(1–2), 191–210.
Park, J. I., Kim, N., Jeong, M. K., & Shin, K. S. (2013). Multiphase support vector regression for function approximation with break-points. Journal of the Operational Research Society, 64(5), 775–785.
Pearson, K. (1901). On lines and planes of closest fit to systems of points in space. Philosophical Magazine, 2, 559–572.
Raghunathan, T. E., Lepkowski, J. M., & Van Hoewyk, J. (2001). A multivariate technique for multiply imputing missing values using a sequence of regression models. Survey Methodology, 27(1), 85–95.
Rätsch, G., Demiriz, A., & Bennett, K. P. (2002). Sparse regression ensembles in infinite and finite hypothesis spaces. Machine Learning, 48(1–3), 189–218.
Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. New York: Wiley.
Rubin, D. B. (1996). Multiple imputation after 18\(+\) years. Journal of the American Statistical Association, 91(434), 473–518.
Schenker, N., & Taylor, J. M. G. (1996). Partially parametric techniques for multiple imputation. Computational Statistics and Data Analysis, 22(4), 425–446.
Shivaswamy, P., Bhattacharyya, C., & Smola, A. (2006). Second order cone programming approaches for handling missing and uncertain data. The Journal of Machine Learning Research, 7, 1283–1314.
Smola, A. J. (1996). Regression estimation with support vector learning machines. Master’s thesis, Technische Universität München.
Smola, A. J., & Schölkopf, B. (2004). A tutorial on support vector regression. Statistics and Computing, 14(3), 199–222.
Trafalis, T. B., & Alwazzi, S. A. (2007). Support vector regression with noisy data: A second order cone programming approach. International Journal of General Systems, 36(2), 237–250.
Trafalis, T. B., & Gilbert, R. C. (2006). Robust classification and regression using support vector machines. European Journal of Operational Research, 173(3), 893–909.
Van Buuren, S. (2007). Multiple imputation of discrete and continuous data by fully conditional specification. Statistical Methods in Medical Research, 16(3), 219–242.
Van Buuren, S., Boshuizen, H. C., & Knook, D. L. (1999). Multiple imputation of missing blood pressure covariates in survival analysis. Statistics in Medicine, 18(6), 681–694.
Van Buuren, S., Brand, J. P. L., Groothuis-Oudshoorn, C. G. M., & Rubin, D. B. (2006). Fully conditional specification in multivariate imputation. Journal of Statistical Computation and Simulation, 76(12), 1049–1064.
Vapnik, V. N. (1995). The nature of statistical learning theory. New York: Springer.
Wang, Y. M., Schultz, R. T., Constable, R. T., & Staib, L. H. (2003). Nonlinear estimation and modeling of fMRI data using spatio-temporal support vector regression. In Biennial international conference on information processing in medical imaging (pp. 647–659).
Wu, C.-H., Wei, C.-C., Su, D.-C., Chang, M.-H., & Ho, J.-M. (2004). Travel time prediction with support vector regression. IEEE Transactions on Intelligent Transportation Systems, 5(4), 276–281.
Yang, H., Chan, L., & King, I. (2002). Support vector machine regression for volatile stock market prediction. In International conference on intelligent data engineering and automated learning (pp. 391–396).
Appendix
Proof of Theorem 1
In order for the two \(\max \) constraints in (6) to be feasible, the following systems of linear inequalities must both be infeasible:
By the affine Farkas’ lemma (Boyd and Vandenberghe 2004), the alternative systems (19) and (20) must therefore both be feasible:
The feasibility of (19) and (20), together with the non-emptiness of \(P_i\), implies the following:
Now, since both
and
contradict (21), neither can hold true, which leads to formulation (7). \(\square \)
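For reference, the affine (nonhomogeneous) form of Farkas’ lemma invoked above can be stated as follows, in generic notation (the symbols \(A\), \(b\), \(c\), \(d\) here are not those of the paper’s formulation): for a nonempty polyhedron \(\{x : Ax \le b\}\),

```latex
% Affine Farkas' lemma: a linear inequality is valid on a nonempty
% polyhedron iff it is a nonnegative combination of the defining inequalities.
\[
  c^{\top}x \le d \ \text{ for all } x \text{ with } Ax \le b
  \quad\Longleftrightarrow\quad
  \exists\, y \ge 0 :\; A^{\top}y = c,\;\; b^{\top}y \le d.
\]
```

Equivalently, the system \(\{Ax \le b,\; c^{\top}x > d\}\) is infeasible exactly when the dual system on the right is feasible, which is the alternative-system argument used in the proof.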
Gazzola, G., Jeong, M.K. Support vector regression for polyhedral and missing data. Ann Oper Res 303, 483–506 (2021). https://doi.org/10.1007/s10479-020-03799-y