Support vector regression for polyhedral and missing data

S.I.: Data Mining and Decision Analytics

Abstract

We introduce “Polyhedral Support Vector Regression” (PSVR), a regression model for data represented by arbitrary convex polyhedral sets. PSVR is derived as a generalization of support vector regression, in which the data is represented by individual points along input variables \(X_1\), \(X_2\), \(\ldots \), \(X_p\) and output variable \(Y\), and extends a support vector classification model previously introduced for polyhedral data. PSVR is in essence a robust-optimization model, which defines prediction error as the largest deviation, calculated along \(Y\), between an interpolating hyperplane and all points within a convex polyhedron; the model relies on the affine Farkas’ lemma to make this definition computationally tractable within the formulation. As an application, we consider the problem of regression with missing data, where we use convex polyhedra to model the multivariate uncertainty involving the unobserved values in a data set. For this purpose, we discuss a novel technique that builds on multiple imputation and principal component analysis to estimate convex polyhedra from missing data, and on a geometric characterization of such polyhedra to define observation-specific hyper-parameters in the PSVR model. We show that an appropriate calibration of such hyper-parameters can have a significantly beneficial impact on the model’s performance. Experiments on both synthetic and real-world data illustrate how PSVR performs competitively with, or better than, other benchmark methods, especially on data sets with a high degree of missingness.
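
The polyhedral construction from multiply-imputed observations can be illustrated with a minimal sketch, not the paper's exact estimation procedure: given several imputed copies of one observation, a PCA-aligned bounding box of those copies yields a convex polyhedron \(\{\varvec{z} : \varvec{A}_i \varvec{z} \le \varvec{a}_i\}\). The function name, the use of NumPy's SVD, and the number of imputations below are illustrative assumptions.

```python
import numpy as np

def pca_bounding_polyhedron(imputed_copies):
    """Build {z : A z <= a} as the PCA-aligned bounding box of the
    multiply-imputed copies of one observation (rows = copies of the
    vector (x_1, ..., x_p, y)).  Illustrative sketch only."""
    mu = imputed_copies.mean(axis=0)
    centered = imputed_copies - mu
    # Principal directions of the imputed copies (rows of Vt are orthonormal axes).
    _, _, Vt = np.linalg.svd(centered, full_matrices=True)
    scores = centered @ Vt.T                      # coordinates along each principal axis
    lo, hi = scores.min(axis=0), scores.max(axis=0)
    # Bounding box in PCA coordinates, lo <= Vt (z - mu) <= hi, rewritten as A z <= a.
    A = np.vstack([Vt, -Vt])
    a = np.concatenate([hi + Vt @ mu, -(lo + Vt @ mu)])
    return A, a

# Example: five imputations of one observation with p = 2 inputs and output y.
rng = np.random.default_rng(0)
copies = rng.normal(size=(5, 3))
A_i, a_i = pca_bounding_polyhedron(copies)
assert np.all(A_i @ copies.T <= a_i[:, None] + 1e-9)  # every imputed copy lies in P_i
```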

References

  • Abdi, H., & Williams, L. J. (2010). Principal component analysis. Wiley Interdisciplinary Reviews: Computational Statistics, 2(4), 433–459.

  • Ben-Tal, A., & Nemirovski, A. (2002). Robust optimization-methodology and applications. Mathematical Programming, 92(3), 453–480.

  • Bertsimas, D., Brown, D. B., & Caramanis, C. (2011). Theory and applications of robust optimization. SIAM Review, 53(3), 464–501.

  • Boyd, S., & Vandenberghe, L. (2004). Convex optimization. Cambridge: Cambridge University Press.

  • Breiman, L., & Friedman, J. H. (1985). Estimating optimal transformations for multiple regression and correlation. Journal of the American Statistical Association, 80(391), 580–598.

  • Buuren, S. V., & Groothuis-Oudshoorn, K. (2010). mice: Multivariate imputation by chained equations in R. Journal of Statistical Software, 45(3), 1–68.

  • Carrizosa, E., & Gordillo, J. (2008). Kernel support vector regression with imprecise output. Tech. Rep., Dept. MOSI, Vrije Univ. Brussel, Belgium.

  • Carrizosa, E., Gordillo, J., & Plastria, F. (2007). Support vector regression for imprecise data. Tech. Rep., Dept. MOSI, Vrije Univ. Brussel, Belgium.

  • Chang, C. C., & Lin, C. J. (2002). Training nu-support vector regression: Theory and algorithms. Neural Computation, 14(8), 1959–1978.

  • Chen-Chia, C., Shun-Feng, S., Jin-Tsong, J., & Chih-Ching, H. (2002). Robust support vector regression networks for function approximation with outliers. IEEE Transactions on Neural Networks, 13(6), 1322–1330.

  • Dimitrov, D., Knauer, C., Kriegel, K., & Rote, G. (2006). On the bounding boxes obtained by principal component analysis. In 22nd European Workshop on Computational Geometry (pp. 193–196).

  • Drucker, H., Burges, C. J. C., Kaufman, L., Smola, A. J., & Vapnik, V. N. (1997). Support vector regression machines. Advances in Neural Information Processing Systems, 9, 155–161.

  • Dua, D., & Graff, C. (2020). UCI machine learning repository. Irvine: School of Information and Computer Sciences, University of California. http://archive.ics.uci.edu/ml.

  • Fan, N., Sadeghi, E., & Pardalos, P. M. (2014). Robust support vector machines with polyhedral uncertainty of the input data. In P. Pardalos, M. Resende, C. Vogiatzis, & J. Walteros (Eds.), Learning and intelligent optimization. LION 2014. Lecture notes in computer science (Vol. 8426, pp. 291–305). Cham: Springer.

  • Golub, G. H., & Van Loan, C. F. (2012). Matrix computations. Baltimore, London: JHU Press.

  • Harrison, D., Jr., & Rubinfeld, D. L. (1978). Hedonic housing prices and the demand for clean air. Journal of Environmental Economics and Management, 5(1), 81–102.

  • Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning. Data mining, inference, and prediction. New York: Springer.

  • Hong, D. H., & Hwang, C. (2003). Support vector fuzzy regression machines. Fuzzy Sets and Systems, 138(2), 271–281.

  • Hong, D. H., & Hwang, C. (2004). Extended fuzzy regression models using regularization method. Information Sciences, 164(1–4), 31–46.

  • Hotelling, H. (1933). Analysis of a complex of statistical variables into principal components. Journal of Educational Psychology, 24, 417–441.

  • Huang, G., Song, S., Wu, C., & You, K. (2012). Robust support vector regression for uncertain input and output data. IEEE Transactions on Neural Networks and Learning Systems, 23(11), 1690–1700.

  • Hwang, S., Kim, D., Jeong, M. K., & Yum, B.-J. (2015). Robust kernel-based regression with bounded influence for outliers. Journal of the Operational Research Society, 66(8), 1385–1398.

  • Jolliffe, I. T. (2002). Principal component analysis (2nd ed.). New York: Springer.

  • Kim, D., Lee, C., Hwang, S., & Jeong, M. K. (2016). A robust support vector regression with a linear-log concave loss function. Journal of the Operational Research Society, 67(5), 735–742.

  • Kim, J.-H. (2009). Estimating classification error rate: Repeated cross-validation, repeated hold-out and bootstrap. Computational Statistics and Data Analysis, 53(11), 3735–3745.

  • Kohavi, R. (1995). A study of cross-validation and bootstrap for accuracy estimation and model selection. In Proceedings of the International Joint Conference on Artificial Intelligence (Vol. 2, pp. 1137–1145).

  • Lee, K., Kim, N., & Jeong, M. K. (2014). The sparse signomial classification and regression model. Annals of Operations Research, 216(1), 257–286.

  • Lima, C. A. M., Coelho, A. L. V., & Von Zuben, F. J. (2002). Ensembles of support vector machines for regression problems. In Proceedings of the 2002 International Joint Conference on Neural Networks (IJCNN’02) (Vol. 3, pp. 2381–2386).

  • Little, R. (1988). Missing-data adjustments in large surveys. Journal of Business and Economic Statistics, 6(3), 287–296.

  • Little, R. J. A., & Rubin, D. B. (2014). Statistical analysis with missing data. New York: Wiley.

  • Mangasarian, O., Shavlik, J., & Wild, E. (2004). Knowledge-based kernel approximation. The Journal of Machine Learning Research, 5, 1127–1141.

  • Mangasarian, O., & Wild, E. (2007). Nonlinear knowledge in kernel approximation. IEEE Transactions on Neural Networks, 18(1), 300–306.

  • Martín-Guerrero, J. D., Camps-Valls, G., Soria-Olivas, E., Serrano-López, A. J., Pérez-Ruixo, J. J., & Jiménez-Torres, N. V. (2003). Dosage individualization of erythropoietin using a profile-dependent support vector regression. IEEE Transactions on Biomedical Engineering, 50(10), 1136–1142.

  • Myasnikova, E., Samsonova, A., Samsonova, M., & Reinitz, J. (2002). Support vector regression applied to the determination of the developmental age of a Drosophila embryo from its segmentation gene expression patterns. Bioinformatics, 18(s1), 87–95.

  • Panagopoulos, O. P., Xanthopoulos, P., Razzaghi, T., & Şeref, O. (2018). Relaxed support vector regression. Annals of Operations Research, 276(1–2), 191–210.

  • Park, J. I., Kim, N., Jeong, M. K., & Shin, K. S. (2013). Multiphase support vector regression for function approximation with break-points. Journal of the Operational Research Society, 64(5), 775–785.

  • Pearson, K. (1901). On lines and planes of closest fit to systems of points in space. Philosophical Magazine, 2, 559–572.

  • Raghunathan, T. E., Lepkowski, J. M., & Van Hoewyk, J. (2001). A multivariate technique for multiply imputing missing values using a sequence of regression models. Survey Methodology, 27(1), 85–95.

  • Rätsch, G., Demiriz, A., & Bennett, K. P. (2002). Sparse regression ensembles in infinite and finite hypothesis spaces. Machine Learning, 48(1–3), 189–218.

  • Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. New York: Wiley.

  • Rubin, D. B. (1996). Multiple imputation after 18+ years. Journal of the American Statistical Association, 91(434), 473–518.

  • Schenker, N., & Taylor, J. M. G. (1996). Partially parametric techniques for multiple imputation. Computational Statistics and Data Analysis, 22(4), 425–446.

  • Shivaswamy, P., Bhattacharyya, C., & Smola, A. (2006). Second order cone programming approaches for handling missing and uncertain data. The Journal of Machine Learning Research, 7, 1283–1314.

  • Smola, A. J. (1996). Regression estimation with support vector learning machines. Master’s thesis, Technische Universität München.

  • Smola, A. J., & Schölkopf, B. (2004). A tutorial on support vector regression. Statistics and Computing, 14(3), 199–222.

  • Trafalis, T. B., & Alwazzi, S. A. (2007). Support vector regression with noisy data: A second order cone programming approach. International Journal of General Systems, 36(2), 237–250.

  • Trafalis, T. B., & Gilbert, R. C. (2006). Robust classification and regression using support vector machines. European Journal of Operational Research, 173(3), 893–909.

  • Van Buuren, S. (2007). Multiple imputation of discrete and continuous data by fully conditional specification. Statistical Methods in Medical Research, 16(3), 219–242.

  • Van Buuren, S., Boshuizen, H. C., & Knook, D. L. (1999). Multiple imputation of missing blood pressure covariates in survival analysis. Statistics in Medicine, 18(6), 681–694.

  • Van Buuren, S., Brand, J. P. L., Groothuis-Oudshoorn, C. G. M., & Rubin, D. B. (2006). Fully conditional specification in multivariate imputation. Journal of Statistical Computation and Simulation, 76(12), 1049–1064.

  • Vapnik, V. N. (1995). The nature of statistical learning theory. New York: Springer.

  • Wang, Y. M., Schultz, R. T., Constable, R. T., & Staib, L. H. (2003). Nonlinear estimation and modeling of fMRI data using spatio-temporal support vector regression. In Biennial International Conference on Information Processing in Medical Imaging (pp. 647–659).

  • Wu, C.-H., Wei, C.-C., Su, D.-C., Chang, M.-H., & Ho, J.-M. (2004). Travel time prediction with support vector regression. IEEE Transactions on Intelligent Transportation Systems, 5(4), 276–281.

  • Yang, H., Chan, L., & King, I. (2002). Support vector machine regression for volatile stock market prediction. In International Conference on Intelligent Data Engineering and Automated Learning (pp. 391–396).

Author information

Corresponding author

Correspondence to Myong K. Jeong.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Proof of Theorem 1

For the two \(\max \) constraints in (6) to be satisfied, the following systems of linear inequalities must both be infeasible:

$$\begin{aligned} \left\{ \begin{array}{ll} y-\varvec{w}^T\varvec{x} - w_0 > \xi _i+\epsilon _i\\ \varvec{A}_i \begin{pmatrix} \varvec{x} \\ y \end{pmatrix} \le \varvec{a}_i \end{array} \right. \end{aligned}$$
(17)
$$\begin{aligned} \left\{ \begin{array}{ll} \varvec{w}^T\varvec{x} + w_0 - y > \xi _i+\epsilon _i\\ \varvec{A}_i \begin{pmatrix} \varvec{x} \\ y \end{pmatrix} \le \varvec{a}_i \end{array} \right. \end{aligned}$$
(18)
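
For reference, the affine Farkas’ lemma invoked below can be stated in the following standard form (see, e.g., Boyd and Vandenberghe 2004): for a matrix \(\varvec{A}\), vectors \(\varvec{a}\), \(\varvec{c}\), and a scalar \(d\),

$$\begin{aligned} \left\{ \begin{array}{ll} \varvec{c}^T\varvec{z} > d\\ \varvec{A}\varvec{z} \le \varvec{a} \end{array} \right. \text{ is infeasible } \iff \ \exists \, \varvec{u} \ge \varvec{0}: \left\{ \begin{array}{ll} \varvec{A}^T \varvec{u} = \varvec{c}\\ \varvec{a}^T \varvec{u} \le d \end{array} \right. \text{ or } \left\{ \begin{array}{ll} \varvec{A}^T \varvec{u} = \varvec{0}\\ \varvec{a}^T \varvec{u} < 0 \end{array} \right. \end{aligned}$$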

By the affine Farkas’ lemma (Boyd and Vandenberghe 2004), the alternative systems (19) and (20) must therefore both be feasible:

$$\begin{aligned} \begin{aligned}&\varvec{u}_i \ge \varvec{0} \quad \text{ and } \left\{ \begin{array}{ll} \varvec{A}_i^T \varvec{u}_i = \begin{pmatrix} -\varvec{w} \\ 1 \end{pmatrix} \\ \varvec{a}_i^T \varvec{u}_i - w_0 \le \xi _i+\epsilon _i \end{array} \right.&\text{ or }&\left\{ \begin{array}{ll} \varvec{A}_i^T \varvec{u}_i = \varvec{0}\\ \varvec{a}_i^T \varvec{u}_i < 0 \end{array} \right. \end{aligned} \end{aligned}$$
(19)
$$\begin{aligned} \begin{aligned}&\quad \varvec{v}_i \ge \varvec{0} \quad \ \text{ and } \left\{ \begin{array}{ll} \varvec{A}_i^T \varvec{v}_i = \begin{pmatrix} \varvec{w} \\ -1 \end{pmatrix} \\ \varvec{a}_i^T \varvec{v}_i + w_0 \le \xi _i+\epsilon _i \end{array} \right.&\text{ or }&\left\{ \begin{array}{ll} \varvec{A}_i^T \varvec{v}_i = \varvec{0}\\ \varvec{a}_i^T \varvec{v}_i < 0 \end{array} \right. \end{aligned} \end{aligned}$$
(20)

The feasibility of (19) and (20), together with the non-emptiness of \(P_i\), implies the following:

$$\begin{aligned} \left\{ \begin{array}{ll} \varvec{u}_i^T\left( \varvec{A}_i \begin{pmatrix} \varvec{x} \\ y \end{pmatrix} - \varvec{a}_i\right) \le 0\\ \varvec{v}_i^T\left( \varvec{A}_i \begin{pmatrix} \varvec{x} \\ y \end{pmatrix} - \varvec{a}_i\right) \le 0 \end{array} \right. \end{aligned}$$
(21)

Now, since both

$$\begin{aligned} \begin{aligned} \left\{ \begin{array}{ll} \varvec{A}_i^T \varvec{u}_i = \varvec{0}\\ \varvec{a}_i^T \varvec{u}_i < 0 \end{array} \right. \end{aligned} \end{aligned}$$
(22)

and

$$\begin{aligned} \left\{ \begin{array}{ll} \varvec{A}_i^T \varvec{v}_i = \varvec{0}\\ \varvec{a}_i^T \varvec{v}_i < 0 \end{array} \right. \end{aligned}$$
(23)

contradict (21), neither can hold true, which leads to formulation (7). \(\square \)
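
As a quick numerical illustration of the linear-programming duality behind this argument (not part of the paper), the worst-case deviation over a polyhedron and the Farkas/dual certificate used above can be compared with scipy.optimize.linprog. The toy polyhedron, the weights w and \(w_0\), and the variable names below are arbitrary illustrative choices.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)

# Toy polyhedron P_i = {z : A z <= a} in R^3, z = (x_1, x_2, y): the unit box
# plus one random cut (arbitrary, bounded, and non-empty since z = 0 is feasible).
A = np.vstack([np.eye(3), -np.eye(3), rng.normal(size=(1, 3))])
a = np.concatenate([np.ones(3), np.ones(3), [1.0]])

w, w0 = np.array([0.7, -1.3]), 0.2
c = np.concatenate([-w, [1.0]])   # so that c^T z = y - w^T x

# Primal: worst-case deviation  max_{z in P_i} (y - w^T x)  (linprog minimizes, hence -c).
primal = linprog(-c, A_ub=A, b_ub=a, bounds=[(None, None)] * 3, method="highs")

# Dual / Farkas certificate as in the proof:  min a^T u  s.t.  A^T u = c,  u >= 0.
dual = linprog(a, A_eq=A.T, b_eq=c, bounds=[(0, None)] * A.shape[0], method="highs")

# Strong LP duality: the two optimal values coincide, so requiring
# a_i^T u_i - w_0 <= xi_i + eps_i with A_i^T u_i = (-w, 1)^T and u_i >= 0 is
# equivalent to the corresponding max-constraint of (6), as used in formulation (7).
print(-primal.fun - w0, dual.fun - w0)
```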

About this article

Cite this article

Gazzola, G., Jeong, M.K. Support vector regression for polyhedral and missing data. Ann Oper Res 303, 483–506 (2021). https://doi.org/10.1007/s10479-020-03799-y
