Abstract
Online estimation of regression functions becomes important in the presence of drifts and rapid changes in the training data. In this article we propose a new online training algorithm for support vector regression (SVR), called Priona, which is based on the idea of computing approximate solutions to the primal optimization problem. For the solution of the primal SVR problem we investigated the trade-off between computation time and prediction accuracy for the gradient, diagonally scaled gradient, and Newton descent directions. The choice of a particular buffering strategy did not influence the performance of the algorithm. Because it uses a line search, Priona does not require a priori selection of a learning rate, which facilitates its practical application. On various benchmark data sets Priona achieves better prediction accuracy than the Norma and Silk online SVR algorithms. Further, tests on two artificial data sets show that online SVR algorithms are able to track temporal changes and drifts of the regression function, provided the buffer size and learning rate are chosen appropriately.
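The approach described in the abstract can be illustrated with a minimal sketch: keep a fixed-size buffer of recent samples, and after each new sample take a few descent steps on the primal SVR objective, with a backtracking line search replacing a hand-tuned learning rate. This is not the authors' Priona implementation; the class name, the squared ε-insensitive loss (chosen so the objective is differentiable, as in primal SVR formulations such as Chapelle's), the plain gradient direction, and all hyperparameters are illustrative assumptions.

```python
import numpy as np

def rbf(X, Z, gamma=1.0):
    """Gaussian (RBF) kernel matrix between row-sample matrices X and Z."""
    d = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d)

class OnlinePrimalSVR:
    """Sketch of online SVR trained in the primal on a sliding buffer.

    Objective on the buffer: 0.5 * b^T K b + C * sum(max(0, |e_i| - eps)^2),
    with e = y - K b (squared eps-insensitive loss, so it is differentiable).
    """
    def __init__(self, C=10.0, eps=0.1, gamma=1.0, buffer_size=50, n_steps=10):
        self.C, self.eps, self.gamma = C, eps, gamma
        self.B, self.n_steps = buffer_size, n_steps
        self.X = np.empty((0, 0))
        self.y = np.zeros(0)
        self.beta = np.zeros(0)

    def _objective(self, K, y, beta):
        e = y - K @ beta
        slack = np.maximum(0.0, np.abs(e) - self.eps)
        return 0.5 * beta @ K @ beta + self.C * (slack ** 2).sum()

    def partial_fit(self, x, y_new):
        x = np.atleast_2d(x)
        if self.X.size == 0:  # first sample initializes the buffer
            self.X, self.y, self.beta = x, np.array([y_new]), np.zeros(1)
        else:  # append new sample with zero coefficient, drop oldest if full
            self.X = np.vstack([self.X, x])[-self.B:]
            self.y = np.append(self.y, y_new)[-self.B:]
            self.beta = np.append(self.beta, 0.0)[-self.B:]
        K = rbf(self.X, self.X, self.gamma)
        for _ in range(self.n_steps):
            e = self.y - K @ self.beta
            r = np.sign(e) * np.maximum(0.0, np.abs(e) - self.eps)
            grad = K @ self.beta - 2.0 * self.C * (K @ r)
            # backtracking line search: no a priori learning rate needed
            rho, f0 = 1.0, self._objective(K, self.y, self.beta)
            while rho > 1e-8:
                cand = self.beta - rho * grad
                if self._objective(K, self.y, cand) < f0:
                    self.beta = cand
                    break
                rho *= 0.5

    def predict(self, Xq):
        return rbf(np.atleast_2d(Xq), self.X, self.gamma) @ self.beta
```

Swapping the gradient for a diagonally scaled gradient or a Newton direction only changes how `grad` is formed; the buffer handling and line search stay the same, which is the sense in which the descent direction is a tunable trade-off between per-step cost and accuracy.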
Notes
\(\beta(\rho) = \beta + \rho (\bar{\beta} - \beta)\) and \(b(\rho) = b + \rho (\bar{b} - b)\).
References
Schölkopf, B., & Smola, A. J. (2002). Learning with kernels. MIT Press.
Lodhi, H., Saunders, C., Shawe-Taylor, J., Cristianini, N., & Watkins, C. (2002). Text classification using string kernels. The Journal of Machine Learning Research, 2, 419–444.
Gärtner, T., Flach, P., & Wrobel, S. (2003). On graph kernels: Hardness results and efficient alternatives. In LNAI (Vol. 2777, pp. 129–143).
Shpigelman, L., Singer, Y., Paz, R., & Vaadia, E. (2005). Spikernels: Predicting arm movements by embedding population spike rate patterns in inner-product spaces. Neural Computation, 17, 671–690.
Lal, T. N., Schröder, M., Hinterberger, T., Weston, J., Bogdan, M., Birbaumer, N., et al. (2004). Support vector channel selection in BCI. IEEE Transactions on Biomedical Engineering, 51(6), 1003–1010.
Rasch, M. J., Gretton, A., Murayama, Y., Maass, W., & Logothetis, N. K. (2008). Inferring spike trains from local field potentials. Journal of Neurophysiology, 99, 1461–1476.
Brugger, D., Butovas, S., Bogdan, M., Schwarz, C., & Rosenstiel, W. (2008). Direct and inverse solution for a stimulus adaptation problem using SVR. In ESANN proceedings (pp. 397–402). Bruges.
Saunders, C., Gammerman, A., & Vovk, V. (1998). Ridge regression learning algorithm in dual variables. In Proceedings of the 15th international conference on machine learning.
Rojo-Alvarez, J. L., Martínez-Ramón, M., de Prado-Cumplido, M., Artes-Rodriguez, A., & Figueiras-Vidal, A. R. (2004). Support vector method for robust ARMA system identification. IEEE Transactions on Signal Processing, 52(1), 155–164.
Ma, J., Theiler, J., & Perkins, S. (2003). Accurate on-line support vector regression. Neural Computation, 15, 2683–2703.
Cauwenberghs, G., & Poggio, T. (2000). Incremental and decremental support vector machine learning. In NIPS (pp. 409–415).
Kivinen, J., Smola, A. J., & Williamson, R. C. (2004). Online learning with kernels. IEEE Transactions on Signal Processing, 52(8), 2165–2175.
Vishwanathan, S. V. N., Schraudolph, N. N., & Smola, A. J. (2006). Step size adaptation in reproducing kernel Hilbert space. Journal of Machine Learning Research, 7, 1107–1133.
Cheng, L., Vishwanathan, S. V. N., Schuurmans, D., Wang, S., & Caelli, T. (2007). Implicit online learning with kernels. In NIPS (pp. 249–256). MIT Press.
Bordes, A., Ertekin, S., Weston, J., & Bottou, L. (2005). Fast kernel classifiers with online and active learning. Journal of Machine Learning Research, 6, 1579–1619.
Chapelle, O. (2007). Training a support vector machine in the primal. Neural Computation, 19, 1135–1178.
Vapnik, V. N. (1999). The nature of statistical learning theory (2nd ed.). Springer.
Bo, L., Wang, L., & Jiao, L. (2007). Recursive finite Newton algorithm for support vector regression in the primal. Neural Computation, 19, 1082–1096.
Aronszajn, N. (1950). Theory of reproducing kernels. Transactions of the American Mathematical Society, 68(3), 337–404.
Kimeldorf, G. S., & Wahba, G. (1970). A correspondence between Bayesian estimation on stochastic processes and smoothing by splines. The Annals of Mathematical Statistics, 41(2), 495–502.
Bertsekas, D. P. (2003). Nonlinear programming (2nd ed.). Athena Scientific.
Crammer, K., Kandola, J. S., & Singer, Y. (2003). Online classification on a budget. In NIPS.
Weston, J., Bordes, A., & Bottou, L. (2005). Online (and offline) on an even tighter budget. In Proc. of the 10th int. workshop on artificial intelligence and statistics (pp. 413–420).
Csató, L., & Opper, M. (2002). Sparse on-line Gaussian processes. Neural Computation, 14(3), 641–668.
Keerthi, S. S., & DeCoste, D. (2005). A modified finite Newton method for fast solution of large scale linear SVMs. Journal of Machine Learning Research, 6, 341–361.
DiCiccio, T. J., & Efron, B. (1996). Bootstrap confidence intervals. Statistical Science, 11(3), 189–228.
Chapelle, O., Vapnik, V., Bousquet, O., & Mukherjee, S. (2002). Choosing multiple parameters for support vector machines. Machine Learning, 46(1–3), 131–159.
Chang, M. W., & Lin, C. J. (2005). Leave-one-out bounds for support vector regression model selection. Neural Computation, 17, 1188–1222.
Additional information
The first author was supported by the Centre for Integrative Neuroscience, Tübingen, Germany.
Cite this article
Brugger, D., Rosenstiel, W. & Bogdan, M. Online SVR Training by Solving the Primal Optimization Problem. J Sign Process Syst 65, 391–402 (2011). https://doi.org/10.1007/s11265-010-0514-5