Abstract
Uncertainty is inherent in data acquisition and analysis, usually appearing as noise or measurement error, often due to technological constraints. In supervised learning, uncertainty degrades classification accuracy and yields low-quality solutions. It is therefore essential to develop machine learning algorithms that handle imprecise data efficiently. In this paper we study this problem from a robust optimization perspective. We consider a supervised learning algorithm based on generalized eigenvalues and provide a robust counterpart formulation and solution for the case of ellipsoidal uncertainty sets. We demonstrate the performance of the proposed robust scheme on artificial datasets and on benchmark datasets from the University of California, Irvine (UCI) Machine Learning Repository, and we compare the results against a robust implementation of Support Vector Machines.
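To make the phrase "supervised learning algorithm based on generalized eigenvalues" concrete, the following is a minimal sketch of a nominal (non-robust) two-class classifier of this kind, written in the style of the multisurface proximal formulation of Mangasarian and Wild (2006). It is an illustration under stated assumptions, not the formulation used in this paper: the abstract does not give the exact model, and the robust counterpart for ellipsoidal uncertainty sets is not shown. The function names `fit_plane` and `predict`, the regularization parameter `delta`, and the toy data are all hypothetical.

```python
import numpy as np

def fit_plane(A, B, delta=1e-3):
    """Return (w, gamma) for a plane w.x = gamma that lies close to the
    rows of A and far from the rows of B.

    Assumption: the plane minimizes the regularized ratio
    ||A w - e*gamma||^2 / ||B w - e*gamma||^2, which leads to the
    generalized eigenproblem G z = lambda H z with z = [w, gamma].
    """
    GA = np.hstack([A, -np.ones((A.shape[0], 1))])
    GB = np.hstack([B, -np.ones((B.shape[0], 1))])
    n = GA.shape[1]
    # delta*I keeps both matrices positive definite (illustrative choice).
    G = GA.T @ GA + delta * np.eye(n)
    H = GB.T @ GB + delta * np.eye(n)
    # Reduce G z = lambda H z to a symmetric standard eigenproblem
    # using the Cholesky factor H = L L^T and the substitution y = L^T z.
    L = np.linalg.cholesky(H)
    L_inv = np.linalg.inv(L)
    C = L_inv @ G @ L_inv.T
    _, vecs = np.linalg.eigh(C)       # eigenvalues in ascending order
    z = L_inv.T @ vecs[:, 0]          # eigenvector of the smallest eigenvalue
    return z[:-1], z[-1]

def predict(X, planes):
    """Assign each row of X to the class whose plane is nearest."""
    dists = [np.abs(X @ w - g) / np.linalg.norm(w) for w, g in planes]
    return np.argmin(np.stack(dists, axis=1), axis=1)

# Toy data (hypothetical): class 0 lies near y = x, class 1 near y = -x.
t = np.linspace(-3.0, 3.0, 13)
A = np.column_stack([t, t + 0.1 * np.sin(5 * t)])
B = np.column_stack([t, -t + 0.1 * np.cos(5 * t)])

planes = [fit_plane(A, B), fit_plane(B, A)]
labels = predict(np.array([[2.0, 2.0], [2.0, -2.0]]), planes)
```

Each class gets its own plane, and a new point takes the label of the nearest plane. A robust variant would replace each training point with an uncertainty set and optimize for the worst case over that set; that step is deliberately omitted here because the abstract does not specify it.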
Acknowledgements
This project was partially funded by National Science Foundation (NSF) grants and by the Italian Flagship Project Interomics, funded by MIUR and CNR.
Cite this article
Xanthopoulos, P., Guarracino, M.R. & Pardalos, P.M. Robust generalized eigenvalue classifier with ellipsoidal uncertainty. Ann Oper Res 216, 327–342 (2014). https://doi.org/10.1007/s10479-012-1303-2
DOI: https://doi.org/10.1007/s10479-012-1303-2