Abstract
Supervised classification is one of the most powerful techniques for analyzing data when a priori information is available on the class membership of data samples. Because labeling can be both expensive and time-consuming, it is worth investigating semi-supervised algorithms that build classification models by also exploiting unlabeled samples. In this paper we propose LapReGEC, a novel technique that introduces a Laplacian regularization term into a generalized eigenvalue classifier. The resulting models are both accurate and parsimonious in the amount of labeled data they require. We show empirically that the obtained classifier compares well with other techniques while using as little as 5% of labeled points to compute the models.
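To give an intuition for what a Laplacian regularization term measures, the following is a minimal sketch of the generic graph-Laplacian smoothness penalty used in manifold regularization. It is not the LapReGEC formulation itself, which the abstract does not spell out. The helper names (`knn_adjacency`, `graph_laplacian`, `laplacian_penalty`), the binary k-nearest-neighbour weights, and the toy data are all assumptions made for illustration.

```python
import math

def knn_adjacency(points, k):
    """Symmetric 0/1 k-nearest-neighbour adjacency matrix (hypothetical helper)."""
    n = len(points)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Sort the other points by Euclidean distance to point i.
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: math.dist(points[i], points[j]))
        for j in others[:k]:
            # Symmetrise: keep an edge if either point lists the other as a neighbour.
            W[i][j] = W[j][i] = 1.0
    return W

def graph_laplacian(W):
    """Unnormalised Laplacian L = D - W, with D the diagonal degree matrix."""
    n = len(W)
    return [[(sum(W[i]) if i == j else 0.0) - W[i][j] for j in range(n)]
            for i in range(n)]

def laplacian_penalty(L, f):
    """Smoothness penalty f^T L f, equal to 1/2 * sum_ij W_ij (f_i - f_j)^2."""
    n = len(f)
    return sum(f[i] * L[i][j] * f[j] for i in range(n) for j in range(n))

# Two well-separated clusters on a line; no labels are needed to build L.
points = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,), (5.2,)]
L = graph_laplacian(knn_adjacency(points, k=2))

smooth = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]  # constant within each cluster
rough = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]   # changes sign inside each cluster

print(laplacian_penalty(L, smooth), laplacian_penalty(L, rough))
```

On this toy data the penalty is 0 for the assignment that is constant within each cluster and 16 for the one that changes sign between neighbouring points. Adding such a term to a classifier's objective therefore favours decision functions that vary little across densely connected regions, which is how unlabeled samples can contribute information.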
Acknowledgements
Mara Sangiovanni was supported by the Interomics Italian Flagship Project. Mario Guarracino's work was conducted at National Research Institute University Higher School of Economics and was supported by RSF Grant No. 14-41-00039.
Cite this article
Viola, M., Sangiovanni, M., Toraldo, G. et al. Semi-supervised generalized eigenvalues classification. Ann Oper Res 276, 249–266 (2019). https://doi.org/10.1007/s10479-017-2674-1