Abstract
Statistical learning theory (SLT) provides the theoretical basis for many machine learning algorithms (e.g., SVMs and kernel methods). Invariance, a popular type of prior knowledge in pattern analysis, has been widely incorporated into statistical learning algorithms to improve learning performance. Though successful in some applications, existing invariance learning algorithms are task-specific and lack a solid theoretical basis, including consistency guarantees. In this paper, we first formulate the problem of statistical learning with group invariance (group invariance learning for short), which exploits group invariance to provide a unifying framework for existing invariance learning algorithms in pattern analysis. We then introduce the group invariance empirical risk minimization (GIERM) method to solve this problem; it incorporates the group action on the original data into empirical risk minimization (ERM). Finally, we investigate the consistency of the GIERM method in detail. Our theoretical results comprise three theorems, covering the necessary and sufficient conditions for consistency, uniform two-sided convergence, and uniform one-sided convergence of the group invariance learning process based on the GIERM method.
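The abstract does not state the GIERM objective, so the following is only a minimal sketch of one plausible reading, not the paper's definition: the empirical risk is averaged over the orbit of each training point under a finite group, and the hypothesis with the smallest averaged risk is selected. The group (rotations of the plane by multiples of 90 degrees), the 0-1 loss, the two candidate hypotheses, the toy data, and all function names are assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]
Sample = Tuple[Point, int]
GroupElement = Callable[[Point], Point]
Hypothesis = Callable[[Point], int]


def rotate90(p: Point, k: int) -> Point:
    """Rotate a point counter-clockwise by k * 90 degrees."""
    x, y = p
    for _ in range(k % 4):
        x, y = -y, x
    return (x, y)


# Assumed example group: the cyclic group C4 acting on the plane.
C4: List[GroupElement] = [lambda p, k=k: rotate90(p, k) for k in range(4)]
# Trivial group: recovers ordinary empirical risk.
TRIVIAL: List[GroupElement] = [lambda p: p]


def zero_one_loss(y: int, y_hat: int) -> float:
    return 0.0 if y == y_hat else 1.0


def group_empirical_risk(h: Hypothesis, data: Sequence[Sample],
                         group: Sequence[GroupElement]) -> float:
    """Average the loss over every sample and every group element."""
    total = 0.0
    for x, y in data:
        for g in group:
            total += zero_one_loss(y, h(g(x)))
    return total / (len(data) * len(group))


def gierm_sketch(hypotheses: Sequence[Hypothesis], data: Sequence[Sample],
                 group: Sequence[GroupElement]) -> Hypothesis:
    """Pick the hypothesis minimizing the group-averaged empirical risk."""
    return min(hypotheses, key=lambda h: group_empirical_risk(h, data, group))


# Two hypothetical candidates: one rotation-invariant, one not.
def h_radius(p: Point) -> int:
    return 1 if p[0] ** 2 + p[1] ** 2 > 1.0 else 0


def h_halfplane(p: Point) -> int:
    return 1 if p[0] > 0.5 else 0


# Toy data whose labels depend only on the distance from the origin.
data: List[Sample] = [((2.0, 0.0), 1), ((0.5, 0.0), 0), ((1.5, 0.0), 1)]

best = gierm_sketch([h_halfplane, h_radius], data, C4)
```

On this toy data both candidates fit the three original points perfectly, so ordinary ERM cannot distinguish them. Averaging over the orbit penalizes the half-plane rule on the rotated copies, and the sketch selects the rotation-invariant rule.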
Acknowledgements
This work was partially supported by the National Natural Science Foundation of China (NSFC) under Grant No. U1636205.
Cite this article
Xu, W., Huang, D. & Zhou, S. Statistical learning with group invariance: problem, method and consistency. Int. J. Mach. Learn. & Cyber. 10, 1503–1511 (2019). https://doi.org/10.1007/s13042-018-0829-2
DOI: https://doi.org/10.1007/s13042-018-0829-2