Abstract
Random projection is a simple technique that has had a number of applications in algorithm design. In the context of machine learning, it can provide insight into questions such as “why is a learning problem easier if data is separable by a large margin?” and “in what sense is choosing a kernel much like choosing a set of features?” This talk is intended to provide an introduction to random projection and to survey some simple learning algorithms and other applications to learning based on it. I will also discuss how, given a kernel as a black-box function, we can use various forms of random projection to extract an explicit small feature space that captures much of what the kernel is doing. This talk is based in large part on work in [BB05, BBV04] joint with Nina Balcan and Santosh Vempala.
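To ground the two ideas mentioned in the abstract, here is a minimal sketch (assuming NumPy) of (a) a Johnson-Lindenstrauss-style random projection of explicit feature vectors and (b) the "kernel as a black box, turned into an explicit small feature space" idea, where each example is represented by its kernel values on a small sample of unlabeled points. The function names, dimensions, and the RBF-style kernel below are illustrative assumptions, not the exact constructions or notation of [BBV04].

```python
import numpy as np

def random_project(X, d, seed=0):
    """Johnson-Lindenstrauss-style projection: multiply by a random Gaussian
    matrix with entries of variance 1/d.  This keeps expected squared norms
    unchanged, which is what underlies the distance- and margin-preservation
    statements surveyed in the talk."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((X.shape[1], d)) / np.sqrt(d)
    return X @ A

def kernel_features(K, X, unlabeled):
    """Simplified 'black-box kernel -> explicit features' mapping: represent
    each example by its kernel values on a sample of unlabeled points.
    A sketch of the idea only, not the exact mapping analyzed in [BBV04]."""
    return np.array([[K(x, u) for u in unlabeled] for x in X])

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((100, 500))            # 100 points in 500 dimensions
    X_low = random_project(X, d=50)                # projected to 50 dimensions

    # Hypothetical example kernel (RBF with bandwidth tied to the dimension).
    rbf = lambda x, u: np.exp(-np.sum((x - u) ** 2) / x.size)
    U = rng.standard_normal((20, 500))             # 20 "unlabeled" sample points
    F = kernel_features(rbf, X, U)                 # explicit 20-dim feature space
    print(X_low.shape, F.shape)                    # (100, 50) (100, 20)
```

The 1/sqrt(d) scaling makes the expected squared length of a projected vector equal to that of the original, so with d chosen large enough (growing only logarithmically in the number of points) pairwise distances, and hence large margins, are approximately preserved with high probability.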
References
Achlioptas, D.: Database-friendly random projections. Journal of Computer and System Sciences 66(4), 671–687 (2003)
Arriaga, R.I., Vempala, S.: An algorithmic theory of learning, robust concepts and random projection. In: Proceedings of the 40th Annual IEEE Symposium on Foundations of Computer Science, pp. 616–623 (1999)
Balcan, M.-F., Blum, A.: A PAC-style model for learning from labeled and unlabeled data. In: Proceedings of the 18th Annual Conference on Computational Learning Theory (COLT), pp. 111–126 (2005)
Balcan, M.-F., Blum, A.: On a theory of kernels as similarity functions (manuscript, 2006)
Balcan, M.F., Blum, A., Vempala, S.: Kernels as features: On kernels, margins, and low-dimensional mappings. In: Ben-David, S., Case, J., Maruoka, A. (eds.) ALT 2004. LNCS, vol. 3244, pp. 194–205. Springer, Heidelberg (2004), An extended version is available at: http://www.cs.cmu.edu/~avrim/Papers/
Boser, B.E., Guyon, I.M., Vapnik, V.N.: A training algorithm for optimal margin classifiers. In: Proceedings of the Fifth Annual Workshop on Computational Learning Theory (1992)
Ben-Israel, A., Greville, T.N.E.: Generalized Inverses: Theory and Applications. Wiley, New York (1974)
Block, H.D.: The perceptron: A model for brain functioning. Reviews of Modern Physics 34, 123–135 (1962); reprinted in: Anderson and Rosenfeld (eds.) Neurocomputing
Bartlett, P., Shawe-Taylor, J.: Generalization performance of support vector machines and other pattern classifiers. In: Advances in Kernel Methods: Support Vector Learning. MIT Press, Cambridge (1999)
Cortes, C., Vapnik, V.: Support-vector networks. Machine Learning 20(3), 273–297 (1995)
Dasgupta, S.: Experiments with random projection. In: Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence (UAI), pp. 143–151 (2000)
Dasgupta, S., Gupta, A.: An elementary proof of the Johnson-Lindenstrauss Lemma. Random Structures & Algorithms 22(1), 60–65 (2002)
Kushilevitz, E., Ostrovsky, R., Rabani, Y.: Efficient search for approximate nearest neighbor in high dimensional spaces. SIAM J. Computing 30(2), 457–474 (2000)
Fradkin, D., Madigan, D.: Experiments with random projections for machine learning. In: KDD 2003: Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 517–522 (2003)
Freund, Y., Schapire, R.: A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences 55(1), 119–139 (1997)
Freund, Y., Schapire, R.E.: Large margin classification using the Perceptron algorithm. Machine Learning 37(3), 277–296 (1999)
Goel, N., Bebis, G., Nefian, A.: Face recognition experiments with random projection. In: Proceedings SPIE, vol. 5779, pp. 426–437 (2005)
Goemans, M.X., Williamson, D.P.: Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. Journal of the ACM 42(6), 1115–1145 (1995)
Indyk, P., Motwani, R.: Approximate nearest neighbors: towards removing the curse of dimensionality. In: Proceedings of the 30th Annual ACM Symposium on Theory of Computing, pp. 604–613 (1998)
Johnson, W.B., Lindenstrauss, J.: Extensions of Lipschitz mappings into a Hilbert space. In: Conference in Modern Analysis and Probability, pp. 189–206 (1984)
Littlestone, N.: From on-line to batch learning. In: COLT 1989: Proceedings of the 2nd Annual Workshop on Computational Learning Theory, pp. 269–284 (1989)
Müller, K.-R., Mika, S., Rätsch, G., Tsuda, K., Schölkopf, B.: An introduction to kernel-based learning algorithms. IEEE Transactions on Neural Networks 12, 181–201 (2001)
Minsky, M., Papert, S.: Perceptrons: An Introduction to Computational Geometry. The MIT Press, Cambridge (1969)
Novikoff, A.B.J.: On convergence proofs on perceptrons. In: Proceedings of the Symposium on the Mathematical Theory of Automata, vol. XII, pp. 615–622 (1962)
Schapire, R.E.: The strength of weak learnability. Machine Learning 5(2), 197–227 (1990)
Schulman, L.: Clustering for edge-cost minimization. In: Proceedings of the 32nd Annual ACM Symposium on Theory of Computing, pp. 547–555 (2000)
Shawe-Taylor, J., Bartlett, P.L., Williamson, R.C., Anthony, M.: Structural risk minimization over data-dependent hierarchies. IEEE Trans. on Information Theory 44(5), 1926–1940 (1998)
Vapnik, V.N.: Statistical Learning Theory. John Wiley and Sons Inc., New York (1998)
Vempala, S.: Random projection: A new approach to VLSI layout. In: Proceedings of the 39th Annual IEEE Symposium on Foundations of Computer Science, pp. 389–395 (1998)
Vempala, S.: The Random Projection Method. DIMACS Series in Discrete Mathematics and Theoretical Computer Science. American Mathematical Society, Providence (2004)
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Blum, A. (2006). Random Projection, Margins, Kernels, and Feature-Selection. In: Saunders, C., Grobelnik, M., Gunn, S., Shawe-Taylor, J. (eds) Subspace, Latent Structure and Feature Selection. SLSFS 2005. Lecture Notes in Computer Science, vol 3940. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11752790_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-34137-6
Online ISBN: 978-3-540-34138-3