Abstract
We present an improved version of random projections that takes advantage of marginal norms. Using a maximum likelihood estimator (MLE), margin-constrained random projections can improve estimation accuracy considerably. Theoretical properties of this estimator are analyzed in detail.
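The abstract does not state the estimator's form, so the following is only a minimal sketch under stated assumptions: the projection matrix has i.i.d. standard normal entries, the projected vectors are scaled by 1/sqrt(k), and the margins m1 = ||u1||^2 and m2 = ||u2||^2 are known exactly. Under those assumptions each projected coordinate pair is bivariate normal, and setting the derivative of the log-likelihood with respect to the inner product a to zero gives a cubic equation in a. The function names, the root-selection rule (the feasible real root closest to the plain estimate), and the simulation settings are illustrative choices, not taken from the paper.

```python
import numpy as np


def margin_mle_inner_product(v1, v2, m1, m2):
    """Estimate a = <u1, u2> from projections v1, v2 and known margins m1, m2.

    Solves the likelihood equation of the bivariate normal model:
        a^3 - a^2 * p + a * (-m1*m2 + m1*s2 + m2*s1) - m1*m2*p = 0
    with p = <v1, v2>, s1 = ||v1||^2, s2 = ||v2||^2.
    """
    p = float(v1 @ v2)
    s1 = float(v1 @ v1)
    s2 = float(v2 @ v2)
    coeffs = [1.0, -p, -m1 * m2 + m1 * s2 + m2 * s1, -m1 * m2 * p]
    roots = np.roots(coeffs)
    # Keep the (numerically) real roots only.
    real = roots.real[np.abs(roots.imag) <= 1e-9 * (1.0 + np.abs(roots))]
    # Cauchy-Schwarz: the true inner product satisfies |a| <= sqrt(m1 * m2).
    bound = np.sqrt(m1 * m2)
    feasible = real[np.abs(real) <= bound * (1.0 + 1e-9)]
    if feasible.size == 0:
        # Numerical fallback (assumption): clip the plain estimate.
        return float(np.clip(p, -bound, bound))
    # Assumed rule for multiple roots: take the one nearest the plain estimate.
    return float(feasible[np.argmin(np.abs(feasible - p))])


def compare_estimators(dim=200, k=20, trials=200, seed=0):
    """Monte Carlo comparison of the plain and margin-constrained estimators."""
    rng = np.random.default_rng(seed)
    u1 = rng.standard_normal(dim)
    u2 = u1 + 0.5 * rng.standard_normal(dim)  # strongly correlated with u1
    a, m1, m2 = float(u1 @ u2), float(u1 @ u1), float(u2 @ u2)
    err_plain, err_mle = [], []
    for _ in range(trials):
        R = rng.standard_normal((dim, k))  # Gaussian projection matrix
        v1 = R.T @ u1 / np.sqrt(k)
        v2 = R.T @ u2 / np.sqrt(k)
        err_plain.append((float(v1 @ v2) - a) ** 2)
        err_mle.append((margin_mle_inner_product(v1, v2, m1, m2) - a) ** 2)
    return a, m1, m2, float(np.mean(err_plain)), float(np.mean(err_mle))


if __name__ == "__main__":
    a, m1, m2, mse_plain, mse_mle = compare_estimators()
    print(f"true a = {a:.2f}")
    print(f"MSE of plain estimate: {mse_plain:.3f}")
    print(f"MSE of margin MLE:     {mse_mle:.3f}")
```

In this setup the two vectors are highly correlated, which is where knowing the margins should help most. The simulation compares only mean squared errors and does not reproduce any result from the paper.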
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, P., Hastie, T.J., Church, K.W. (2006). Improving Random Projections Using Marginal Information. In: Lugosi, G., Simon, H.U. (eds) Learning Theory. COLT 2006. Lecture Notes in Computer Science, vol 4005. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11776420_46
DOI: https://doi.org/10.1007/11776420_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-35294-5
Online ISBN: 978-3-540-35296-9
eBook Packages: Computer Science