Abstract
Random projection is a dimension reduction technique that can be used to estimate Euclidean distances, inner products, angles [9], or even \(l_p\) distances (for even \(p\)) [10] between pairs of high-dimensional vectors. We extend the work of Li [9] and our prior work [7] to show how marginal information, principal components, and control variates can be combined with the multivariate normal distribution to improve the accuracy of inner product estimates between vectors. We call our method COntrol Variates For Estimation via First Eigenvectors (COVFEFE). We demonstrate the results of COVFEFE on the Arcene and MNIST datasets.
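For orientation, the baseline that the abstract builds on can be sketched as follows. This is a minimal illustration of the ordinary random projection estimate of an inner product, assuming a projection matrix with i.i.d. N(0, 1) entries; it is not the COVFEFE method, and the function names, dimensions, and synthetic vectors are hypothetical choices made only for this example.

```python
import random


def random_projection_matrix(p, k, rng):
    """Return a p x k matrix (list of rows) with i.i.d. N(0, 1) entries."""
    return [[rng.gauss(0.0, 1.0) for _ in range(k)] for _ in range(p)]


def project(x, R):
    """Compute v = x R, mapping a length-p vector to a length-k vector."""
    k = len(R[0])
    return [sum(x_i * row[j] for x_i, row in zip(x, R)) for j in range(k)]


def estimate_inner_product(v1, v2):
    """Average of v1[j] * v2[j] over the k projected coordinates.

    With N(0, 1) entries in R, each product has expectation <x1, x2>,
    so the average is an unbiased estimate of the inner product.
    """
    return sum(a * b for a, b in zip(v1, v2)) / len(v1)


# Synthetic data (an assumption for illustration, not from the paper).
rng = random.Random(0)
p, k = 100, 1000
x1 = [rng.uniform(-1.0, 1.0) for _ in range(p)]
x2 = [a + 0.5 * rng.uniform(-1.0, 1.0) for a in x1]

R = random_projection_matrix(p, k, rng)
true_ip = sum(a * b for a, b in zip(x1, x2))
est_ip = estimate_inner_product(project(x1, R), project(x2, R))

print("true inner product:     ", true_ip)
print("estimated inner product:", est_ip)
```

Here k is set larger than p only to keep the estimate tight; in practice k would be much smaller than p, and the resulting variance is what refinements such as marginal information and control variates aim to reduce.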
References
Anaraki, F.P., Hughes, S.: Memory and computation efficient PCA via very sparse random projections. In: Proceedings of the 31st International Conference on Machine Learning (2014)
Fowler, J.: Compressive-projection principal component analysis. IEEE Trans. Image Process. 18(10), 2230–2242 (2009)
Guyon, I., Gunn, S., Ben-Hur, A., Dror, G.: Result analysis of the NIPS 2003 feature selection challenge. In: Saul, L.K., Weiss, Y., Bottou, L. (eds.) Advances in Neural Information Processing Systems, vol. 17, pp. 545–552. MIT Press (2005). http://papers.nips.cc/paper/2728-result-analysis-of-the-nips-2003-feature-selection-challenge.pdf
Honda, K., Nonoguchi, R., Notsu, A., Ichihashi, H.: PCA-guided k-means clustering with incomplete data. In: 2011 IEEE International Conference on Fuzzy Systems (FUZZ), pp. 1710–1714. IEEE (2011)
Johnson, W.B., Lindenstrauss, J.: Extensions of Lipschitz mappings into a Hilbert space. Contemp. Math. 26, 189–206 (1984)
Kang, K., Hooker, G.: Improving the recovery of principal components with semi-deterministic random projections. In: 2016 Annual Conference on Information Science and Systems, CISS 2016, Princeton, NJ, USA, 16–18 March 2016, pp. 596–601 (2016). https://doi.org/10.1109/CISS.2016.7460570
Kang, K., Hooker, G.: Random projections with control variates. In: Proceedings of the 6th International Conference on Pattern Recognition Applications and Methods, vol. 1, ICPRAM, pp. 138–147. INSTICC, SciTePress (2017)
LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)
Li, P., Hastie, T.J., Church, K.W.: Improving random projections using marginal information. In: Lugosi, G., Simon, H.U. (eds.) COLT 2006. LNCS (LNAI), vol. 4005, pp. 635–649. Springer, Heidelberg (2006). https://doi.org/10.1007/11776420_46
Li, P., Mahoney, M.W., She, Y.: Approximating higher-order distances using random projections. CoRR abs/1203.3492 (2012). http://arxiv.org/abs/1203.3492
Loia, V., Tomasiello, S., Vaccaro, A.: Using fuzzy transform in multi-agent based monitoring of smart grids. Inf. Sci. 388, 209–224 (2017)
Muirhead, R.J.: Aspects of Multivariate Statistical Theory. Wiley-Interscience, Hoboken (2005)
Petersen, K.B., Pedersen, M.S.: The matrix cookbook. http://www2.imm.dtu.dk/pubdb/p.php?3274, version 20121115
Ross, S.M.: Simulation, 4th edn. Academic Press Inc., Orlando (2006)
Xu, Q., Ding, C., Liu, J., Luo, B.: PCA-guided search for k-means. Pattern Recogn. Lett. 54, 50–55 (2015)
Acknowledgements
We thank the reviewers for their many helpful comments. This research was supported by the SUTD Faculty Fellow Award.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Kang, K. (2017). Using the Multivariate Normal to Improve Random Projections. In: Yin, H., et al. Intelligent Data Engineering and Automated Learning – IDEAL 2017. IDEAL 2017. Lecture Notes in Computer Science, vol 10585. Springer, Cham. https://doi.org/10.1007/978-3-319-68935-7_43
DOI: https://doi.org/10.1007/978-3-319-68935-7_43
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-68934-0
Online ISBN: 978-3-319-68935-7
eBook Packages: Computer Science, Computer Science (R0)