Using the Multivariate Normal to Improve Random Projections

Kang, Keegan

doi:10.1007/978-3-319-68935-7_43

Keegan Kang

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10585)

Included in the following conference series:

International Conference on Intelligent Data Engineering and Automated Learning

Abstract

Random projection is a dimension reduction technique which can be used to estimate Euclidean distances, inner products, angles [9], or even \(l_p\) distances (for even p) [10] between pairs of high dimensional vectors. We extend the work of Li [9] and our prior work [7] to show how marginal information, principal components, and control variates can be used with the multivariate normal distribution to improve the accuracy of the inner product estimate of vectors. We call our method COntrol Variates For Estimation via First Eigenvectors (COVFEFE). We demonstrate the results of COVFEFE on the Arcene and MNIST datasets.
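
As a point of reference for the abstract, the following is a minimal sketch of the plain random projection estimate of an inner product, which is the baseline that marginal information and control variates aim to improve. It is not the COVFEFE method itself. It assumes a projection matrix with i.i.d. N(0, 1) entries scaled by 1/sqrt(k); the function names (random_projection_matrix, project, estimate_inner_product) and the seed handling are hypothetical and chosen only for illustration.

```python
import math
import random


def dot(u, v):
    """Ordinary inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def random_projection_matrix(p, k, rng):
    """Return a p x k matrix (list of rows) with i.i.d. N(0, 1) entries."""
    return [[rng.gauss(0.0, 1.0) for _ in range(k)] for _ in range(p)]


def project(x, R):
    """Map a length-p vector x to k dimensions: v = R^T x / sqrt(k)."""
    k = len(R[0])
    scale = 1.0 / math.sqrt(k)
    return [scale * sum(x[i] * R[i][j] for i in range(len(x)))
            for j in range(k)]


def estimate_inner_product(x, y, k, seed=0):
    """Estimate <x, y> from k-dimensional projections of x and y.

    Both vectors are projected with the same matrix R, so the
    expectation of <v_x, v_y> equals <x, y> (the estimate is unbiased).
    """
    rng = random.Random(seed)  # assumption: fixed seed for repeatability
    R = random_projection_matrix(len(x), k, rng)
    return dot(project(x, R), project(y, R))


if __name__ == "__main__":
    data_rng = random.Random(1)
    x = [data_rng.uniform(-1, 1) for _ in range(30)]
    y = [data_rng.uniform(-1, 1) for _ in range(30)]
    print("true inner product:", dot(x, y))
    print("estimate (k = 2000):", estimate_inner_product(x, y, k=2000))
```

Under this Gaussian construction the variance of the estimate is (||x||^2 ||y||^2 + <x, y>^2) / k, so the error shrinks as k grows. Variance reduction techniques such as those named in the abstract target this variance at a fixed k.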

References

1. Anaraki, F.P., Hughes, S.: Memory and computation efficient PCA via very sparse random projections. In: Proceedings of the 31st International Conference on Machine Learning (2014)
2. Fowler, J.: Compressive-projection principal component analysis. IEEE Trans. Image Process. 18(10), 2230–2242 (2009)
3. Guyon, I., Gunn, S., Ben-Hur, A., Dror, G.: Result analysis of the NIPS 2003 feature selection challenge. In: Saul, L.K., Weiss, Y., Bottou, L. (eds.) Advances in Neural Information Processing Systems, vol. 17, pp. 545–552. MIT Press (2005). http://papers.nips.cc/paper/2728-result-analysis-of-the-nips-2003-feature-selection-challenge.pdf
4. Honda, K., Nonoguchi, R., Notsu, A., Ichihashi, H.: PCA-guided k-means clustering with incomplete data. In: 2011 IEEE International Conference on Fuzzy Systems (FUZZ), pp. 1710–1714. IEEE (2011)
5. Johnson, W.B., Lindenstrauss, J.: Extensions of Lipschitz mappings into a Hilbert space. Contemp. Math. 26, 189–206 (1984)
6. Kang, K., Hooker, G.: Improving the recovery of principal components with semi-deterministic random projections. In: 2016 Annual Conference on Information Science and Systems, CISS 2016, Princeton, NJ, USA, 16–18 March 2016, pp. 596–601 (2016). https://doi.org/10.1109/CISS.2016.7460570
7. Kang, K., Hooker, G.: Random projections with control variates. In: Proceedings of the 6th International Conference on Pattern Recognition Applications and Methods, vol. 1, ICPRAM, pp. 138–147. INSTICC, SciTePress (2017)
8. LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. In: Proceedings of the IEEE, pp. 2278–2324 (1998)
9. Li, P., Hastie, T.J., Church, K.W.: Improving random projections using marginal information. In: Lugosi, G., Simon, H.U. (eds.) COLT 2006. LNCS (LNAI), vol. 4005, pp. 635–649. Springer, Heidelberg (2006). doi:10.1007/11776420_46
10. Li, P., Mahoney, M.W., She, Y.: Approximating higher-order distances using random projections. CoRR abs/1203.3492 (2012). http://arxiv.org/abs/1203.3492
11. Loia, V., Tomasiello, S., Vaccaro, A.: Using fuzzy transform in multi-agent based monitoring of smart grids. Inf. Sci. 388, 209–224 (2017)
12. Muirhead, R.J.: Aspects of Multivariate Statistical Theory. Wiley-Interscience, Hoboken (2005)
13. Petersen, K.B., Pedersen, M.S.: The matrix cookbook. http://www2.imm.dtu.dk/pubdb/p.php?3274, version 20121115
14. Ross, S.M.: Simulation, 4th edn. Academic Press Inc., Orlando (2006)
15. Xu, Q., Ding, C., Liu, J., Luo, B.: PCA-guided search for k-means. Pattern Recogn. Lett. 54, 50–55 (2015)

Acknowledgements

We thank the reviewers, who provided many helpful comments. This research was supported by the SUTD Faculty Fellow Award.

Author information

Authors and Affiliations

Singapore University of Technology and Design, Singapore, 487372, Singapore
Keegan Kang

Corresponding author

Correspondence to Keegan Kang.

Editor information

Editors and Affiliations

University of Manchester, Manchester, United Kingdom
Hujun Yin
School of Electronic and Electrical Engineering, Nanjing University, Nanjing, China
Yang Gao
Nanjing University of Aeronautics and Astronautics, Nanjing, China
Songcan Chen
Guilin University of Electronic Technology, Guilin, China
Yimin Wen
Guilin University of Electronic Technology, Guilin, China
Guoyong Cai
Guilin University of Electronic Technology, Guilin, China
Tianlong Gu
Beijing University of Posts and Telecommunications, Beijing, China
Junping Du
University of Seville, Seville, Spain
Antonio J. Tallón-Ballesteros
Southeast University, Nanjing, China
Minling Zhang

About this paper

Cite this paper

Kang, K. (2017). Using the Multivariate Normal to Improve Random Projections. In: Yin, H., et al. Intelligent Data Engineering and Automated Learning – IDEAL 2017. IDEAL 2017. Lecture Notes in Computer Science, vol 10585. Springer, Cham. https://doi.org/10.1007/978-3-319-68935-7_43

DOI: https://doi.org/10.1007/978-3-319-68935-7_43
Published: 06 October 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-68934-0
Online ISBN: 978-3-319-68935-7
eBook Packages: Computer Science, Computer Science (R0)