Abstract
The increasingly interconnected and instrumented world provides a deluge of data generated by multiple sensors in the form of continuous streams. Efficient stream processing requires control over the number of useful variables: because data arrive at high frequency and typically follow non-stationary distributions, maintaining data structure in reduced subspaces poses new challenges for dimensionality-reduction algorithms. In this work we introduce NARPCA, a neural-network streaming PCA algorithm capable of explaining the variance-covariance structure of a set of streamed variables through linear combinations. The neural algorithm is leveraged by a novel incremental computation method and system that operates on data streams, achieving low latency and high throughput while maintaining resource-usage guarantees. We evaluate NARPCA in real-world experiments and demonstrate millisecond-level latency and throughput of thousands of events per second for simultaneous eigenvalue and eigenvector estimation in a multi-class classification task.
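The classical building block behind neural streaming PCA is a Hebbian update applied once per incoming sample, so that eigenvector estimates are refined online without ever materializing the covariance matrix. The sketch below illustrates this general idea with Sanger's Generalized Hebbian Algorithm, a well-known neural PCA rule; it is not the paper's accumulate-retract method, and the dimensions, learning rate, and toy covariance are illustrative assumptions.

```python
import numpy as np

def sanger_update(W, x, lr):
    """One streaming update of Sanger's rule (Generalized Hebbian Algorithm).

    W  : (k, d) current estimates of the top-k eigenvectors (as rows).
    x  : (d,)   one incoming, zero-mean sample from the stream.
    lr : learning rate.
    """
    y = W @ x  # (k,) projections of the sample onto the current components
    # dW = lr * (y x^T - LT(y y^T) W), where LT keeps the lower triangle;
    # the second term decorrelates component i from components 1..i-1.
    dW = lr * (np.outer(y, x) - np.tril(np.outer(y, y)) @ W)
    return W + dW

# Toy stream: zero-mean Gaussian samples with two dominant directions.
rng = np.random.default_rng(0)
d, k = 5, 2
C = np.diag([5.0, 3.0, 0.5, 0.2, 0.1])  # true covariance (diagonal for clarity)
W = rng.normal(size=(k, d)) * 0.1       # small random initialization
for _ in range(20000):
    x = rng.multivariate_normal(np.zeros(d), C)
    W = sanger_update(W, x, lr=1e-3)

# Rows of W align with the leading eigenvectors e1, e2 (up to sign),
# and E[y_i^2] estimates the corresponding eigenvalues.
print(np.round(W, 2))
```

Because each update costs O(kd) time and O(kd) memory, independent of the stream length, rules of this kind are natural candidates for the low-latency, bounded-resource setting the abstract describes.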
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Axenie, C., Tudoran, R., Bortoli, S., Al Hajj Hassan, M., Brasche, G. (2019). NARPCA: Neural Accumulate-Retract PCA for Low-Latency High-Throughput Processing on Datastreams. In: Tetko, I., Kůrková, V., Karpov, P., Theis, F. (eds) Artificial Neural Networks and Machine Learning – ICANN 2019: Theoretical Neural Computation. ICANN 2019. Lecture Notes in Computer Science(), vol 11727. Springer, Cham. https://doi.org/10.1007/978-3-030-30487-4_20
DOI: https://doi.org/10.1007/978-3-030-30487-4_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30486-7
Online ISBN: 978-3-030-30487-4
eBook Packages: Computer Science (R0)