Abstract
The increasingly interconnected and instrumented world provides a deluge of data generated by multiple sensors in the form of continuous streams. Efficient stream processing requires control over the number of useful variables: because data arrive at high frequency and typically follow non-stationary distributions, maintaining data structure in reduced subspaces poses new challenges for dimensionality-reduction algorithms. In this work we introduce NARPCA, a neural-network streaming PCA algorithm capable of explaining the variance-covariance structure of a set of streamed variables through linear combinations. The neural algorithm is leveraged by a novel incremental computation method and system that operates on data streams, achieving low latency and high throughput while maintaining resource-usage guarantees. We evaluate NARPCA in real-world experiments and demonstrate millisecond-level latency and throughput of thousands of events per second for simultaneous eigenvalue and eigenvector estimation in a multi-class classification task.
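The classical building block behind neural streaming PCA is a Hebbian update applied once per incoming sample, so that eigenvector estimates are refined online without ever materializing the covariance matrix. The sketch below illustrates this general idea with Sanger's Generalized Hebbian Algorithm, a well-known neural PCA rule; it is not the paper's accumulate-retract method, and the dimensions, learning rate, and toy covariance are illustrative assumptions.

```python
import numpy as np

def sanger_update(W, x, lr):
    """One streaming update of Sanger's rule (Generalized Hebbian Algorithm).

    W  : (k, d) current estimates of the top-k eigenvectors (as rows).
    x  : (d,)   one incoming, zero-mean sample from the stream.
    lr : learning rate.
    """
    y = W @ x  # (k,) projections of the sample onto the current components
    # dW = lr * (y x^T - LT(y y^T) W), where LT keeps the lower triangle;
    # the second term decorrelates component i from components 1..i-1.
    dW = lr * (np.outer(y, x) - np.tril(np.outer(y, y)) @ W)
    return W + dW

# Toy stream: zero-mean Gaussian samples with two dominant directions.
rng = np.random.default_rng(0)
d, k = 5, 2
C = np.diag([5.0, 3.0, 0.5, 0.2, 0.1])  # true covariance (diagonal for clarity)
W = rng.normal(size=(k, d)) * 0.1       # small random initialization
for _ in range(20000):
    x = rng.multivariate_normal(np.zeros(d), C)
    W = sanger_update(W, x, lr=1e-3)

# Rows of W align with the leading eigenvectors e1, e2 (up to sign),
# and E[y_i^2] estimates the corresponding eigenvalues.
print(np.round(W, 2))
```

Because each update costs O(kd) time and O(kd) memory, independent of the stream length, rules of this kind are natural candidates for the low-latency, bounded-resource setting the abstract describes.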
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Axenie, C., Tudoran, R., Bortoli, S., Al Hajj Hassan, M., Brasche, G. (2019). NARPCA: Neural Accumulate-Retract PCA for Low-Latency High-Throughput Processing on Datastreams. In: Tetko, I., Kůrková, V., Karpov, P., Theis, F. (eds) Artificial Neural Networks and Machine Learning – ICANN 2019: Theoretical Neural Computation. ICANN 2019. Lecture Notes in Computer Science(), vol 11727. Springer, Cham. https://doi.org/10.1007/978-3-030-30487-4_20
DOI: https://doi.org/10.1007/978-3-030-30487-4_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30486-7
Online ISBN: 978-3-030-30487-4
eBook Packages: Computer Science (R0)