Abstract
Matrix-vector multiplications usually account for the dominant share of the computational operations needed to propagate information through a neural network. The number of operations can be reduced if the weight matrices are structured. In this paper, we introduce a training algorithm, based on the backpropagation algorithm, for neural networks with sequentially semiseparable weight matrices. By exploiting the structure of the weight matrices, the computational complexity of the matrix-vector product can be reduced to the subquadratic regime. We show that this can translate into reduced computation times on a microcontroller. Furthermore, we analyze the generalization capabilities of neural networks with sequentially semiseparable weight matrices. Our experiments show that neural networks with structured weight matrices can outperform standard feed-forward neural networks in terms of test prediction accuracy on several real-world datasets.
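The subquadratic complexity mentioned in the abstract comes from the fact that a matrix-vector product with a sequentially semiseparable (SSS) matrix can be evaluated by a forward and a backward state recursion over the matrix blocks instead of a dense multiplication. The following NumPy sketch illustrates this idea; the function name sss_matvec and the generator names D, P, R, Q, U, W, V follow the standard SSS parametrization from the time-varying systems literature and are our own illustrative notation, not the authors' implementation.

    import numpy as np

    def sss_matvec(D, P, R, Q, U, W, V, x_blocks):
        # Illustrative sketch: y = A @ x for an SSS matrix given by its
        # generators (one set per block index k):
        #   diagonal blocks       A_kk = D[k]
        #   lower blocks (i > j)  A_ij = P[i] @ R[i-1] @ ... @ R[j+1] @ Q[j].T
        #   upper blocks (i < j)  A_ij = U[i] @ W[i+1] @ ... @ W[j-1] @ V[j].T
        # One forward and one backward state recursion yield a cost that grows
        # linearly with the number of blocks (for fixed generator sizes).
        K = len(x_blocks)
        y = [D[k] @ x_blocks[k] for k in range(K)]       # diagonal contribution

        h = Q[0].T @ x_blocks[0]                         # forward pass (lower-triangular part)
        for k in range(1, K):
            y[k] += P[k] @ h
            h = R[k] @ h + Q[k].T @ x_blocks[k]

        g = V[K - 1].T @ x_blocks[K - 1]                 # backward pass (upper-triangular part)
        for k in range(K - 2, -1, -1):
            y[k] += U[k] @ g
            g = W[k] @ g + V[k].T @ x_blocks[k]

        return np.concatenate(y)

For K blocks of size n and state dimension r, this costs on the order of K(n^2 + nr + r^2) operations, compared with (Kn)^2 for a dense matrix-vector product, which is subquadratic in the matrix size whenever n and r are small relative to Kn.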
Cite this paper
Kissel, M., Gottwald, M., Gjeroska, B., Paukner, P., Diepold, K. (2022). Backpropagation Through States: Training Neural Networks with Sequentially Semiseparable Weight Matrices. In: Marreiros, G., Martins, B., Paiva, A., Ribeiro, B., Sardinha, A. (eds.) Progress in Artificial Intelligence. EPIA 2022. Lecture Notes in Computer Science, vol. 13566. Springer, Cham. https://doi.org/10.1007/978-3-031-16474-3_39