Abstract
Convolutional neural networks (CNNs), built on the discrete convolution operation, have achieved great success in image processing, speech and audio processing, natural language processing, and other fields. However, how to develop new models as alternatives to CNNs remains an open problem. Using the idea of the sequence block matrix product, we propose a novel operation and its corresponding neural network: the two-dimensional discrete matrix-product operation (TDDMPO) and the matrix-product neural network (MPNN). We present the definition of the TDDMPO, a series of its properties, and the matrix-product theorem in detail, and then construct the corresponding MPNN. Experimental results on the Fashion-MNIST, SVHN, FLOWER17, and FLOWER102 datasets show that MPNNs obtain a 1.65–13.04% relative performance improvement over the corresponding CNNs, while the matrix-product layers of MPNNs require 41× to 57× less computation than the corresponding convolutional layers of CNNs. Hence, the MPNN is a potential model that may open new directions for deep neural networks, particularly as an alternative to CNNs.
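To make the general idea concrete, the following minimal sketch contrasts convolution's per-window elementwise multiply-and-sum with a block-wise true matrix product. This is only an illustration of the idea of a block matrix-product operation; it is not the paper's actual definition of the TDDMPO (Formula (2) in Sect. 2), and the non-overlapping block partitioning and square kernel here are assumptions made for brevity.

    import numpy as np

    def block_matrix_product(x, w):
        # Illustrative sketch only: NOT the paper's Formula (2).
        # Assumption: the input map is partitioned into non-overlapping
        # k x k blocks, and each block is transformed by a true matrix
        # product (w @ block) instead of convolution's elementwise
        # multiply-and-sum over a sliding window.
        k = w.shape[0]
        H, W = x.shape
        assert H % k == 0 and W % k == 0, "toy sketch: sizes divisible by k"
        out = np.empty_like(x)
        for i in range(0, H, k):
            for j in range(0, W, k):
                out[i:i+k, j:j+k] = w @ x[i:i+k, j:j+k]
        return out

    # Usage: a 4x4 feature map with a 2x2 kernel matrix.
    x = np.arange(16, dtype=float).reshape(4, 4)
    w = np.array([[1.0, 0.0], [0.0, -1.0]])
    y = block_matrix_product(x, w)

One k × k matrix product covers k² output entries at once, which gives an intuition for how such a layer can need far fewer multiplications than a stride-1 convolution over the same map.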
Data Availability Statement
The Fashion-MNIST dataset that supports the findings of this study is openly available from GRAVITI at https://www.graviti.cn/open-datasets/FashionM-NIST, reference number [31]. The SVHN dataset is openly available from GRAVITI at https://www.graviti.cn/open-datasets/SVHN, reference number [32]. The 17 Category Flower (FLOWER17) dataset is openly available from GRAVITI at https://www.graviti.cn/open-datasets/Flower17, reference number [33]. The 102 Category Flower (FLOWER102) dataset is openly available from GRAVITI at https://www.graviti.cn/open-datasets/Flower102, reference number [33].
References
Hubel DH, Wiesel T (1962) Receptive fields, binocular interaction, and functional architecture in the cat's visual cortex. J Physiol 160(1):106–154
Wiesel T, Hubel DH (1959) Receptive fields of single neurons in the cat's striate cortex. J Physiol 148(3):574–591
Fukushima K (1979) Neural network model for a mechanism of pattern recognition unaffected by shift in position-Neocognitron. IEICE Techn Rep 62(10):658–665
Fukushima K (1980) Neocognitron: a self-organizing neural network for a mechanism of pattern recognition unaffected by shift in position. Biol Cybern 36(4):193–202
Fukushima K (2013) Artificial vision by multi-layered neural networks: neocognitron and its advances. Neural Netw 37:103–119
LeCun Y, Boser B, Denker JS et al (1989) Backpropagation applied to handwritten zip code recognition. Neural Comput 1(4):541–551
LeCun Y, Bottou L, Bengio Y et al (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324
Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. Adv Neural Inf Process Syst, pp 1097–1105
Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556
Russakovsky O, Deng J, Su H et al (2015) Imagenet large scale visual recognition challenge. Int J Comput Vis 115(3):211–252
Wengrowski E, Purri M, Dana K et al (2019) Deep CNNs as a method to classify rotating objects based on monostatic RCS. IET Radar Sonar Navig 13(7):1092–1100
Wu X, Zhang Z, Zhang W et al (2021) A convolutional neural network based on grouping structure for scene classification. Remote Sens 13(13):2457–2477
Hagag A, Omara I, Alfarra ANK, Mekawy F (2021) Handwritten chemical formulas classification model using deep transfer convolutional neural networks. In: International Conference on Electronic Engineering (ICEEM), pp 1–6
Teli MN (2021) TeliNet, a simple and shallow convolution neural network (CNN) to classify CT scans of COVID-19 patients. arXiv:2107.04930
Shawky OA, Hagag A, El-Dahshan E et al (2020) Remote sensing image scene classification using CNN-MLP with data augmentation. Optik Int J Light Electron Opt 165356
He K, Gkioxari G, Dollár P et al (2017) Mask R-CNN. In: 2017 IEEE International Conference on Computer Vision (ICCV). IEEE, pp 2980–2988
Liu B, Liu Q, Zhang T et al (2019) MSSTResNet-TLD: a robust tracking method based on tracking-learning-detection framework by using multi-scale spatio-temporal residual network feature model. Neurocomputing 175–194
Liu Z, Waqas M, Yang J et al (2021) A multi-task CNN for maritime target detection. IEEE Signal Process Lett 28:434–438
Fan M, Tian S, Liu K et al (2021) Infrared small target detection based on region proposal and CNN classifier. SIViP 1–10
Hou F, Lei W, Li S et al (2021) Deep learning-based subsurface target detection from GPR scans. IEEE Sens J 21(6):8161–8171
Mnih V, Kavukcuoglu K, Silver D et al (2015) Human-level control through deep reinforcement learning. Nature 518(7540):529–533
Silver D, Schrittwieser J, Simonyan K et al (2017) Mastering the game of Go without human knowledge. Nature 550(7676):354–359
Zoughi T, Homayounpour MM (2019) A gender-aware deep neural network structure for speech recognition. Iran J Sci Technol Trans Electr Eng 43(3):635–644
Perdana BBSP, Irawan B, Setianingsih C (2019) Hate speech detection in Indonesian language on Instagram comment section using deep neural network classification method. In: 2019 IEEE Asia Pacific Conference on Wireless and Mobile (APWiMob). IEEE
Krishnan PT, Balasubramanian P (2019) Detection of alphabets for machine translation of sign language using deep neural net. In: 2019 International Conference on Data Science and Communication (IconDSC)
Hinton GE, Sabour S, Frosst N (2018) Matrix capsules with EM routing. In: International Conference on Learning Representations
Gonzalez RC, Wintz P (1997) Digital image processing. Addison-Wesley, New York
Bhabatosh C (1977) Digital image processing and analysis. PHI Learning Pvt Ltd, New Delhi
Zhang XD (2017) Matrix analysis and applications. Cambridge University Press, Cambridge
Bouvrie J (2006) Notes on convolutional neural networks. Center for Biological and Computational Learning, Massachusetts, pp 38–44
Xiao H, Rasul K, Vollgraf R (2017) Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. arXiv:1708.07747
Netzer Y, Wang T, Coates A et al (2011) Reading digits in natural images with unsupervised feature learning. In: NIPS Workshop on Deep Learning and Unsupervised Feature Learning
Nilsback ME, Zisserman A (2008) Automated flower classification over a large number of classes. In: Sixth Indian Conference on Computer Vision, Graphics and Image Processing, ICVGIP 2008, Bhubaneswar, India, 16–19 December 2008. IEEE
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This work is supported by Anhui Polytechnic University Introduced Talent Research Startup Fund (No. 2020YQQ039).
Appendix 1
In this appendix, we offer detailed proofs of the related conclusions in Sect. 2.
1. Proof of Formula (6)
Proof
According to Formula (2),
So Formula (6) holds. \(\square \)
2. Proof of Formula (7)
Proof
According to Formula (2),
So Formula (7) holds. \(\square \)
3. Proof of Formula (8)
Proof
According to Formula (2),
So Formula (8) holds. \(\square \)
4. Proof of Formulas (9)–(11)
Proof
(1) According to Formula (2),
(2) According to Formula (2),
(3) According to Formula (2),
In summary, Formulas (9)–(11) hold. \(\square \)
5. Proof of Formula (13)
Proof
Substituting Formula (13) into Formula (12), we have
Because
where \(z_1\) and \(z_2\) are integers, the following formula is satisfied in the transformation interval, namely
It can be seen that the two-dimensional discrete Fourier transform defined by Formula (13) is unique. \(\square \)
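For reference, the two-dimensional discrete Fourier transform pair that Formulas (12) and (13) presumably denote (an assumption, but one consistent with the duality relations used in the proof of Formula (16) below and with the convention of Gonzalez and Wintz [45]) is
\[ f(m,n)=\sum_{u=0}^{M-1}\sum_{v=0}^{N-1}F(u,v)\,e^{\,j2\pi(um/M+vn/N)},\qquad F(u,v)=\frac{1}{MN}\sum_{m=0}^{M-1}\sum_{n=0}^{N-1}f(m,n)\,e^{-j2\pi(um/M+vn/N)}, \]
and the orthogonality relation invoked above is
\[ \sum_{u=0}^{M-1}e^{\,j2\pi u(m-z_1)/M}=\begin{cases}M, & z_1=m,\\ 0, & z_1\ne m,\end{cases} \]
for integers \(z_1,m\) with \(0\le z_1,m\le M-1\), and analogously for \(z_2,n\) over \(0\le z_2,n\le N-1\).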
6. Proof of Formula (14)
Proof
According to Formula (13),
According to Formula (12),
In summary, Formula (14) holds. \(\square \)
7. Proof of Formula (15)
Proof
According to Formula (12),
So Formula (15) holds. \(\square \)
8. Proof of Formula (16)
Proof
According to Formula (13),
So \(F(u,v) \Leftrightarrow \frac{1}{MN}f(-m,-n)\). According to Formula (30),
So \(MN\cdot F(-u,-v)\Leftrightarrow f(m,n)\). In summary, Formula (16) holds. \(\square \)
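A compact way to see these two relations, as a sketch under the transform pair assumed above: treating \(F\) as a spatial-domain signal and applying Formula (13) gives
\[ \frac{1}{MN}\sum_{m=0}^{M-1}\sum_{n=0}^{N-1}F(m,n)\,e^{-j2\pi(um/M+vn/N)}=\frac{1}{MN}\,f(-u,-v), \]
since by Formula (12) the double sum is exactly \(f(-u,-v)\); replacing \((u,v)\) by \((-u,-v)\) and multiplying both sides by \(MN\) then yields the second relation.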
9. Proof of Formula (17)
Proof
According to Formula (12),
So \(f(m\pm m_0,n\pm n_0)\Leftrightarrow e^{\pm j2\pi (um_0/M+vn_0/N)}F(u,v)\). According to Formula (13),
So \(e^{\mp j2\pi (u_0m/M+v_0n/N)}f(m,n)\Leftrightarrow F(u\pm u_0,v\pm v_0)\). In summary, Formula (17) holds. \(\square \)
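For completeness, the spatial-shift half of this property follows in one line from the transform pair assumed above: with the change of variables \(m'=m\pm m_0\), \(n'=n\pm n_0\),
\[ \frac{1}{MN}\sum_{m,n} f(m\pm m_0,n\pm n_0)\,e^{-j2\pi(um/M+vn/N)}=e^{\pm j2\pi(um_0/M+vn_0/N)}\,\frac{1}{MN}\sum_{m',n'} f(m',n')\,e^{-j2\pi(um'/M+vn'/N)}, \]
where the periodicity of the exponential lets the shifted sum run over the same interval; the right-hand side is \(e^{\pm j2\pi(um_0/M+vn_0/N)}F(u,v)\). The frequency-shift half is obtained symmetrically from Formula (12).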
Cite this article
Shan, C., Ou, J. & Chen, X. Matrix-product neural network based on sequence block matrix product. J Supercomput 78, 8467–8492 (2022). https://doi.org/10.1007/s11227-021-04194-5