Abstract
The paper reconsiders multilayer perceptron networks for the case where the Euclidean inner product is replaced by a semi-inner product (SIP). This is of interest whenever the dissimilarity between data is measured by a general norm, so that the Euclidean inner product is no longer consistent with the geometry of the data space. We prove mathematically that universal approximation completeness is guaranteed also for those networks where the employed semi-inner products are related either to uniformly convex or to reflexive Banach spaces. The most prominent examples of uniformly convex Banach spaces are the spaces \(L_{p}\) and \(l_{p}\) for \(1<p<\infty \). The result is valid for all discriminatory activation functions, including the sigmoid and the ReLU activation.
A. Engelsberger—Supported by an ESF PhD grant.
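To make this construction concrete, the following is a minimal numerical sketch of such a Banach-like perceptron (our illustration, not code from the paper; the function names giles_sip and sip_perceptron and the parameter choices are ours). It replaces the Euclidean dot product of a classical perceptron by the Giles semi-inner product of \(l_{p}\) [7]:

```python
import numpy as np

def giles_sip(x, y, p=3.0):
    """Giles semi-inner product [x, y] on l_p, 1 < p < infinity [7]:
    [x, y] = ||y||_p^(2 - p) * sum_i x_i * |y_i|^(p - 1) * sign(y_i).
    For p = 2 it reduces to the Euclidean inner product; [y, y] = ||y||_p^2."""
    norm_y = np.linalg.norm(y, ord=p)
    if norm_y == 0.0:
        return 0.0  # [x, 0] = 0
    return norm_y ** (2.0 - p) * np.sum(x * np.abs(y) ** (p - 1.0) * np.sign(y))

def sip_perceptron(x, w, b, p=3.0):
    """Banach-like perceptron: sigmoid activation applied to [x, w] + b."""
    return 1.0 / (1.0 + np.exp(-(giles_sip(x, w, p) + b)))

x = np.array([0.5, -1.0, 2.0])
w = np.array([1.0, 0.3, -0.2])
print(sip_perceptron(x, w, b=0.1, p=2.0))  # coincides with a standard perceptron
print(sip_perceptron(x, w, b=0.1, p=3.0))  # response adapted to the l_3 geometry
```

The sigmoid can be exchanged for any discriminatory activation such as the ReLU; only the computation \(\left[ \textbf{x},\textbf{w}\right] \) differs from the classical perceptron.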
References
Bishop, C.: Pattern Recognition and Machine Learning. Springer, London (2006)
Braun, J., Griebel, M.: On a constructive proof of Kolmogorov’s superposition theorem. Constr. Approx. 30, 653–675 (2009). https://doi.org/10.1007/s00365-009-9054-2
Chieng, H., Wahid, N., Pauline, O., Perla, S.: Flatten-T Swish: a thresholded ReLU-Swish-like activation function for deep learning. Int. J. Adv. Intell. Inform. 4(2), 76–86 (2018)
Clarkson, J.: Uniformly convex spaces. Trans. Am. Math. Soc. 40, 396–414 (1936)
Cybenko, G.: Approximation by superpositions of a sigmoidal function. Math. Control Sig. Syst. 2(4), 303–314 (1989). https://doi.org/10.1007/BF02551274
Faulkner, G.D.: Representation of linear functionals in a Banach space. Rocky Mt. J. Math. 7(4), 789–792 (1977)
Giles, J.: Classes of semi-inner-product spaces. Trans. Am. Math. Soc. 129, 436–446 (1967)
Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press, Cambridge (2016)
Gorban, A.: Approximation of continuous functions of several variables by an arbitrary nonlinear continuous function of one variable, linear functions, and their superpositions. Appl. Math. Lett. 11(3), 45–49 (1998)
Guilhoto, L.: An overview of artificial neural networks for mathematicians (2018). http://math.uchicago.edu/~may/REU2018/REUPapers/Guilhoto.pdf
Hanin, B.: Universal function approximation by deep neural networks with bounded width and ReLU activations. Mathematics 7(992), 1–9 (2019)
Hanner, O.: On the uniform convexity of \(L^p\) and \(l^p\). Ark. Mat. 3(19), 239–244 (1956)
Hertz, J.A., Krogh, A., Palmer, R.G.: Introduction to the Theory of Neural Computation, Volume 1 of Santa Fe Institute Studies in the Sciences of Complexity: Lecture Notes. Addison-Wesley, Redwood City (1991)
Kolmogorov, A.: On the representation of continuous functions of several variables as superpositions of continuous functions of one variable and addition. Doklady Akademii Nauk SSSR 114(5), 953–956 (1957)
Kolmogorov, A., Fomin, S.: Reelle Funktionen und Funktionalanalysis. VEB Deutscher Verlag der Wissenschaften, Berlin (1975)
Krizhevsky, A., Sutskever, I., Hinton, G.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems (NIPS), San Diego, vol. 25, pp. 1097–1105. Curran Associates Inc. (2012)
Kůrková, V.: Kolmogorov’s theorem and multilayer neural networks. Neural Netw. 5, 501–506 (1992)
Lange, M., Biehl, M., Villmann, T.: Non-Euclidean principal component analysis by Hebbian learning. Neurocomputing 147, 107–119 (2015)
LeCun, Y., Cortes, C., Burges, C.: The MNIST database (1998)
Lumer, G.: Semi-inner-product spaces. Trans. Am. Math. Soc. 100, 29–43 (1961)
Nath, B.: Topologies on generalized semi-inner product spaces. Compositio Mathematica 23(3), 309–316 (1971)
Ramachandran, P., Zoph, B., Le, Q.: Searching for activation functions. Technical report, Google Brain (2018). arXiv:1710.05941v1
Riesz, F., Sz.-Nagy, B.: Vorlesungen über Funktionalanalysis, 4th edn. Verlag Harri Deutsch, Frankfurt/M. (1982)
Rosenblatt, F.: The perceptron: a probabilistic model for information storage and organization in the brain. Psychol. Rev. 65, 386–408 (1958)
Rudin, W.: Functional Analysis, 2nd edn. McGraw-Hill Inc., New York (1991)
Steinwart, I., Christmann, A.: Support Vector Machines. Information Science and Statistics, Springer, Heidelberg (2008). https://doi.org/10.1007/978-0-387-77242-4
Triebel, H.: Analysis und mathematische Physik, 3rd revised edn. BSB B.G. Teubner Verlagsgesellschaft, Leipzig (1989)
Villmann, T., Haase, S., Kaden, M.: Kernelized vector quantization in gradient-descent learning. Neurocomputing 147, 83–95 (2015)
Villmann, T., Ravichandran, J., Villmann, A., Nebel, D., Kaden, M.: Investigation of activation functions for generalized learning vector quantization. In: Vellido, A., Gibert, K., Angulo, C., Martín Guerrero, J.D. (eds.) WSOM 2019. AISC, vol. 976, pp. 179–188. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-19642-4_18
Zhang, H., Xu, Y., Zhang, J.: Reproducing kernel Banach spaces for machine learning. J. Mach. Learn. Res. 10, 2741–2775 (2009)
Zhang, H., Zhang, J.: Generalized semi-inner products with applications to regularized learning. J. Math. Anal. Appl. 372, 181–196 (2010)
Appendix
In this appendix we collect some useful definitions regarding SIPs and Banach spaces that are used in the main text, together with some basic statements and remarks.
Definition 23
A Banach space \(\mathcal {B}\) is said to be strictly convex iff for all \(\textbf{x},\textbf{y}\ne 0\) with \(\left\| \textbf{x}\right\| +\left\| \textbf{y}\right\| =\left\| \textbf{x}+\textbf{y}\right\| \) it always follows that \(\textbf{x}=\lambda \textbf{y}\) for some \(\lambda >0\).
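For illustration (a standard example added by us): \(l_{1}\) is not strictly convex, since for \(\textbf{x}=\left( 1,0\right) \) and \(\textbf{y}=\left( 0,1\right) \) we have \(\left\| \textbf{x}\right\| _{1}+\left\| \textbf{y}\right\| _{1}=2=\left\| \textbf{x}+\textbf{y}\right\| _{1}\), although \(\textbf{x}\ne \lambda \textbf{y}\) for every \(\lambda >0\).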
Lemma 24
A Banach space \(\mathcal {B}\) with SIP \(\left[ \cdot ,\cdot \right] \) is strictly convex iff for all \(\textbf{x},\textbf{y}\ne 0\) with \(\left[ \textbf{x},\textbf{y}\right] =\left\| \textbf{x}\right\| \cdot \left\| \textbf{y}\right\| \) it always follows that \(\textbf{x}=\lambda \textbf{y}\) for some \(\lambda >0\).
Proof
The proof can be found in [7]. \(\square \)
The following definition of uniform convexity was introduced in [4]:
Definition 25
A Banach space \(\mathcal {B}\) is said to be uniformly convex iff for each \(\varepsilon >0\) there exists a \(\delta \left( \varepsilon \right) >0\) such that \(\left\| \textbf{x}\right\| =\left\| \textbf{y}\right\| =1\) and \(\left\| \textbf{x}-\textbf{y}\right\| >\varepsilon \) imply \(\frac{\left\| \textbf{x}+\textbf{y}\right\| }{2}<1-\delta \left( \varepsilon \right) \).
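For illustration (a standard fact added by us): every Hilbert space is uniformly convex. By the parallelogram law, \(\left\| \textbf{x}+\textbf{y}\right\| ^{2}=2\left\| \textbf{x}\right\| ^{2}+2\left\| \textbf{y}\right\| ^{2}-\left\| \textbf{x}-\textbf{y}\right\| ^{2}<4-\varepsilon ^{2}\) for unit vectors with \(\left\| \textbf{x}-\textbf{y}\right\| >\varepsilon \), so \(\delta \left( \varepsilon \right) =1-\sqrt{1-\frac{\varepsilon ^{2}}{4}}\) is a valid choice.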
Definition 26
A Banach space \(\mathcal {B}\) with SIP \(\left[ \cdot ,\cdot \right] \) is said to be continuous iff

\(\lim _{\lambda \rightarrow 0}\left[ \textbf{y},\textbf{x}+\lambda \textbf{y}\right] =\left[ \textbf{y},\textbf{x}\right] \)

is valid for \(\lambda \in \mathbb {R}\) and all \(\textbf{x},\textbf{y}\in \mathcal {B}\) with \(\left\| \textbf{x}\right\| =\left\| \textbf{y}\right\| =1\). The space is uniformly continuous iff this limit is approached uniformly for all such pairs \(\left( \textbf{x},\textbf{y}\right) \).
Definition 27
A Banach space \(\mathcal {B}\) is said to be reflexive iff the canonical embedding \(J:\mathcal {B}\rightarrow \mathcal {B}^{**}=\left( \mathcal {B}^{*}\right) ^{*}\), defined by \(J\left( \textbf{x}\right) \left( f\right) =f\left( \textbf{x}\right) \) for all \(f\in \mathcal {B}^{*}\), is surjective, where the star indicates the dual space.
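For illustration (standard facts added by us): the spaces \(l_{p}\) with \(1<p<\infty \) are reflexive, since \(\left( l_{p}\right) ^{*}=l_{q}\) with \(\frac{1}{p}+\frac{1}{q}=1\) implies \(\left( l_{p}\right) ^{**}=l_{p}\), whereas \(l_{1}\) is not reflexive. More generally, every uniformly convex Banach space is reflexive by the Milman-Pettis theorem.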
Theorem 28
Let \(\mathcal {B}\) be a Banach space. Then a necessary and sufficient condition for \(\mathcal {B}\) to be reflexive is that for every \(f\in \mathcal {B}^{*}\) there exist an SIP \(\left[ \cdot ,\cdot \right] \) and an element \(\textbf{y}\in \mathcal {B}\) with \(f\left( \textbf{x}\right) =\left[ \textbf{x},\textbf{y}\right] \) for all \(\textbf{x}\in \mathcal {B}\). If \(\mathcal {B}\) is strictly convex, then \(\textbf{y}\) is unique.
Proof
The proof can be found in [6, Theorem 2]. \(\square \)
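As an illustration (standard facts added by us): in a Hilbert space the inner product is the unique SIP, and Theorem 28 reduces to the Riesz representation theorem. For \(\mathcal {B}=l_{p}\) with \(1<p<\infty \), the Giles SIP [7] provides the representation, and the representer \(\textbf{y}\) is unique because these spaces are strictly convex.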
Definition 29
A Banach space \(\mathcal {B}\) is said to be smooth iff for each \(\textbf{x}\in \mathcal {B}\) with \(\left\| \textbf{x}\right\| =1\) there exists exactly one linear functional \(f_{\textbf{x}}\in \mathcal {B}^{*}\) with \(\left\| f_{\textbf{x}}\right\| =1\) and \(f_{\textbf{x}}\left( \textbf{x}\right) =1\). The existence of such an \(f_{\textbf{x}}\) is guaranteed by the Hahn-Banach theorem; smoothness requires its uniqueness.
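For illustration (standard facts added by us): every Hilbert space is smooth, with \(f_{\textbf{x}}=\left\langle \cdot ,\textbf{x}\right\rangle \) being the unique support functional at \(\textbf{x}\). In contrast, \(l_{1}\) is not smooth: at \(\textbf{x}=\left( 1,0,0,\ldots \right) \), every \(f=\left( 1,a_{2},a_{3},\ldots \right) \in l_{\infty }\) with \(\left| a_{i}\right| \le 1\) satisfies \(\left\| f\right\| _{\infty }=1\) and \(f\left( \textbf{x}\right) =1\).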