Abstract
We develop an algebro-geometric formulation for neural networks in machine learning using the moduli space of framed quiver representations. We find natural Hermitian metrics on the universal bundles over the moduli whose expressions are independent of the dimension vector, and show that their Ricci curvatures give a Kähler metric on the moduli. Moreover, we use toric moment maps to construct activation functions, and we prove the universal approximation theorem for the softmax function (also known as the Boltzmann distribution) using the toric geometry of the complex projective space.
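As an illustration of the link between softmax and toric geometry mentioned above, the following minimal sketch checks numerically that the standard moment map of the complex projective space, \([z_0 : \dots : z_n] \mapsto (|z_i|^2 / \sum_j |z_j|^2)_i\), reproduces softmax when \(|z_i|^2 = e^{x_i}\). The normalization of the moment map, the choice of coordinates, and the helper name `point_from_logits` are assumptions made for this example and need not match the conventions of the paper.

```python
import math


def softmax(x):
    """Numerically stable softmax of a list of real numbers."""
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def moment_map(z):
    """Toric moment map of CP^n in homogeneous coordinates.

    Assumed normalization: the image is the standard simplex,
    with i-th component |z_i|^2 / sum_j |z_j|^2.
    """
    sq = [abs(c) ** 2 for c in z]
    total = sum(sq)
    return [s / total for s in sq]


def point_from_logits(x, phases=None):
    """Hypothetical helper: a point of CP^n with |z_i|^2 = exp(x_i).

    The phases are arbitrary, since the moment map is invariant
    under the torus action that rotates each coordinate.
    """
    if phases is None:
        phases = [0.0] * len(x)
    return [
        math.exp(v / 2) * complex(math.cos(t), math.sin(t))
        for v, t in zip(x, phases)
    ]


x = [0.5, -1.0, 2.0]
z = point_from_logits(x, phases=[0.3, 1.1, -2.0])
mu = moment_map(z)
sm = softmax(x)
```

In this sketch `mu` and `sm` agree up to floating-point rounding, whatever phases are chosen, which reflects the fact that the moment map depends only on the moduli \(|z_i|\).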
References
M. Abouzaid, Homogeneous coordinate rings and mirror symmetry for toric varieties, Geom. Topol. 10 (2006), 1097–1157.
M. Abreu, Kähler geometry of toric varieties and extremal metrics, Internat. J. Math. 9 (1998), no. 6, 641–651.
M. A. Armenta and P.-M. Jodoin, The representation theory of neural networks, Mathematics 9.24 (2021), 3216.
M.C.N. Cheng, V. Anagiannis, M. Weiler, P. de Haan, T.S. Cohen, and M. Welling, Covariance in physics and convolutional neural networks. Submitted, arXiv:1906.02481, 2019.
W. Crawley-Boevey, Normality of Marsden-Weinstein reductions for representations of quivers, Math. Ann. 325 (2003), no. 1, 55–79.
X. Chen, S. Donaldson, and S. Sun, Kähler-Einstein metrics on Fano manifolds. I: Approximation of metrics with cone singularities, J. Amer. Math. Soc. 28 (2015), no. 1, 183–197.
X. Chen, S. Donaldson, and S. Sun, Kähler-Einstein metrics on Fano manifolds. II: Limits with cone angle less than \(2\pi\), J. Amer. Math. Soc. 28 (2015), no. 1, 199–234.
X. Chen, S. Donaldson, and S. Sun, Kähler-Einstein metrics on Fano manifolds. III: Limits as cone angle approaches \(2\pi\) and completion of the main proof, J. Amer. Math. Soc. 28 (2015), no. 1, 235–278.
T.S. Cohen, M. Geiger, J. Koehler, and M. Welling, Spherical CNNs, ICLR. arXiv:1801.10130.
T.S. Cohen, M. Geiger, and M. Weiler, A general theory of equivariant CNNs on homogeneous spaces, NeurIPS. arXiv:1811.02017.
K. Chan, S.-C. Lau, and N.C. Leung, SYZ mirror symmetry for toric Calabi-Yau manifolds, J. Differential Geom. 90 (2012), no. 2, 177–250.
X. Chen, S. Sun, and B. Wang, Kähler-Ricci flow, Kähler-Einstein metric, and K-stability, Geom. Topol. 22 (2018), no. 6, 3145–3173.
T.S. Cohen and M. Welling, Group equivariant convolutional networks, Proceedings of The 33rd International Conference on Machine Learning, vol. 48, 2016, pp. 2990–2999.
T.S. Cohen, M. Weiler, B. Kicanaoglu, and M. Welling, Gauge equivariant convolutional networks and the icosahedral CNN, Proceedings of the ICML. arXiv:1902.04615.
G. Cybenko, Approximation by superpositions of a sigmoidal function, Math. Control Signals Systems 2 (1989), no. 4, 303–314.
I. Chami, Z. Ying, C. Ré, and J. Leskovec, Hyperbolic graph convolutional neural networks, Advances in Neural Information Processing Systems. arXiv:1910.12933.
P. de Haan, T. Cohen, and M. Welling, Natural graph networks. Submitted, arXiv:2007.08349, 2020.
S.K. Donaldson, Anti self-dual Yang-Mills connections over complex algebraic surfaces and stable vector bundles, Proc. London Math. Soc. (3) 50 (1985), no. 1, 1–26.
S.K. Donaldson, Symmetric spaces, Kähler geometry and Hamiltonian dynamics, Northern California Symplectic Geometry Seminar, Amer. Math. Soc. Transl. Ser. 2, vol. 196, Amer. Math. Soc., Providence, RI, 1999, pp. 13–33.
S.K. Donaldson, Stability, birational transformations and the Kähler-Einstein problem, Surveys in differential geometry. Vol. XVII, Surv. Differ. Geom., vol. 17, Int. Press, Boston, MA, 2012, pp. 203–228.
Y. Dauphin, R. Pascanu, C. Gulcehre, K. Cho, S. Ganguli, and Y. Bengio, Identifying and attacking the saddle point problem in high-dimensional non-convex optimization. Submitted, arXiv:1406.2572, 2014.
J. Engel and M. Reineke, Smooth models of quiver moduli, Math. Z. 262 (2009), no. 4, 817–848.
S. Fedotov, Framed moduli and Grassmannians of submodules, Trans. Amer. Math. Soc. 365 (2013), no. 8, 4153–4179.
B. Fang, C.-C. M. Liu, D. Treumann, and E. Zaslow, T-duality and homological mirror symmetry for toric varieties, Adv. Math. 229 (2012), no. 3, 1875–1911.
B. Fong and D. I. Spivak, An invitation to applied category theory: Seven sketches in compositionality, Cambridge University Press, Cambridge, 2019.
W. Fulton, Introduction to toric varieties, Annals of Mathematics Studies, vol. 131, Princeton University Press, Princeton, NJ, 1993.
O. Ganea, G. Becigneul, and T. Hofmann, Hyperbolic neural networks, Advances in Neural Information Processing Systems (S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, eds.), vol. 31, Curran Associates, Inc., 2018, pp. 5345–5355.
M. Goyal, R. Goyal, and B. Lall, Learning activation functions: A new paradigm for understanding neural networks. Submitted, arXiv:1906.09529, 2019.
V. Guillemin, Kaehler structures on toric varieties, J. Differential Geom. 40 (1994), no. 2, 285–309.
I. Ganev and R. Walters, The QR decomposition for radial neural networks. Submitted, arXiv:2107.02550, 2021.
R.S. Hamilton, Three-manifolds with positive Ricci curvature, J. Differential Geom. 17 (1982), no. 2, 255–306.
H. Hong, Y. Kim, and S.-C. Lau, Immersed two-spheres and SYZ with application to Grassmannians. Submitted, arXiv:1805.11738, 2018.
B. Hanin and M. Sellke, Approximating continuous functions by ReLU nets of minimal width. Submitted, arXiv:1710.11278, 2017.
K. Hashimoto, S. Sugishita, A. Tanaka, and A. Tomiya, Deep learning and holographic QCD, Phys. Rev. D 98 (2018), no. 10, 106014, 15.
K. Hashimoto, S. Sugishita, A. Tanaka, and A. Tomiya, Deep learning and the AdS/CFT correspondence, Phys. Rev. D 98 (2018), no. 4. https://doi.org/10.1103/PhysRevD.98.046019
Y.-H. He and S.-T. Yau, Graph Laplacians, Riemannian manifolds and their machine-learning. Submitted, arXiv:2006.16619, 2020.
G. Jeffreys and S.-C. Lau, Noncommutative geometry of computational models and uniformization for framed quiver varieties. Submitted, arXiv:2201.05900, 2022.
A.D. King, Moduli of representations of finite-dimensional algebras, Quart. J. Math. Oxford Ser. (2) 45 (1994), no. 180, 515–530.
H. Lin and S. Jegelka, ResNet with one-neuron hidden layers is a universal approximator, Advances in Neural Information Processing Systems (S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, eds.), vol. 31, Curran Associates, Inc., 2018, pp. 6169–6178.
M. Leshno, V.Ya. Lin, A. Pinkus, and S. Schocken, Multilayer feedforward networks with a nonpolynomial activation function can approximate any function, Neural Networks 6 (1993), no. 6, 861–867.
Z. Lu, H. Pu, F. Wang, Z. Hu, and L. Wang, The expressive power of neural networks: A view from the width. Submitted, arXiv:1709.02540, 2017.
S. Martin, Symplectic quotients by a nonabelian group and by its maximal torus. Submitted, arXiv:math/0001002, 2000.
H. Mhaskar, Q. Liao, and T. Poggio, Learning functions: When is deep better than shallow. Submitted, arXiv:1603.00988, 2016.
H. N. Mhaskar and C.A. Micchelli, Approximation by superposition of sigmoidal and radial basis functions, Adv. in Appl. Math. 13 (1992), no. 3, 350–373.
H. N. Mhaskar and T. Poggio, Deep vs. shallow networks: an approximation theory perspective, Anal. Appl. (Singap.) 14 (2016), no. 6, 829–848.
H. Nakajima, Instantons on ALE spaces, quiver varieties, and Kac-Moody algebras, Duke Math. J. 76 (1994), no. 2, 365–416.
H. Nakajima, Varieties associated with quivers, Representation theory of algebras and related topics (Mexico City, 1994), CMS Conf. Proc., vol. 19, Amer. Math. Soc., Providence, RI, 1996, pp. 139–157.
H. Nakajima, Quiver varieties and finite-dimensional representations of quantum affine algebras, J. Amer. Math. Soc. 14 (2001), no. 1, 145–238.
T. Nishinou, Y. Nohara, and K. Ueda, Toric degenerations of Gelfand-Cetlin systems and potential functions, Adv. Math. 224 (2010), no. 2, 648–706.
R. Pascanu, Y.N. Dauphin, S. Ganguli, and Y. Bengio, On the saddle point problem for non-convex optimization. Submitted, arXiv:1405.4604, 2014.
G. Perelman, The entropy formula for the Ricci flow and its geometric applications. Submitted, arXiv:math/0211159, 2002.
G. Perelman, Ricci flow with surgery on three-manifolds. Submitted, arXiv:math/0303109, 2003.
P.P. Petrushev, Approximation by ridge functions and neural networks, SIAM J. Math. Anal. 30 (1999), no. 1, 155–189.
A. Pinkus, Approximation theory of the MLP model in neural networks, Acta numerica, 1999, Acta Numer., vol. 8, Cambridge Univ. Press, Cambridge, 1999, pp. 143–195.
M. Reineke, Framed quiver moduli, cohomology, and quantum groups, J. Algebra 320 (2008), no. 1, 94–115.
S. Semmes, Complex Monge-Ampère and symplectic manifolds, Amer. J. Math. 114 (1992), no. 3, 495–550.
M. Telgarsky, Benefits of depth in neural networks, 29th Annual Conference on Learning Theory (2016), 1517–1539.
G. Tian, Kähler-Einstein metrics with positive scalar curvature, Invent. Math. 130 (1997), no. 1, 1–37.
K. Uhlenbeck and S.-T. Yau, On the existence of Hermitian-Yang-Mills connections in stable vector bundles, Comm. Pure Appl. Math. 39 (1986), no. S, suppl., S257–S293.
S.-T. Yau, Review of Kähler-Einstein metrics in algebraic geometry, Israel Math. Conference Proceedings, Bar Ilan Univ., 1996, pp. 433–443.
L. Zhang, G. Naitzat, and L.-H. Lim, Tropical geometry of deep neural networks, International Conference on Machine Learning. arXiv:1805.07091.
Acknowledgements
We are grateful to Marco Antonio Armenta for informing us about the work [3] and for further useful discussions. We express our gratitude to Shing-Tung Yau for his generous encouragement. The work of S.C. Lau in this paper is partially supported by the Simons collaboration grant.
Additional information
Communicated by Joseph M. Landsberg.
About this article
Cite this article
Jeffreys, G., Lau, SC. Kähler Geometry of Framed Quiver Moduli and Machine Learning. Found Comput Math 23, 1899–1957 (2023). https://doi.org/10.1007/s10208-022-09587-3
DOI: https://doi.org/10.1007/s10208-022-09587-3