Abstract
Energy-based models (EBMs) are experiencing a resurgence of interest in both the physics and machine learning communities. This article provides an intuitive introduction to EBMs that requires no background in machine learning. It connects elementary concepts from physics with basic concepts and tools in generative models, and closes with a perspective on where current research in the field is heading. The article was originally written as an online lecture note in HTML and JavaScript and contains interactive graphics. We recommend that the reader also visit the interactive version.
Notes
The entropy of a system is given by \(S = \sum _x -P(x)\log P(x)\), where P(x) is the probability of a state x. If all states are equally likely, the entropy is maximal. If a single state x has \(P(x) = 1\), the entropy is minimal (\(S = 0\)).
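As a minimal sketch of this note, the entropy formula can be evaluated for a few small distributions. The three example distributions and the four-state system are illustrative assumptions, not taken from the article.

```python
import math

def entropy(probs):
    """Shannon entropy S = sum_x -P(x) log P(x), in nats.

    States with P(x) = 0 are skipped, following the convention 0 log 0 = 0.
    """
    return sum(-p * math.log(p) for p in probs if p > 0)

n_states = 4                               # hypothetical system size
uniform = [1.0 / n_states] * n_states      # all states equally likely
skewed = [0.7, 0.1, 0.1, 0.1]              # one state favoured
certain = [1.0, 0.0, 0.0, 0.0]             # a single state with P(x) = 1

for name, dist in [("uniform", uniform), ("skewed", skewed), ("certain", certain)]:
    print(f"{name:8s} S = {entropy(dist):.4f}")
```

The uniform distribution reaches the maximum \(\log 4\) for four states, while the distribution concentrated on one state gives zero.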
By definition, being at thermal equilibrium with a bath of temperature T means that the system is also at temperature T. The temperature determines the average energy of a system. As T increases, the probability of high-energy states increases, and so does the average energy.
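The dependence of the average energy on temperature can be illustrated numerically. The sketch below assumes the standard Boltzmann form \(P(x) \propto \exp (-E(x)/T)\) with \(k_B = 1\); the energy levels are hypothetical values chosen only for illustration.

```python
import math

def boltzmann(energies, T):
    """Return P(x) = exp(-E(x)/T) / Z for each state (assumes k_B = 1)."""
    weights = [math.exp(-e / T) for e in energies]
    Z = sum(weights)  # partition function
    return [w / Z for w in weights]

def average_energy(energies, T):
    """Average energy <E> = sum_x P(x) E(x) at temperature T."""
    probs = boltzmann(energies, T)
    return sum(p * e for p, e in zip(probs, energies))

energies = [0.0, 1.0, 2.0, 3.0]  # hypothetical energy levels
for T in (0.1, 1.0, 10.0):
    p_high = boltzmann(energies, T)[-1]  # probability of the highest level
    print(f"T = {T:5.1f}  <E> = {average_energy(energies, T):.4f}  P(highest) = {p_high:.4f}")
```

For these levels, both the probability of the highest-energy state and the average energy grow as T is raised.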
For any distribution in the exponential family, a statistic T(x) is sufficient if we can write the probability p(x) as
$$\begin{aligned}p(x) = \exp \left( \alpha (\theta ) T(x) + A(\theta ) \right) ,\end{aligned}$$
where \(\alpha (\theta )\) is a vector-valued function and \(A(\theta )\) is a scalar, which for a Boltzmann distribution is related to the partition function Z as \(A(\theta ) = \log (1/Z)\) (Li et al. 2013).
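As a small check of this form, a Boltzmann distribution can be written with the energy as the sufficient statistic. The sketch assumes \(T(x) = E(x)\), \(\alpha (\theta ) = -\beta \) with \(\beta \) the inverse temperature, and \(A(\theta ) = \log (1/Z)\); the energy levels and the value of \(\beta \) are hypothetical.

```python
import math

def boltzmann_direct(energies, beta):
    """Normalise exp(-beta * E(x)) explicitly; returns (probabilities, Z)."""
    weights = [math.exp(-beta * e) for e in energies]
    Z = sum(weights)
    return [w / Z for w in weights], Z

def boltzmann_expfam(energies, beta, Z):
    """Same distribution written as exp(alpha * T(x) + A) with T(x) = E(x)."""
    alpha = -beta            # assumed natural parameter alpha(theta)
    A = math.log(1.0 / Z)    # A(theta) = log(1/Z)
    return [math.exp(alpha * e + A) for e in energies]

energies = [0.0, 0.5, 1.5, 2.0]  # hypothetical energy levels
beta = 0.8                       # hypothetical inverse temperature

direct, Z = boltzmann_direct(energies, beta)
expfam = boltzmann_expfam(energies, beta, Z)
for p, q in zip(direct, expfam):
    print(f"{p:.6f}  {q:.6f}")
```

The two columns agree up to floating-point rounding, since \(\exp (-\beta E + \log (1/Z)) = \exp (-\beta E)/Z\).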
There are other conventions for the Ising energy function in the literature, where the signs of the energy terms change. For example,
$$\begin{aligned} E(\sigma ) = \sum \limits _i b_i \sigma _i + \sum \limits _{ij} w_{ij} \sigma _i\sigma _j. \end{aligned}$$
We follow the convention introduced above.
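The effect of a sign convention can be sketched with a tiny brute-force example. The function below takes a hypothetical `sign` argument: `sign=1` reproduces the energy written in this note, and `sign=-1` flips both terms. It assumes the sum over ij runs over pairs with i < j, and the biases and couplings are made-up values.

```python
from itertools import product

def ising_energy(sigma, b, w, sign=1):
    """E(sigma) = sign * (sum_i b_i s_i + sum_{i<j} w_ij s_i s_j).

    Assumption: each pair (i, j) is counted once, using the upper triangle of w.
    """
    n = len(sigma)
    field = sum(b[i] * sigma[i] for i in range(n))
    coupling = sum(w[i][j] * sigma[i] * sigma[j]
                   for i in range(n) for j in range(i + 1, n))
    return sign * (field + coupling)

def ground_state(b, w, sign=1):
    """Brute-force search over all 2^n spin configurations (small n only)."""
    return min(product([-1, 1], repeat=len(b)),
               key=lambda s: ising_energy(s, b, w, sign))

b = [0.5, -1.0, 0.25]            # hypothetical biases
w = [[0.0, 1.0, -0.5],           # hypothetical couplings (upper triangle)
     [0.0, 0.0, 0.25],
     [0.0, 0.0, 0.0]]
neg_b = [-x for x in b]
neg_w = [[-x for x in row] for row in w]

gs_plus = ground_state(b, w, sign=1)
gs_minus = ground_state(neg_b, neg_w, sign=-1)
print(gs_plus, ising_energy(gs_plus, b, w, sign=1))
print(gs_minus, ising_energy(gs_minus, neg_b, neg_w, sign=-1))
```

Flipping the overall sign while also negating every \(b_i\) and \(w_{ij}\) leaves all energies unchanged, so the choice of convention only changes how the parameters are read, not the model.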
Non-commuting operators do not share a common set of eigenfunctions or eigenstates. Since the eigenfunctions of a Hermitian operator in quantum mechanics are orthogonal, the eigenstates of two non-commuting operators are in general not mutually orthogonal, and therefore there is no measurement operator that can reliably distinguish these non-orthogonal states. This gives rise to the uncertainty principle. The no-cloning theorem follows a similar argument. Orthogonal states can be cloned, but non-orthogonal ones cannot.
A droplet is a local low-energy cluster of spins where the distribution is disconnected from the rest of the system.
References
Amin MH, Andriyash E, Rolfe J, Kulchytskyy B, Melko R (2018) Quantum Boltzmann machine. Physical Review X 8(2):021050
Aurell E, Ekeberg M (2012) Inverse Ising inference using all the data. Physical Review Letters 108(9):090201
Amit DJ, Gutfreund H, Sompolinsky H (1985) Spin-glass models of neural networks. Physical Review A 32(2):1007
Ackley DH, Hinton GE, Sejnowski TJ (1985) A learning algorithm for Boltzmann machines. Cognitive Science 9(1):147–169
Biamonte JD (2008) Nonperturbative k-body to two-body commuting conversion Hamiltonians and embedding problem instances into Ising spins. Physical Review A 77(5):052331
Babbush R, O’Gorman B, Aspuru-Guzik A (2013) Resource efficient gadgets for compiling adiabatic quantum optimization problems. Annalen der Physik 525(10–11):877–888
Borders WA, Pervaiz AZ, Fukami S, Camsari KY, Ohno H, Datta S (2019) Integer factorization using stochastic magnetic tunnel junctions. Nature 573(7774):390–393
Benedetti M, Realpe-Gómez J, Biswas R, Perdomo-Ortiz A (2017) Quantum-assisted learning of hardware-embedded probabilistic graphical models. Physical Review X 7(4):041052
Courville A, Bergstra J, Bengio Y (2011) A spike and slab restricted Boltzmann machine. In: Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics. JMLR Workshop and Conference Proceedings, pp 233–241
Cipra BA (1987) An introduction to the Ising model. The American Mathematical Monthly 94(10):937–959
Carreira-Perpinan MA, Hinton GE (2005) On contrastive divergence learning. In: AISTATS, vol 10. Citeseer, pp 33–40
Carleo G, Troyer M (2017) Solving the quantum many-body problem with artificial neural networks. Science 355(6325):602–606
Du Y, Lin T, Mordatch I (2019) Model based planning with energy based models. arXiv:1909.06878
Du Y, Mordatch I (2019) Implicit generation and generalization in energy-based models
Dahl G, Ranzato MA, Mohamed A-R, Hinton GE (2010) Phone recognition with the mean-covariance restricted Boltzmann machine. In: Advances in neural information processing systems. pp 469–477
Earl DJ, Deem MW (2005) Parallel tempering: theory, applications, and new perspectives. Phys Chem Chem Phys 7:3910–3916
Finn C, Christiano P, Abbeel P, Levine S (2016) A connection between generative adversarial networks, inverse reinforcement learning, and energy-based models. arXiv:1611.03852
Gao X, Duan L-M (2017) Efficient representation of quantum many-body states with deep neural networks. Nature Communications 8(1):662
Goldstein H (2002) Classical mechanics. Pearson Education
Hastings WK (1970) Monte Carlo sampling methods using Markov chains and their applications
Hamze F, de Freitas N (2012) From fields to trees. arXiv:1207.4149
Huembeli P, Dauphin A, Wittek P, Gogolin C (2019) Automated discovery of characteristic features of phase transitions in many-body localization. Physical Review B 99(10):104106
Hen I (2017) Solving spin glasses with optimized trees of clustered spins. Phys Rev E 96:022105
Hopfield JJ, Feinstein DI, Palmer RG (1983) Unlearning has a stabilizing effect in collective memories. Nature 304(5922):158
Hartnett GS, Mohseni M (2020) Self-supervised learning of generative spin-glasses with normalizing flows
Hopfield JJ (1982) Neural networks and physical systems with emergent collective computational abilities. Proceedings of the National Academy of Sciences 79(8):2554–2558
Houdayer J (2001) A cluster Monte Carlo algorithm for 2-dimensional spin glasses. The European Physical Journal B-Condensed Matter and Complex Systems 22(4):479–484
Hsieh CY, Sun Q, Zhang S, Lee CK (2021) Unitary-coupled restricted Boltzmann machine ansatz for quantum simulations. NPJ Quantum Information 7(1):1–10
Haarnoja T, Tang H, Abbeel P, Levine S (2017) Reinforcement learning with deep energy-based policies. In: Proceedings of the 34th International Conference on Machine Learning, vol 70. JMLR.org, pp 1352–1361
Iten R, Metger T, Wilming H, Del Rio L, Renner R (2018) Discovering physical concepts with neural networks. arXiv:1807.10300
Jaynes ET (1957) Information theory and statistical mechanics. Physical Review 106(4):620
Kirkpatrick S, Gelatt CD, Vecchi MP (1983) Optimization by simulated annealing. Science 220(4598):671–680
Khoshaman A, Vinci W, Denis B, Andriyash E, Sadeghi H, Amin MH (2018) Quantum variational autoencoder. Quantum Science and Technology 4(1):014001
Kieferová M, Wiebe N (2017) Tomography and generative training with quantum Boltzmann machines. Physical Review A 96(6):062327
LeCun Y, Chopra S, Hadsell R (2006) A tutorial on energy-based learning
Le Roux N, Bengio Y (2008) Representational power of restricted Boltzmann machines and deep belief networks. Neural Computation 20(6):1631–1649
Le Roux N, Bengio Y (2010) Deep belief networks are compact universal approximators. Neural Computation 22(8):2192–2207
Liu J-G, Wang L (2018) Differentiable learning of quantum circuit Born machines. Physical Review A 98(6):062324
Li X, Wang B, Liu Y, Lee TS (2013) Learning discriminative sufficient statistics score space for classification. In: Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, pp 49–64
Melko RG, Carleo G, Carrasquilla J, Cirac JI (2019) Restricted Boltzmann machines in quantum physics. Nature Physics 15(9):887–892
Mezard M, Montanari A (2009) Information, physics, and computation. Oxford University Press Inc, New York
Moore C, Mertens S (2011) The nature of computation. Oxford University Press Inc, New York
Melnikov AA, Nautrup HP, Krenn M, Dunjko V, Tiersch M, Zeilinger A, Briegel HJ (2018) Active learning machine learns to create new quantum experiments. Proceedings of the National Academy of Sciences 115(6):1221–1226
Mohseni M (2021) Article in preparation
Neyshabur B, Bhojanapalli S, McAllester D, Srebro N (2017) Exploring generalization in deep learning. In: Advances in neural information processing systems. pp 5947–5956
Nielsen MA, Chuang I (2002) Quantum computation and quantum information
Nijkamp E, Hill M, Han T, Zhu S-C, Wu YN (2019) On the anatomy of MCMC-based maximum likelihood learning of energy-based models
Robert CP, Casella G (1999) The Metropolis—Hastings algorithm. In: Monte Carlo statistical methods. Springer, pp 231–283
Rojas R (1996) Neural networks: a systematic introduction. Springer-Verlag, Berlin, Heidelberg
Rojas R (2013) Neural networks: a systematic introduction. Springer Science & Business Media, New York
Swersky K, Buchman D, Freitas ND, Marlin BM, et al (2011) On autoencoders and score matching for energy based models. In: Proceedings of the 28th International Conference on Machine Learning (ICML-11). pp 1201–1208
Selby A (2014) Efficient subgraph-based sampling of Ising-type models with frustration. arXiv:1409.3934
Sutton B, Faria R, Ghantasala LA, Jaiswal R, Camsari KY, Datta S (2020) Autonomous probabilistic coprocessing with petaflips per second. IEEE Access 8:157238–157252
Salakhutdinov R, Mnih A, Hinton G (2007) Restricted Boltzmann machines for collaborative filtering. In: Proceedings of the 24th International Conference on Machine Learning. ACM, pp 791–798
Stein DL, Newman CM (2013) Spin glasses and complexity. Princeton University Press, Princeton
Swendsen RH, Wang J-S (1987) Nonuniversal critical dynamics in Monte Carlo simulations. Phys Rev Lett 58:86–88
Torlai G, Mazzola G, Carrasquilla J, Troyer M, Melko R, Carleo G (2018) Neural-network quantum state tomography. Nature Physics 14(5):447
van Hemmen JL (1986) Spin-glass models of a neural network. Physical Review A 34(4):3435–3445
Verdon G, Marks J, Nanda S, Leichenauer S, Hidary J (2019) Quantum Hamiltonian-based models and the variational quantum thermalizer algorithm. arXiv:1910.02071
Wetzel SJ (2017) Unsupervised learning of phase transitions: from principal component analysis to variational autoencoders. Physical Review E 96(2):022140
Wolff U (1989) Collective Monte Carlo updating for spin systems. Phys Rev Lett 62:361–364
Zhai S, Cheng Y, Lu W, Zhang Z (2016) Deep structured energy based models for anomaly detection. arXiv:1605.07717
Zhao J, Mathieu M, LeCun Y (2016) Energy-based generative adversarial network. arXiv:1609.03126
Zhang J, Wang H, Chu J, Huang S, Li T, Zhao Q (2019) Improved Gaussian-Bernoulli restricted Boltzmann machine for learning discriminative representations. Knowledge-Based Systems 185:104911
Cite this article
Huembeli, P., Arrazola, J.M., Killoran, N. et al. The physics of energy-based models. Quantum Mach. Intell. 4, 1 (2022). https://doi.org/10.1007/s42484-021-00057-7
DOI: https://doi.org/10.1007/s42484-021-00057-7