
Massively Parallel Probabilistic Reasoning with Boltzmann Machines

Published in Applied Intelligence.

Abstract

We present a method for mapping a given Bayesian network to a Boltzmann machine architecture, in the sense that the updating process of the resulting Boltzmann machine provably converges to a state which can be mapped back to a maximum a posteriori (MAP) state of the probability distribution represented by the Bayesian network. The Boltzmann machine can be implemented efficiently on massively parallel hardware, since the resulting structure divides into two separate clusters in which all the nodes of one cluster can be updated simultaneously. The proposed mapping can therefore provide Bayesian network models with a massively parallel probabilistic reasoning module, capable of finding MAP states in a computationally efficient manner. From the neural network point of view, the mapping from a Bayesian network to a Boltzmann machine can be seen as a method for automatically determining the structure and connection weights of a Boltzmann machine by incorporating high-level probabilistic information directly into the neural network architecture, without recourse to a time-consuming and unreliable learning process.
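The two-cluster update scheme described above can be illustrated with a minimal sketch. The weights, biases, and cooling schedule below are invented for illustration and are not the paper's actual Bayesian-network-to-Boltzmann-machine construction; the sketch only shows why a bipartite structure permits massively parallel updating: with no edges inside a cluster, every unit of one cluster can be sampled simultaneously given the other cluster's state.

```python
import math
import random

random.seed(0)

# Hypothetical bipartite Boltzmann machine: clusters a and b have no
# within-cluster edges, so all units of a cluster update simultaneously,
# alternating between the two clusters.
W = [[1.5, -2.0, 0.5],
     [-1.0, 2.5, 1.0]]           # W[i][j] couples unit a[i] to unit b[j]
bias_a = [0.2, -0.3]
bias_b = [0.1, 0.4, -0.5]

a = [random.randint(0, 1) for _ in bias_a]
b = [random.randint(0, 1) for _ in bias_b]

def energy(a, b):
    """Standard Boltzmann machine energy of a joint state (lower is better)."""
    e = -sum(W[i][j] * a[i] * b[j]
             for i in range(len(a)) for j in range(len(b)))
    e -= sum(bias_a[i] * a[i] for i in range(len(a)))
    e -= sum(bias_b[j] * b[j] for j in range(len(b)))
    return e

def sample_cluster(net, T):
    """Update a whole cluster at once: each unit fires with probability
    sigmoid(net_input / T), independently of its cluster mates."""
    return [1 if random.random() < 1.0 / (1.0 + math.exp(-x / T)) else 0
            for x in net]

T = 5.0
while T > 0.05:                  # simple geometric annealing schedule
    net_a = [sum(W[i][j] * b[j] for j in range(len(b))) + bias_a[i]
             for i in range(len(a))]
    a = sample_cluster(net_a, T)
    net_b = [sum(W[i][j] * a[i] for i in range(len(a))) + bias_b[j]
             for j in range(len(b))]
    b = sample_cluster(net_b, T)
    T *= 0.95

print(a, b, energy(a, b))        # a low-energy (hence high-probability) state
```

In the setting of the article, low energy corresponds to high posterior probability, so the low-energy state reached by such an annealed updating process would then be mapped back to a MAP state of the original Bayesian network.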




Myllymäki, P. Massively Parallel Probabilistic Reasoning with Boltzmann Machines. Applied Intelligence 11, 31–44 (1999). https://doi.org/10.1023/A:1008324530006
