Abstract
This work reports results obtained by applying high-order Boltzmann Machines without hidden units to build classifiers for several problems representing different learning paradigms. The Boltzmann Machine weight-updating algorithm remains unchanged even when some of the units take values in a discrete set or in a continuous interval. The absence of hidden units, together with the restriction to classification problems, allows the connection statistics to be estimated without the computational cost of simulated annealing. In this setting, the learning process can be sped up by several orders of magnitude with no appreciable loss in the quality of the results.
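The speed-up described above can be sketched in code. The following is a minimal illustration, not the authors' implementation: a hypothetical high-order Boltzmann Machine with binary inputs, one output unit, and no hidden units, trained on a toy XOR-like task (all variable names and the task itself are illustrative). The clamped statistics are plain averages over the training data, and because only the class label is unclamped, the free-phase statistics can be computed exactly by summing over the two label values, so no simulated annealing is required.

```python
import numpy as np
from itertools import combinations

# Toy data: binary input patterns with XOR labels, which a first-order
# (no hidden units, no high-order connections) machine cannot separate.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 4))        # binary input patterns
y = X[:, 0] ^ X[:, 1]                        # XOR labels: need order > 1

# First- and second-order connections between input units and the output.
conns = [(i,) for i in range(4)] + list(combinations(range(4), 2))

def features(x):
    # Product of the input units in each connection (output unit implicit).
    return np.array([np.prod(x[list(c)]) for c in conns], dtype=float)

F = np.array([features(x) for x in X])
w = np.zeros(len(conns))
eta = 1.0

for _ in range(500):
    # Free phase: with no hidden units, the probability of output = 1
    # given the inputs is exact -- no sampling or annealing needed.
    p1 = 1.0 / (1.0 + np.exp(-(F @ w)))
    # Weight update: clamped minus free connection statistics, averaged.
    w += eta * F.T @ (y - p1) / len(X)

pred = (F @ w > 0).astype(int)
accuracy = float(np.mean(pred == y))
```

With the second-order connection coupling the first two inputs, the machine can represent the XOR labeling, and the exact free-phase computation makes each epoch a single vectorized pass over the data.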
References
E.H.L. Aarts and J.H.M. Korst, Simulated Annealing and Boltzmann Machines: A Stochastic Approach to Combinatorial Optimization and Neural Computing, John Wiley & Sons, 1989.
D.H. Ackley, G.E. Hinton, and T.J. Sejnowski, “A learning algorithm for Boltzmann Machines,” Cogn. Sci., vol. 9, pp. 147–169, 1985.
R. Durbin and D.E. Rumelhart, “Product units: A computationally powerful and biologically plausible extension to backpropagation networks,” Neural Computation, vol. 1, pp. 133–142, 1989.
G. Pinkas, “Energy minimization and satisfiability of propositional logic,” edited by Touretzky, Elman, Sejnowski, and Hinton, Connectionist Models Summer School, 1990.
B. Lenze, “How to make sigma-pi neural networks perform perfectly on regular training sets,” Neural Networks, vol. 7,no. 8, pp. 1285–1293, 1994.
F.X. Albizuri, A. D'Anjou, M. Graña, F.J. Torrealdea, and M.C. Hernandez, “The high order Boltzmann Machine: Learned distribution and topology,” IEEE Trans. Neural Networks, vol. 6,no. 3, pp. 767–770, 1995.
F.X. Albizuri, “Máquina de Boltzmann de alto orden: Una red neuronal con técnicas de Monte Carlo para modelado de distribuciones de probabilidad. Caracterización y estructura” [The high-order Boltzmann Machine: A neural network with Monte Carlo techniques for modeling probability distributions. Characterization and structure], Ph.D. Thesis, Dept. CCIA, Univ. País Vasco, 1995.
F.X. Albizuri, A. d'Anjou, M. Graña, and J.A. Lozano, “Convergence properties of high-order Boltzmann Machines,” Neural Networks (in press).
C. Peterson and J.R. Anderson, “A mean field algorithm for neural networks,” Complex Systems, vol. 1, pp. 995–1019, 1987.
C. Peterson and E. Hartman, “Explorations of the mean field theory learning algorithm,” Neural Networks, vol. 2,no. 6, pp. 475–494, 1989.
G.E. Hinton, “Deterministic Boltzmann learning performs steepest descent in weight-space,” Neural Computation, vol. 1, pp. 143–150, 1989.
G.L. Bilbro, R. Mann, T.K. Miller, W.E. Snyder, D.E. van den Bout, and M. White, “Optimization by mean field annealing,” Advances in Neural Information Processing Systems, edited by Touretzky, Morgan Kaufmann: San Mateo, CA, pp. 91–98, 1989.
G.L. Bilbro, W.E. Snyder, S.J. Garnier, and J.W. Gault, “Mean field annealing: A formalism for constructing GNC-like algorithms,” IEEE Trans. Neural Networks, vol. 3,no. 1, pp. 131–138, 1992.
L. Saul and M.I. Jordan, “Learning in Boltzmann trees,” Neural Computation, vol. 6, pp. 1174–1184, 1994.
T.J. Sejnowski, “Higher order Boltzmann Machines,” in Neural Networks for Computing AIP Conf. Proc. 151, edited by Denker, Snowbird UT, 1986, pp. 398–403.
G.E. Hinton, “Connectionist learning procedures,” Artificial Intelligence, vol. 40, pp. 185–234, 1989.
S.J. Perantonis and P.J.G. Lisboa, “Translation, rotation and scale invariant pattern recognition by high-order neural networks and moment classifiers,” IEEE Trans. Neural Net., vol. 3, no. 2, pp. 241–251, 1992.
S. Sunthakar and V.A. Jaravine, “Invariant pattern recognition using high-order neural networks,” Intelligent Robots and Computer Vision, vol. SPIE 1826, pp. 160–167, 1992.
J.G. Taylor and S. Coombes, “Learning higher order correlations,” Neural Networks, vol. 6, pp. 423–427, 1993.
J. Mendel and L.X. Wang, “Identification of moving-average systems using higher-order statistics and learning,” in Neural Networks for Signal Processing, edited by B. Kosko, Prentice-Hall, pp. 91–120, 1993.
M. Heywood and P. Noakes, “A framework for improved training of sigma-pi networks,” IEEE Trans. Neural Networks, vol. 6,no. 4, pp. 893–903, 1995.
J.M. Karlholm, “Associative memories with short-range higher order coupling,” Neural Networks, vol. 6, pp. 409–421, 1993.
J. Hertz, A. Krogh, and R.G. Palmer, Introduction to the Theory of Neural Computation, Addison Wesley, 1991.
B. Kosko, Neural Networks and Fuzzy Systems, Prentice Hall, 1992.
T. Kohonen, Self-Organization and Associative Memory, Springer-Verlag, 1989.
K.M. Gutzmann, “Combinatorial optimization using a continuous state Boltzmann Machine,” Proc. IEEE Int. Conf. Neural Networks, San Diego, CA, 1987, pp. III-721–734.
S. Amari, “Dualistic geometry of the manifolds of high-order neurons,” Neural Networks, vol. 4, pp. 443–451, 1991.
S. Amari, K. Kurata, and H. Nagaoka, “Information geometry of Boltzmann Machines,” IEEE Trans. Neural Networks, vol. 3, no. 2, pp. 260–271, 1992.
J. Alspector, T. Zeppenfeld, and S. Luna, “A volatility measure for annealing in feedback neural networks,” Neural Computation, vol. 4, pp. 191–195, 1992.
S. Kirkpatrick, C.D. Gelatt, and M.P. Vecchi, “Optimization by simulated annealing,” Science, vol. 220, pp. 671–680, 1983.
C. Peterson and B. Soderberg, “A new method for mapping optimization problems onto neural networks,” Int. Journal Neural Systems, vol. 1,no. 1, pp. 3–22, 1989.
E. Goles and S. Martinez, Neural and Automata Networks. Dynamical Behavior and Applications, Kluwer Acad. Pub., 1991.
C.T. Lin and C.S.G. Lee, “A multi-valued Boltzmann Machine,” IEEE Trans. Systems, Man and Cybernetics, vol. 25,no. 4, pp. 660–669, 1995.
L. Parra and G. Deco, “Continuous Boltzmann Machine with rotor neurons,” Neural Networks, vol. 8,no. 3, pp. 375–385, 1995.
R.S. Zemel, C.K.I. Williams, and M.C. Mozer, “Directional-unit Boltzmann Machines,” in Adv. Neural Information Processing Systems, edited by S.J. Hanson, J.D. Cowan, and C.L. Giles, Morgan Kaufmann, vol. 5, pp. 172–179, 1993.
G.A. Kohring, “On the Q-state neuron problem in attractor neural networks,” Neural Networks, vol. 6, pp. 573–581, 1993.
S.L. Lauritzen, “Lectures on contingency tables,” Inst. Elect. Sys., Dept. Math. Comp. Science, Univ. Aalborg (Denmark), 1989.
Y.M.M. Bishop, S.E. Fienberg, and P.W. Holland, Discrete Multivariate Analysis: Theory and Practice, MIT Press, 1975 (10th edition, 1989).
E.B. Andersen, The Statistical Analysis of Categorical Data, Springer Verlag, 1991.
H. Geffner and J. Pearl, “On the probabilistic semantics of connectionist networks,” 1st IEEE Int. Conf. Neural Networks, 1987, pp. 187–195.
R.M. Neal, “Connectionist learning of belief networks,” Artificial Intelligence, vol. 56, pp. 71–113, 1992.
R.M. Neal, “Asymmetric parallel Boltzmann Machines are belief networks,” Neural Computation, vol. 4, pp. 832–834, 1992.
C. Berge, “Optimisation and hypergraph theory,” European Journal of Operational Research, vol. 46, pp. 297–303, 1990.
J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann Pub., 1988.
M. Graña, A. D'Anjou, F.X. Albizuri, J.A. Lozano, P. Larrañaga, Y. Yurramendi, M. Hernandez, J.L. Jimenez, F.J. Torrealdea, and M. Poza, “Experimentos de aprendizaje con Máquinas de Boltzmann de alto orden” [Learning experiments with high-order Boltzmann Machines], Informática y Automática (in press).
S.B. Thrun et al., “The MONK's problems: A performance comparison of different learning algorithms,” Technical Report CMU-CS-91-197, Carnegie Mellon Univ., 1991.
M. Graña, V. Lavin, A. D'Anjou, F.X. Albizuri, and J.A. Lozano, “High-Order Boltzmann Machines Applied to the Monk's Problems,” ESANN'94, DFacto Press: Brussels, Belgium, pp. 117–122, 1994.
A. D'Anjou, M. Graña, F.J. Torrealdea, and M.C. Hernandez, “Máquinas de Boltzmann para la resolución del problema de la satisfiabilidad en el cálculo proposicional” [Boltzmann Machines for solving the satisfiability problem in propositional calculus], Revista Española de Informática y Automática, vol. 24, pp. 40–49, 1992.
A. D'Anjou, M. Graña, F.J. Torrealdea, and M.C. Hernandez, “Solving satisfiability via Boltzmann Machines,” IEEE Trans. on Patt. Anal. and Mach. Int., vol. 15, no. 5, pp. 514–521, 1993.
R.P. Gorman and T.J. Sejnowski, “Analysis of hidden units in a layered network trained to classify sonar targets,” Neural Networks, vol. 1, pp. 75–89, 1988.
D.H. Deterding, “Speaker normalisation for automatic speech recognition,” Ph.D. Thesis, University of Cambridge, 1989.
A.J. Robinson, “Dynamic error propagation networks,” Ph.D. Thesis, Cambridge University, Engineering Department, 1989.
A.J. Robinson and F. Fallside, “A dynamic connectionist model for phoneme recognition,” Proceedings of nEuro '88, Paris, June, 1988.
F.L. Chung and T. Lee, “Fuzzy competitive learning,” Neural Networks, vol. 7,no. 3, pp. 539–551, 1994.
M. Cottrell et al., “Time Series and Neural Networks: A Statistical Method for Weight Elimination,” ESANN'93, DFacto Press: Brussels, Belgium, pp. 157–164, 1993.
O. Fambon and C. Jutten, “A Comparison of Two Weight Pruning Methods,” ESANN'94, DFacto Press: Brussels, Belgium, pp. 147–152, 1994.
M. Cottrell, B. Girard, Y. Girard, M. Mangeas, and C. Muller, “Neural modelling for time series: A statistical stepwise method for weight elimination,” IEEE Trans. Neural Networks, vol. 6,no. 6, pp. 1355–1364, 1995.
C. Jutten and O. Fambon, “Pruning Methods: A Review,” ESANN'95, DFacto Press, pp. 129–140, 1995.
C. Jutten and R. Chentouf, “A new scheme for incremental learning,” Neural Processing Letters, vol. 2,no. 1, pp. 1–4, 1995.
C. Jutten, “Learning in evolutive neural network architectures: An ill posed problem,” in From Natural to Artificial Neural Computation (IWANN'95), edited by J. Mira and F. Sandoval, Springer Verlag, vol. LNCS 930, pp. 361–373, 1995.
C. Lee Giles and C.W. Omlin, “Pruning recurrent neural networks for improved generalization performance,” IEEE Trans. Neural Networks, vol. 5,no. 5, pp. 848–856, 1994.
R. Reed, “Pruning algorithms—A survey,” IEEE Trans. Neural Networks, vol. 4, no. 5, pp. 740–747, 1993.
R.F. Albrecht, C.R. Reeves, and N.C. Steele (eds.), Artificial Neural Nets and Genetic Algorithms, Springer Verlag, 1993.
R.A. Zitar and M.H. Hassoun, “Neurocontrollers trained with rules extracted by a genetic assisted reinforcement learning system,” IEEE Trans. Neural Networks, vol. 6,no. 4, pp. 859–879, 1995.
G.E. Hinton, Lectures at the Neural Network Summer School, Wolfson College: Cambridge, Sept. 1993.
P.J. Zwietering and E.H.L. Aarts, “The convergence of parallel Boltzmann Machines,” Parallel Processing in Neural Systems and Computers, edited by Eckmiller, Hartmann, and Hauske, North-Holland, pp. 277–280, 1990.
P.J. Zwietering and E.H.L. Aarts, “Parallel Boltzmann Machines: A mathematical model,” Journal of Parallel and Distributed Computing, vol. 13, pp. 65–75.
Graña, M., D'Anjou, A., Albizuri, F. et al. Experiments of Fast Learning with High Order Boltzmann Machines. Applied Intelligence 7, 287–303 (1997). https://doi.org/10.1023/A:1008257203142