Abstract
A multilayer perceptron comprising a single hidden layer of neurons with sigmoidal transfer functions can approximate any continuous function on a compact domain to arbitrary accuracy. The size of the hidden layer dictates the approximation capability of the multilayer perceptron, and automatically determining a suitable network size for a given data set is an important open problem. This paper considers the problem of inferring the size of multilayer perceptron networks with the MMLD model selection criterion, which is based on the minimum message length (MML) principle. The two main contributions of the paper are: (1) a new model selection criterion for inference of fully-connected multilayer perceptrons in regression problems, and (2) an efficient algorithm for computing MMLD-type codelengths in mathematically challenging model classes. The empirical performance of the new algorithm is demonstrated on artificially generated and real data sets.
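To make the setting of the abstract concrete, the sketch below fits single-hidden-layer sigmoidal networks of several candidate sizes to a toy regression problem and selects the size with the shortest two-part codelength. This is a minimal illustration assuming NumPy: the scoring rule is a generic negative log-likelihood plus a BIC-like parameter cost, used only as a stand-in for message-length-style selection, and is not the MMLD criterion developed in the paper, whose codelength computation is substantially more involved.

```python
# Hedged sketch: generic two-part codelength selection of the hidden-layer
# size of a sigmoidal MLP. NOT the paper's MMLD criterion.
import numpy as np

rng = np.random.default_rng(0)

def fit_mlp(X, y, h, steps=2000, lr=0.05):
    """Fit a single-hidden-layer sigmoidal MLP by plain gradient descent."""
    n, d = X.shape
    W1 = rng.normal(scale=0.5, size=(d, h)); b1 = np.zeros(h)
    w2 = rng.normal(scale=0.5, size=h);      b2 = 0.0
    for _ in range(steps):
        H = 1.0 / (1.0 + np.exp(-(X @ W1 + b1)))   # sigmoidal hidden units
        r = H @ w2 + b2 - y                         # residuals
        # Backpropagation for mean squared error.
        gw2 = H.T @ r / n; gb2 = r.mean()
        gH = np.outer(r, w2) * H * (1 - H)
        gW1 = X.T @ gH / n; gb1 = gH.mean(axis=0)
        W1 -= lr * gW1; b1 -= lr * gb1; w2 -= lr * gw2; b2 -= lr * gb2
    return W1, b1, w2, b2

def codelength(X, y, params):
    """Two-part score: Gaussian data cost plus a BIC-like parameter cost."""
    W1, b1, w2, b2 = params
    n = len(y)
    H = 1.0 / (1.0 + np.exp(-(X @ W1 + b1)))
    sigma2 = np.sum((H @ w2 + b2 - y) ** 2) / n     # ML noise variance
    k = W1.size + b1.size + w2.size + 1             # free parameter count
    neg_loglik = 0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    return neg_loglik + 0.5 * k * np.log(n)

# Toy regression data.
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)

scores = {h: codelength(X, y, fit_mlp(X, y, h)) for h in (1, 2, 4, 8, 16)}
print(scores, "-> selected hidden size:", min(scores, key=scores.get))
```

Under any score of this kind, a larger hidden layer lowers the data cost but pays a growing parameter cost; that trade-off is what message-length criteria such as MMLD formalise.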
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
Cite this chapter
Makalic, E., Allison, L. (2013). MMLD Inference of Multilayer Perceptrons. In: Dowe, D.L. (ed.) Algorithmic Probability and Friends. Bayesian Prediction and Artificial Intelligence. Lecture Notes in Computer Science, vol. 7070. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-44958-1_20
DOI: https://doi.org/10.1007/978-3-642-44958-1_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-44957-4
Online ISBN: 978-3-642-44958-1