Interpolation models with multiple hyperparameters

D. J. C. MacKay and R. Takeuchi

Statistics and Computing 8, 15–23 (1998)

Abstract

A traditional interpolation model is characterized by the choice of regularizer applied to the interpolant, and the choice of noise model. Typically, the regularizer has a single regularization constant α, and the noise model has a single parameter β. The ratio α/β alone is responsible for determining globally all these attributes of the interpolant: its ‘complexity’, ‘flexibility’, ‘smoothness’, ‘characteristic scale length’, and ‘characteristic amplitude’. We suggest that interpolation models should be able to capture more than just one flavour of simplicity and complexity. We describe Bayesian models in which the interpolant has a smoothness that varies spatially. We emphasize the importance, in practical implementation, of the concept of ‘conditional convexity’ when designing models with many hyperparameters. We apply the new models to the interpolation of neuronal spike data and demonstrate a substantial improvement in generalization error.
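The abstract's central observation, that the single ratio α/β fixes the interpolant's smoothness everywhere at once, is easy to check numerically. The sketch below is our illustration, not the authors' code: the interpolant is a vector w on a grid, the regularizer is α wᵀCw with C built from second differences, and the noise model is Gaussian with precision β, so the MAP estimate minimizes α wᵀCw + β‖y − Aw‖². The grid size, the sine test data, and names such as map_interpolant are our assumptions. The script confirms that rescaling α and β together leaves the MAP interpolant unchanged, then swaps the single α for a spatially varying vector of hyperparameters, in the spirit of the models the paper proposes.

```python
# A minimal sketch (ours, not the authors') of the interpolation models the
# abstract describes: a discretized interpolant w, a curvature regularizer
# alpha * w^T C w, and a Gaussian noise model with precision beta.
import numpy as np

N = 100                                    # grid size for the interpolant
x = np.linspace(0.0, 1.0, N)

D = np.diff(np.eye(N), n=2, axis=0)        # second-difference operator
C = D.T @ D                                # curvature penalty matrix

rng = np.random.default_rng(0)
idx = np.sort(rng.choice(N, size=12, replace=False))
A = np.eye(N)[idx]                         # selects the observed grid points
y = np.sin(2 * np.pi * x[idx]) + 0.1 * rng.standard_normal(idx.size)

def map_interpolant(alpha, beta):
    """argmin_w  alpha * w^T C w + beta * ||y - A w||^2
    (quadratic, hence convex, in w for any fixed hyperparameters)."""
    return np.linalg.solve(alpha * C + beta * (A.T @ A), beta * (A.T @ y))

# Only the ratio alpha/beta matters: scaling both by 100 changes nothing.
w1 = map_interpolant(alpha=1e-4, beta=1.0)
w2 = map_interpolant(alpha=1e-2, beta=100.0)
print("same ratio, max difference:", np.abs(w1 - w2).max())   # tiny

# A spatially varying regularizer: one hyperparameter per curvature term,
# so the interpolant can be stiff in one region and flexible in another.
def map_interpolant_varying(alpha_vec, beta):
    C_var = D.T @ np.diag(alpha_vec) @ D   # alpha_vec has length N - 2
    return np.linalg.solve(C_var + beta * (A.T @ A), beta * (A.T @ y))

alpha_vec = np.where(np.arange(N - 2) < (N - 2) // 2, 1.0, 1e-5)
w_var = map_interpolant_varying(alpha_vec, beta=1.0)   # stiff left, flexible right
```

Note that for any fixed setting of the hyperparameters the objective stays quadratic in w, which is one reading of the 'conditional convexity' property the abstract emphasizes. The step the paper itself takes, and which this sketch does not, is to infer such spatially varying hyperparameters from the data within a Bayesian framework rather than fixing them by hand.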

Cite this article

MacKay, D. J. C. and Takeuchi, R. Interpolation models with multiple hyperparameters. Statistics and Computing 8, 15–23 (1998). https://doi.org/10.1023/A:1008862908404
