Detecting Model Misspecification in Amortized Bayesian Inference with Neural Networks

Schmitt, Marvin; Bürkner, Paul-Christian; Köthe, Ullrich; Radev, Stefan T.

doi:10.1007/978-3-031-54605-1_35

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14264)

Included in the following conference series:

DAGM German Conference on Pattern Recognition

Abstract

Recent advances in probabilistic deep learning enable efficient amortized Bayesian inference in settings where the likelihood function is only implicitly defined by a simulation program (simulation-based inference; SBI). But how faithful is such inference if the simulation represents reality somewhat inaccurately—that is, if the true system behavior at test time deviates from the one seen during training? We conceptualize the types of model misspecification arising in SBI and systematically investigate how the performance of neural posterior approximators gradually deteriorates under these misspecifications, making inference results less and less trustworthy. To notify users about this problem, we propose a new misspecification measure that can be trained in an unsupervised fashion (i.e., without training data from the true distribution) and reliably detects model misspecification at test time. Our experiments clearly demonstrate the utility of our new measure both on toy examples with an analytical ground-truth and on representative scientific tasks in cell biology, cognitive decision making, and disease outbreak dynamics. We show how the proposed misspecification test warns users about suspicious outputs, raises an alarm when predictions are not trustworthy, and guides model designers in their search for better simulators.
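
The abstract does not say how the proposed measure is computed. As a generic illustration only, and not the authors' method, one common way to check whether test-time data resemble simulated data is a kernel two-sample statistic such as the maximum mean discrepancy (MMD). The sketch below assumes a Gaussian kernel with a fixed bandwidth. The helper `simulate` is hypothetical: it draws 2-D Gaussian vectors as stand-ins for data summaries, with a `shift` argument that mimics a mismatch between simulator and reality.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in a for v in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


def simulate(n, shift=0.0, seed=0):
    """Hypothetical stand-in for data summaries: n draws from a 2-D Gaussian."""
    rng = random.Random(seed)
    return [[rng.gauss(shift, 1.0), rng.gauss(shift, 1.0)] for _ in range(n)]


# Reference sample from the assumed simulator.
train_like = simulate(200, shift=0.0, seed=1)
# Test sample from the same process (well-specified case).
well_specified = simulate(200, shift=0.0, seed=2)
# Test sample from a shifted process (assumed form of misspecification).
misspecified = simulate(200, shift=1.5, seed=3)

mmd_ok = mmd_squared(train_like, well_specified)
mmd_bad = mmd_squared(train_like, misspecified)
print(f"MMD^2, matching process: {mmd_ok:.4f}")
print(f"MMD^2, shifted process:  {mmd_bad:.4f}")
```

In this toy setting the statistic stays near zero when both samples come from the same process and grows when the test data are shifted. A practical check would also need a threshold, for example one calibrated on repeated draws from the simulator itself.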

Notes

1. We demonstrate in Experiment 1 that model misspecification also affects the performance of non-amortized sequential neural posterior estimation.

Acknowledgments

MS and PCB were supported by the Cyber Valley Research Fund (grant number: CyVy-RF-2021-16) and the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy EXC-2075 - 390740016 (the Stuttgart Cluster of Excellence SimTech). UK was supported by the Informatics for Life initiative funded by the Klaus Tschira Foundation. STR was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy - EXC-2181 - 390900948 (the Heidelberg Cluster of Excellence STRUCTURES).

Author information

Authors and Affiliations

Cluster of Excellence SimTech, University of Stuttgart, Stuttgart, Germany
Marvin Schmitt & Paul-Christian Bürkner
Department of Statistics, TU Dortmund University, Dortmund, Germany
Paul-Christian Bürkner
Visual Learning Lab, Heidelberg University, Heidelberg, Germany
Ullrich Köthe
Cluster of Excellence STRUCTURES, Heidelberg University, Heidelberg, Germany
Stefan T. Radev

Authors

Marvin Schmitt
Paul-Christian Bürkner
Ullrich Köthe
Stefan T. Radev

Corresponding author

Correspondence to Marvin Schmitt.

Editor information

Editors and Affiliations

IWR, Heidelberg University, Heidelberg, Germany
Ullrich Köthe
IWR, Heidelberg University, Heidelberg, Germany
Carsten Rother

About this paper

Cite this paper

Schmitt, M., Bürkner, PC., Köthe, U., Radev, S.T. (2024). Detecting Model Misspecification in Amortized Bayesian Inference with Neural Networks. In: Köthe, U., Rother, C. (eds) Pattern Recognition. DAGM GCPR 2023. Lecture Notes in Computer Science, vol 14264. Springer, Cham. https://doi.org/10.1007/978-3-031-54605-1_35

DOI: https://doi.org/10.1007/978-3-031-54605-1_35
Published: 08 March 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-54604-4
Online ISBN: 978-3-031-54605-1
eBook Packages: Computer Science (R0)