Abstract
Recent advances in probabilistic deep learning enable efficient amortized Bayesian inference in settings where the likelihood function is only implicitly defined by a simulation program (simulation-based inference; SBI). But how faithful is such inference if the simulation represents reality somewhat inaccurately—that is, if the true system behavior at test time deviates from the one seen during training? We conceptualize the types of model misspecification arising in SBI and systematically investigate how the performance of neural posterior approximators gradually deteriorates under these misspecifications, making inference results less and less trustworthy. To notify users about this problem, we propose a new misspecification measure that can be trained in an unsupervised fashion (i.e., without training data from the true distribution) and reliably detects model misspecification at test time. Our experiments clearly demonstrate the utility of our new measure both on toy examples with an analytical ground-truth and on representative scientific tasks in cell biology, cognitive decision making, and disease outbreak dynamics. We show how the proposed misspecification test warns users about suspicious outputs, raises an alarm when predictions are not trustworthy, and guides model designers in their search for better simulators.
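The abstract does not say how the proposed misspecification measure is computed. As an illustrative sketch only, the code below assumes that detection can be framed as a kernel two-sample comparison using maximum mean discrepancy (MMD) between summary statistics of simulated data and summary statistics of observed data. The function names, the Gaussian kernel, the bandwidth, and the sample sizes are all hypothetical choices made for this example and are not taken from the chapter.

```python
import numpy as np


def gaussian_kernel(x, y, bandwidth):
    """Gaussian (RBF) kernel matrix between the rows of x and the rows of y."""
    sq_dists = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd_squared(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    k_xx = gaussian_kernel(x, x, bandwidth)
    k_yy = gaussian_kernel(y, y, bandwidth)
    k_xy = gaussian_kernel(x, y, bandwidth)
    return k_xx.mean() + k_yy.mean() - 2.0 * k_xy.mean()


def misspecification_check(reference, observed, n_null=200, alpha=0.05, seed=0):
    """Compare observed summaries against simulated reference summaries.

    The null distribution is built from the simulations alone: random
    subsets of size len(observed) are held out from `reference` and
    compared against the remaining reference samples. No data from the
    true process is needed to calibrate the threshold.
    """
    rng = np.random.default_rng(seed)
    n = len(observed)
    null_stats = []
    for _ in range(n_null):
        idx = rng.permutation(len(reference))
        held_out, rest = reference[idx[:n]], reference[idx[n:]]
        null_stats.append(mmd_squared(rest, held_out))
    threshold = np.quantile(null_stats, 1.0 - alpha)
    # Use a reference set of the same size as in the null comparisons.
    stat = mmd_squared(reference[n:], observed)
    return stat, threshold, bool(stat > threshold)


# Hypothetical 2-D summary statistics standing in for simulator output.
rng = np.random.default_rng(42)
simulated = rng.normal(0.0, 1.0, size=(500, 2))
well_specified = rng.normal(0.0, 1.0, size=(50, 2))
misspecified = rng.normal(1.5, 1.0, size=(50, 2))  # shifted "real" data

stat_ok, thr_ok, flag_ok = misspecification_check(simulated, well_specified)
stat_bad, thr_bad, flag_bad = misspecification_check(simulated, misspecified)
```

In this toy setup the shifted sample yields a much larger statistic than the threshold calibrated on simulations, so it is flagged. The well-specified sample is flagged only at roughly the chosen false-alarm rate `alpha`.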
Notes
1. We demonstrate in Experiment 1 that model misspecification also affects the performance of non-amortized sequential neural posterior estimation.
Acknowledgments
MS and PCB were supported by the Cyber Valley Research Fund (grant number: CyVy-RF-2021-16) and the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy EXC-2075 - 390740016 (the Stuttgart Cluster of Excellence SimTech). UK was supported by the Informatics for Life initiative funded by the Klaus Tschira Foundation. STR was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy - EXC-2181 - 390900948 (the Heidelberg Cluster of Excellence STRUCTURES).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Schmitt, M., Bürkner, PC., Köthe, U., Radev, S.T. (2024). Detecting Model Misspecification in Amortized Bayesian Inference with Neural Networks. In: Köthe, U., Rother, C. (eds) Pattern Recognition. DAGM GCPR 2023. Lecture Notes in Computer Science, vol 14264. Springer, Cham. https://doi.org/10.1007/978-3-031-54605-1_35
DOI: https://doi.org/10.1007/978-3-031-54605-1_35
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-54604-4
Online ISBN: 978-3-031-54605-1
eBook Packages: Computer Science, Computer Science (R0)