Abstract
We propose a conditional variational auto-encoder within Gibbs sampling (CVAE-within-Gibbs) for Bayesian linear inverse problems in which the prior or the likelihood function depends on uncertain hyperparameters. The method combines ideas from classical sampling theory with recent advances in deep generative models for approximating complicated probability distributions. Specifically, we train a CVAE model on a large dataset to learn the conditional density of the hyperparameters that appears in the original Gibbs sampler. The learned conditional density offers more flexibility than classical Gibbs sampling because it avoids specifying the hyperpriors and their hyperparameters manually or experimentally. We demonstrate the performance of the proposed method on three linear inverse problems: image deblurring, signal denoising, and boundary heat flux identification in a heat conduction problem.
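The alternating structure described above can be sketched in a few lines for a toy linear-Gaussian model \(y = Au + \text{noise}\). This is a minimal illustration, not the authors' implementation: the function `sample_theta_cvae` is a hypothetical stand-in for drawing from the trained CVAE decoder, and the closed-form Gaussian conditional for \(u\) is standard for linear-Gaussian models.

```python
# Minimal sketch of the CVAE-within-Gibbs idea for a linear-Gaussian model
# y = A u + noise. The toy decoder below is a hypothetical stand-in for the
# trained CVAE; it is NOT the paper's network.
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 6                      # dim(u), dim(y)
A = rng.standard_normal((m, n))  # forward operator
u_true = rng.standard_normal(n)
y = A @ u_true + 0.05 * rng.standard_normal(m)

def sample_u(theta, y):
    """Gibbs step 1: draw u | theta, y for a Gaussian prior N(0, theta^2 I)
    and noise N(0, s^2 I); this conditional posterior is Gaussian."""
    s = 0.05
    prec = A.T @ A / s**2 + np.eye(n) / theta**2   # posterior precision
    cov = np.linalg.inv(prec)
    mean = cov @ (A.T @ y) / s**2
    return rng.multivariate_normal(mean, cov)

def sample_theta_cvae(u, y):
    """Gibbs step 2: in the paper this samples theta from the CVAE decoder
    p(theta | z, u, y) with latent z ~ N(0, I); here a toy surrogate keeps
    the sketch self-contained."""
    z = rng.standard_normal()                       # latent draw
    return np.exp(0.1 * z) * (0.5 + 0.5 * np.std(u))  # hypothetical decoder

theta = 1.0
samples = []
for it in range(500):
    u = sample_u(theta, y)
    theta = sample_theta_cvae(u, y)
    samples.append(u)
u_mean = np.mean(samples[100:], axis=0)  # posterior mean after burn-in
```

The key design choice is that the hyperparameter step, which in classical hierarchical Gibbs requires an explicit hyperprior, is replaced by a draw from a learned conditional density.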
Acknowledgements
The work is supported by the National Natural Science Foundation of China under Grant 12101614, the Natural Science Foundation of Hunan Province, China, under Grant 2021JJ40715 and the Postgraduate Scientific Research Innovation Project of Hunan Province, China (CX20220288). We are grateful to the High Performance Computing Center of Central South University for assistance with the computations.
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Data availability
The datasets and code are available at https://github.com/YangJingya27/CVAE-within-Gibbs.
Additional information
Communicated by Vinicius Albani.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix A: Datasets
The details of the datasets for the three experiments are summarized in Table 8. Each dataset contains the unknown parameters \({\varvec{u}}\), the measurable data \({\varvec{y}}\), and the hyperparameters \(\varvec{\theta }\). In the image deblurring experiment, the variance of the prior distribution contains the hyperparameters \(\gamma \) and d, as shown in Eq. (11); the mean of the prior distribution contains \(\mu \); and s appears in the variance of the noise term. In the signal denoising and IHCP experiments, the only hyperparameter \(\sigma _\mathrm{{obs}}\) enters through the variance of the noise term, \(\varvec{\Sigma }_\mathrm{{obs}}=\sigma ^2_\mathrm{{obs}}{\textbf{I}}\). The intervals in the second column of Table 8 are the empirically determined ranges of the hyperparameters. To generate a dataset, P points are selected uniformly from each interval to form all combinations of hyperparameter values, and the pairs \(\{y^i,u^i\},i=1,\ldots ,{\bar{N}}\) are generated from each combination. The last three columns of the table give the size of the synthetic dataset, the dimension of \({\varvec{u}}\), and the dimension of \({\varvec{y}}\). For example, in the first experiment there are \(200{,}000=50\times 20\times 20\times 10\) combinations of hyperparameters, and 10 samples are randomly generated from each combination, yielding a total of 2,000,000 samples.
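The grid-then-sample construction above can be sketched as follows for the single-hyperparameter (denoising-style) case. The interval endpoints, the signal dimension, and the identity forward map are hypothetical placeholders; the paper's actual ranges and forward operators are given in Table 8.

```python
# Sketch of dataset generation: P grid points per hyperparameter interval,
# a fixed number of (y, u) samples per hyperparameter combination.
# All concrete numbers (interval, dimension, forward map) are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
P = 5                  # grid points in the hyperparameter interval
reps = 10              # samples generated per hyperparameter value
sigma_grid = np.linspace(0.01, 0.2, P)  # hypothetical range for sigma_obs

def forward(u):
    """Hypothetical identity forward map (denoising case): y = u + noise."""
    return u

dataset = []
for sigma_obs in sigma_grid:            # loop over the hyperparameter grid
    for _ in range(reps):
        u = rng.standard_normal(16)     # hypothetical signal, dimension 16
        y = forward(u) + sigma_obs * rng.standard_normal(16)
        dataset.append((y, u, sigma_obs))
# total size = (number of grid combinations) x reps
```

With several hyperparameters, the grid is the Cartesian product of the per-interval grids, which is how the \(50\times 20\times 20\times 10\) count in the deblurring example arises.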
Appendix B: Network architectures
1.1 Image deblurring
The CVAE model used for image deblurring consists of 2 fully connected hidden layers in the encoder and 3 in the decoder, with 5 Gaussian latent variables. We used a batch size of 128, 10 training epochs, and the Adam optimizer with a learning rate of \(1.7 \times 10^{-6}\). ReLU activation functions are used in both the encoder and the decoder. The number of neurons in each layer is given in Table 9.
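The encoder/decoder structure described above can be sketched at the shape level as follows. The layer widths and the condition dimension here are hypothetical (the paper's exact sizes are in Table 9), and NumPy is used to keep the sketch dependency-free, although the paper's implementation is in PyTorch.

```python
# Shape-level sketch of the deblurring CVAE: 2 encoder hidden layers,
# 3 decoder hidden layers, 5 Gaussian latent variables. Layer widths and
# the 64-dim condition are hypothetical placeholders for Table 9's values.
import numpy as np

rng = np.random.default_rng(0)
relu = lambda x: np.maximum(x, 0.0)
latent_dim = 5

def dense(d_in, d_out):
    """A linear layer as a (weight, bias) pair with small random weights."""
    return rng.standard_normal((d_in, d_out)) * 0.1, np.zeros(d_out)

# encoder: condition (u, y) -> 2 hidden layers -> (mu, log_var) of z
enc1, enc2 = dense(64, 32), dense(32, 16)
enc_mu, enc_lv = dense(16, latent_dim), dense(16, latent_dim)
# decoder: (z, condition) -> 3 hidden layers -> hyperparameters theta
dec1 = dense(latent_dim + 64, 32)
dec2, dec3 = dense(32, 16), dense(16, 4)

def encode(cond):
    h = relu(cond @ enc1[0] + enc1[1])
    h = relu(h @ enc2[0] + enc2[1])
    return h @ enc_mu[0] + enc_mu[1], h @ enc_lv[0] + enc_lv[1]

def decode(z, cond):
    h = relu(np.concatenate([z, cond]) @ dec1[0] + dec1[1])
    h = relu(h @ dec2[0] + dec2[1])
    return h @ dec3[0] + dec3[1]

cond = rng.standard_normal(64)            # flattened condition (u, y)
mu, log_var = encode(cond)
z = mu + np.exp(0.5 * log_var) * rng.standard_normal(latent_dim)  # reparameterization
theta = decode(z, cond)                   # e.g. (gamma, d, mu, s) for deblurring
```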
1.2 Inverse heat conduction problem
In the IHCP experiment, we employed the re-classification strategy discussed in Sect. 5.1.1, i.e., a classification network for \({\varvec{y}}\) is added to the CVAE. The CVAE used for the IHCP consists of 3 fully connected hidden layers in the encoder, 5 in the re-classification network, and 3 in the decoder, with 5 Gaussian latent variables. We used Xavier initialization, a batch size of 128, 40 training epochs, and the Adam optimizer with a learning rate of \(2\times 10^{-5}\) for the re-classification network and \(5.5\times 10^{-4}\) for the CVAE. Leaky ReLU activation functions are used in the encoder and decoder, and ReLU activation functions are used in the re-classification network. The number of neurons in each layer is given in Table 9.
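The Xavier (Glorot) initialization mentioned above draws each weight matrix so that the variance of activations is roughly preserved across layers. A minimal sketch of the uniform variant, with hypothetical layer sizes:

```python
# Xavier (Glorot) uniform initialization: weights drawn from
# U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out)).
# The layer sizes (128, 64) are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

def xavier_uniform(fan_in, fan_out):
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

W = xavier_uniform(128, 64)
```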
1.3 Signal denoising
In the signal denoising experiment, we also employed the re-classification strategy discussed in Sect. 5.1.1. The CVAE model used for signal denoising consists of 5 fully connected hidden layers in the encoder and 7 in the decoder, with 5 Gaussian latent variables. For the re-classification network, we used a ResNet with 3 residual blocks and a linear layer; each residual block has 2 fully connected layers with a dropout probability of p = 0.2. We used Xavier initialization and Leaky ReLU activation functions in all three networks, a batch size of 64, 40 training epochs, the RMSprop optimizer with a learning rate of \(1\times 10^{-5}\) for the re-classification network, and the NAdam optimizer with a learning rate of \(1\times 10^{-4}\) for the CVAE. The number of neurons in each layer is given in Table 9.
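One residual block of the re-classification ResNet described above can be sketched as follows. The width is hypothetical, and inverted dropout is assumed as the standard training-time formulation.

```python
# Sketch of one fully connected residual block: two linear layers,
# Leaky ReLU, dropout p = 0.2, identity skip connection.
# The width d = 32 is a hypothetical placeholder for Table 9's sizes.
import numpy as np

rng = np.random.default_rng(0)
leaky_relu = lambda x, a=0.01: np.where(x > 0, x, a * x)

def residual_block(x, W1, b1, W2, b2, p=0.2, train=True):
    """out = x + Dropout(LeakyReLU(x W1 + b1)) W2 + b2."""
    h = leaky_relu(x @ W1 + b1)
    if train:
        mask = rng.random(h.shape) >= p    # inverted dropout: scale kept units
        h = h * mask / (1.0 - p)
    return x + h @ W2 + b2                 # identity skip connection

d = 32
W1, b1 = rng.standard_normal((d, d)) * 0.05, np.zeros(d)
W2, b2 = rng.standard_normal((d, d)) * 0.05, np.zeros(d)
x = rng.standard_normal(d)
out = residual_block(x, W1, b1, W2, b2)
```

The skip connection requires the block's input and output widths to match, which is why the two linear layers here share the width d.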
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Yang, J., Niu, Y. & Zhou, Q. A CVAE-within-Gibbs sampler for Bayesian linear inverse problems with hyperparameters. Comp. Appl. Math. 42, 138 (2023). https://doi.org/10.1007/s40314-023-02279-w