
New approximate Bayesian computation algorithm for censored data


Abstract

Approximate Bayesian computation refers to a family of algorithms that perform Bayesian inference under intractable likelihoods. In this paper we propose replacing the distance metric in certain of these algorithms with a hypothesis test. The benefits of this approach are that summary statistics are no longer required and that censoring can be present in the observed data set without the need to simulate any censored data. We illustrate the proposed method through a nanotechnology application in which we estimate the concentration of particles in a liquid suspension. We prove that our method yields an approximation to the true posterior and that the parameter estimates are consistent. We further show, through a comparative analysis, that it is more efficient than existing methods for censored data.
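
To make the idea concrete, the following is a minimal sketch of test-based ABC rejection, not the authors' implementation: the prior sampler sample_prior() and simulator simulate(theta) are hypothetical placeholders, and a two-sample Kolmogorov-Smirnov test on complete data stands in for the censored-data test developed in the paper.

```python
import numpy as np
from scipy.stats import ks_2samp

def abc_test_rejection(observed, sample_prior, simulate, n_draws=1000, alpha=0.05):
    """ABC rejection with a hypothesis test as the accept rule (sketch).

    A candidate theta is kept when the two-sample test fails to reject
    H0: observed and simulated data come from the same distribution.
    """
    accepted = []
    while len(accepted) < n_draws:
        theta = sample_prior()        # draw a candidate from the prior
        x_sim = simulate(theta)       # simulate a data set under theta
        if ks_2samp(observed, x_sim).pvalue > alpha:
            accepted.append(theta)    # H0 not rejected: keep the candidate
    return np.array(accepted)

# Toy usage: infer the scale of an exponential sample.
rng = np.random.default_rng(0)
observed = rng.exponential(scale=2.0, size=200)
posterior_draws = abc_test_rejection(
    observed,
    sample_prior=lambda: rng.uniform(0.5, 5.0),
    simulate=lambda s: rng.exponential(scale=s, size=200),
)
print(posterior_draws.mean())  # close to the true scale, 2.0
```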




Acknowledgements

We would like to thank Dr. Magnus Röding from the Bioscience and Materials division at RISE Research Institutes of Sweden for providing us with the data used in Sect. 5.

Author information

Corresponding author

Correspondence to Nader Ebrahimi.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.


Appendix

Proof of Lemma 1

It is clear that \(pr(Y>y)=pr(X_1>y)pr(X_2>y)\). Then,

$$\begin{aligned} \begin{aligned} pr(Y+U>y)&=\int pr(Y>y-u)f_{U}(u) \mathrm{d}u \\&=\int pr(X_1>y-u)pr(X_2>y-u)f_{U}(u) \mathrm{d}u. \end{aligned} \end{aligned}$$

Also,

$$\begin{aligned} \begin{aligned} pr(\min (X_1+U, X_2+U)>y)&= \int pr(X_1+u>y)\, pr(X_2+u>y) f_{U}(u) \mathrm{d}u \\&=\int pr(X_1>y-u)pr(X_2>y-u)f_{U}(u) \mathrm{d}u, \end{aligned} \end{aligned}$$

where the first equality follows by conditioning on \(U=u\) and the independence of \(X_1\) and \(X_2\).

Thus,

$$\begin{aligned} pr(Y+U>y)=pr(\min (X_1+U, X_2+U)>y). \end{aligned}$$

\(\square \)
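
The identity can also be checked by simulation. A minimal sketch, under illustrative assumptions of our own (exponential lifetimes and a common normal measurement error \(U\)):

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(1)
n = 100_000
x1 = rng.exponential(1.0, n)             # X1
x2 = rng.exponential(2.0, n)             # X2, independent of X1
u = rng.normal(0.0, 0.5, n)              # common error U

lhs = np.minimum(x1, x2) + u             # Y + U

# Independent replication for the right-hand side.
x1b, x2b = rng.exponential(1.0, n), rng.exponential(2.0, n)
ub = rng.normal(0.0, 0.5, n)
rhs = np.minimum(x1b + ub, x2b + ub)     # min(X1 + U, X2 + U)

# A two-sample KS test finds no difference between the two distributions.
print(ks_2samp(lhs, rhs).pvalue)         # typically well above 0.05
```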

Proof of Lemma 2

The moment generating function of \(Y_i\) is \(M_{Y_i}(t)=M_{X_i}(t)M_{U_i}(t)\) for \(i=1,2\). If \(Y_1\) is stochastically equivalent to \(Y_2\), then

$$\begin{aligned} M_{Y_1}(t)=M_{X_1}(t)M_{U_1}(t) =M_{X_2}(t)M_{U_2}(t) =M_{Y_2}(t). \end{aligned}$$

Since the measurement errors \(U_1\) and \(U_2\) are identically distributed, \(M_{U_1}(t)=M_{U_2}(t)>0\), and cancelling this common factor gives \(M_{X_1}(t)=M_{X_2}(t)\); thus \(X_1\) is stochastically equivalent to \(X_2\).

Conversely, if \(X_1\) is stochastically equivalent to \(X_2\), then \(M_{X_1}(t)=M_{X_2}(t)\), so \(M_{Y_1}(t)=M_{X_1}(t)M_{U_1}(t)=M_{X_2}(t)M_{U_2}(t)=M_{Y_2}(t)\), and thus \(Y_1\) is stochastically equivalent to \(Y_2\). \(\square \)
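
As a concrete illustration (normal measurement errors are our choice here, not a requirement of the lemma), suppose \(U_1, U_2 \sim N(0,\sigma ^2)\). Then

$$\begin{aligned} M_{U_1}(t)=M_{U_2}(t)=e^{\sigma ^2 t^2/2}, \end{aligned}$$

so \(M_{Y_1}(t)=M_{Y_2}(t)\) immediately yields

$$\begin{aligned} M_{X_1}(t)=M_{Y_1}(t)e^{-\sigma ^2 t^2/2}=M_{Y_2}(t)e^{-\sigma ^2 t^2/2}=M_{X_2}(t). \end{aligned}$$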

Proof of Lemma 3

Let A denote the event that \(H_0\) is not rejected, i.e. accepted. Let T denote the event that \(H_0\) is true, and let F denote its complement, the event that \(H_0\) is false. The power of the test is \(1-\beta \), where \(\beta =pr(A|F)\) is the type II error probability.

We first look at the bias.

$$\begin{aligned} \begin{aligned} E({\hat{\theta }}| A)&=E({\hat{\theta }}|A\cap T)pr(T|A) + E({\hat{\theta }}|A\cap F)pr(F|A) \\&=E({\hat{\theta }}|A\cap T) \Big [1-pr(F|A) \Big ] + E({\hat{\theta }}|A\cap F)pr(F|A) \\&=E({\hat{\theta }}|A\cap T) \Bigg [1-\dfrac{pr(A|F)pr(F)}{pr(A)}\Bigg ] + E({\hat{\theta }}|A\cap F)\dfrac{pr(A|F)pr(F)}{pr(A)} \\&=E({\hat{\theta }}|A\cap T) \Bigg [1-\dfrac{\beta pr(F)}{pr(A)}\Bigg ] + E({\hat{\theta }}|A\cap F)\dfrac{\beta pr(F)}{pr(A)} \rightarrow \theta \quad \text { as } n\rightarrow \infty . \end{aligned} \end{aligned}$$

Since \({\hat{\theta }}\) is an asymptotically unbiased estimator of \(\theta \) when \(H_0\) is true, \(E({\hat{\theta }}|A\cap T)\rightarrow \theta \) as \(n\rightarrow \infty \). The power of the test goes to one, so \(\beta \rightarrow 0\) as \(n\rightarrow \infty \). Therefore the bias \(\theta -E({\hat{\theta }}| A) \rightarrow 0\) as \(n\rightarrow \infty \). \(\square \)
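
The mechanics of this argument can be illustrated numerically. In the sketch below (toy assumptions of our own: exponential data, \({\hat{\theta }}\) the sample mean, and a Kolmogorov-Smirnov test of \(H_0\)), replications are generated from \(H_0\) (event T) or from an alternative (event F), and \({\hat{\theta }}\) is averaged over the accepted replications only.

```python
import numpy as np
from scipy.stats import kstest

rng = np.random.default_rng(2)
theta = 2.0          # scale asserted by H0
theta_false = 3.0    # an alternative under which H0 is false

for n in (50, 500, 5000):
    means = []
    for _ in range(2000):
        # Half the replications come from H0 (event T), half from F.
        scale = theta if rng.random() < 0.5 else theta_false
        x = rng.exponential(scale, n)
        # Event A: the KS test of H0 (exponential, scale theta) does not reject.
        if kstest(x, "expon", args=(0, theta)).pvalue > 0.05:
            means.append(x.mean())
    print(n, np.mean(means))  # E(theta_hat | A) approaches theta = 2.0
```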

Proof of Theorem 1

As in Lemma 3, let A denote the event that \(H_0\) is not rejected, T the event that \(H_0\) is true, and F its complement. The power of the test is \(1-\beta \), where \(\beta =pr(A|F)\).

Applying the law of total variance, conditioning further on whether \(H_0\) is true, we have:

$$\begin{aligned} \mathrm{var}({\hat{\theta }}|A)=E\big (\mathrm{var}({\hat{\theta }} \,|\, A, \mathbb {1}_{T}) \,\big |\, A\big ) + \mathrm{var}\big ( E ({\hat{\theta }}\,|\, A, \mathbb {1}_{T}) \,\big |\, A\big ). \end{aligned}$$
(13)

By Lemma 3 the second term in Eq. (13) goes to zero as \(n\rightarrow \infty \). Now consider the first term:

$$\begin{aligned} \begin{aligned} E\big (\mathrm{var}({\hat{\theta }} \,|\, A, \mathbb {1}_{T}) \,\big |\, A\big )&=\mathrm{var}({\hat{\theta }}|A\cap T)pr(T|A) + \mathrm{var}({\hat{\theta }}|A\cap F)pr(F|A) \\&=\mathrm{var}({\hat{\theta }}|A\cap T) \Big [1-pr(F|A) \Big ] + \mathrm{var}({\hat{\theta }}|A\cap F)pr(F|A) \\&=\mathrm{var}({\hat{\theta }}|A\cap T) \Bigg [1-\dfrac{pr(A|F)pr(F)}{pr(A)}\Bigg ] \\&\quad + \mathrm{var}({\hat{\theta }}|A\cap F)\dfrac{pr(A|F)pr(F)}{pr(A)} \\&=\mathrm{var}({\hat{\theta }}|A\cap T) \Bigg [1-\dfrac{\beta pr(F)}{pr(A)}\Bigg ] + \mathrm{var}({\hat{\theta }}|A\cap F)\dfrac{\beta pr(F)}{pr(A)}. \end{aligned} \end{aligned}$$

We know that \(\mathrm{var}({\hat{\theta }}|A\cap T) \rightarrow 0\) as \(n\rightarrow \infty \), because \({\hat{\theta }}\) is a consistent estimator when \(H_0\) is true. The power of the test goes to one, so \(\beta \rightarrow 0\) as \(n\rightarrow \infty \). Thus, \(E\big (\mathrm{var}({\hat{\theta }} \,|\, A, \mathbb {1}_{T}) \,\big |\, A\big )\rightarrow 0 \text { as } n\rightarrow \infty \). Therefore, conditional on \(H_0\) not being rejected, \({\hat{\theta }}\) is a consistent estimator. \(\square \)
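
The same toy setup used after Lemma 3 (our assumptions, not the paper's) also illustrates the variance claim: the conditional variance of the accepted estimates shrinks as \(n\) grows.

```python
import numpy as np
from scipy.stats import kstest

rng = np.random.default_rng(3)
theta, theta_false = 2.0, 3.0   # H0 scale and an alternative

for n in (50, 500, 5000):
    acc = []
    for _ in range(2000):
        scale = theta if rng.random() < 0.5 else theta_false
        x = rng.exponential(scale, n)
        if kstest(x, "expon", args=(0, theta)).pvalue > 0.05:
            acc.append(x.mean())
    print(n, np.var(acc))  # var(theta_hat | A) shrinks toward 0
```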

Proof of Theorem 2

Let \(\{ \theta ^*_i\}_{i=1}^N\) be the resulting sequence of the algorithm. Each \( \theta ^*_i\) is an independent draw from \(f(\theta |A)\), where A represents the event that \(H_0\) is not rejected. Then

$$\begin{aligned} f(\theta ^*_i) \propto \displaystyle \sum _ {x^* \in {\mathcal {F}}} f(x^*|\theta ^*_i) \pi (\theta ^*_i) \mathbb {1}_{A} \propto \displaystyle \sum _ {x^*:~ A} f(x^*|\theta ^*_i) \pi (\theta ^*_i) \propto \pi _{A}(\theta ^*_i|x), \end{aligned}$$

where \({\mathcal {F}}\) is a \(\sigma \)-algebra on some given set \(\varOmega \), \(x^*\) is simulated data, \(f(x^*| \cdot )\) is the model for simulated data, and \(\pi (\cdot )\) is the prior distribution. The resulting approximate posterior distribution \(\pi _{A}(\theta ^*_i|x)\) depends on the test used to decide whether or not to reject the null hypothesis.

As the sample size of the observed data set approaches infinity, the sample size of the simulated data set also approaches infinity. Then the power of the test approaches one,

$$\begin{aligned} 1-\beta = pr(H_0~\mathrm{is}~\mathrm{rejected}~ |~H_0~\mathrm{is}~\mathrm{false}) \rightarrow 1. \end{aligned}$$

This means that when the null hypothesis is in fact false, it will be rejected, and the corresponding candidate parameter value will not be taken into the resulting sequence of parameter values. Wrong parameter values are thus excluded from the sequence entirely. Hence, the resulting posterior distribution converges to the true posterior,

$$\begin{aligned} \pi _{A}(\theta ^*_i|x) \rightarrow \pi (\theta ^*_i|x), \end{aligned}$$

as the sample sizes of the observed and simulated data sets approach infinity. \(\square \)
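
The driving fact, that the power of a reasonable two-sample test tends to one as the sample sizes grow, is easy to verify empirically. A sketch using a Kolmogorov-Smirnov test and two different exponential distributions (our illustrative choices):

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(4)

# H0 is false: the two samples come from different distributions.
for n in (20, 100, 500, 2000):
    rejections = sum(
        ks_2samp(rng.exponential(1.0, n), rng.exponential(1.5, n)).pvalue < 0.05
        for _ in range(1000)
    )
    print(n, rejections / 1000)  # empirical power, increasing toward 1
```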


About this article


Cite this article

McCullough, K., Dmitrieva, T. & Ebrahimi, N. New approximate Bayesian computation algorithm for censored data. Comput Stat 37, 1369–1397 (2022). https://doi.org/10.1007/s00180-021-01167-3

