Abstract
We present a large deviations analysis for the performance of an interacting particle method for rare event estimation. The analysis is restricted to a one-dimensional setting, though even in this restricted setting a number of new techniques must be developed. In contrast to the large deviations analyses of related algorithms, for interacting particle schemes it is an occupation measure analysis that is relevant, and within this framework many standard assumptions (stationarity, Feller property) can no longer be assumed. The methods developed are not limited to the question of performance analysis, and in fact give the full large deviations principle for such systems.
References
Breiman, L.: Probability Theory. Addison-Wesley, Reading, MA (1968)
Dean, T., Dupuis, P.: Splitting for rare event simulation: a large deviations approach to design and analysis. Stoch. Proc. Appl. 119, 562–587 (2009)
Dean, T., Dupuis, P.: The design and analysis of a generalized RESTART/DPR algorithm for rare event simulation. Ann. Oper. Res. 189, 63–102 (2011)
Del Moral, P., Garnier, J.: Genealogical particle analysis of rare events. Ann. Appl. Probab. 15, 2496–2534 (2005)
Del Moral, P., Lezaud, P.: Branching and interacting particle interpretations of rare event probabilities. In: Blom, H.A.P., Lygeros, J. (eds.) Stochastic Hybrid Systems. Lecture Notes in Control and Information Science, vol. 337. Springer, Berlin (2006)
Donsker, M.D., Varadhan, S.R.S.: Asymptotic evaluation of certain Markov process expectations for large time, I. Commun. Pure Appl. Math. 28, 1–47 (1975)
Donsker, M.D., Varadhan, S.R.S.: Asymptotic evaluation of certain Markov process expectations for large time, II. Commun. Pure Appl. Math. 28, 279–301 (1975)
Dupuis, P., Ellis, R.S.: A Weak Convergence Approach to the Theory of Large Deviations. Wiley, New York (1997)
Dupuis, P., Zeitouni, O.: A nonstandard form of the rate function for the occupation measure of a Markov chain. Stoch. Proc. Appl. 61, 249–261 (1996)
Garvels, M.J.J.: The splitting method in rare event simulation. PhD thesis, University of Twente, The Netherlands (2000)
Garvels, M.J.J., Kroese, D.P.: A Comparison of RESTART Implementation. IEEE, Washington, DC (1998)
Glasserman, P., Heidelberger, P., Shahabuddin, P., Zajic, T.: A large deviations perspective on the efficiency of multilevel splitting. IEEE Trans. Autom. Control 43, 1666–1679 (1998)
Glasserman, P., Heidelberger, P., Shahabuddin, P., Zajic, T.: Multilevel splitting for estimating rare event probabilities. Oper. Res. 47, 585–600 (1999)
Haraszti, Z., Townsend, J.K.: The theory of direct probability redistribution and its application to rare event simulation. ACM Trans. Model. Comput. Simul. 9, 105–140 (1999)
Kahn, H., Harris, T.E.: Estimation of particle transmission by random sampling. Natl. Bur. Stand. Appl. Math. Ser. 12, 27–30 (1951)
Kushner, H.J.: Weak Convergence Methods and Singularly Perturbed Stochastic Control and Filtering Problems, Vol. 3 of Systems and Control. Birkhäuser, Boston (1990)
Meyn, S.P., Tweedie, R.L.: Markov Chains and Stochastic Stability, 2nd edn. Cambridge University Press, New York (2009)
Villen-Altamirano, M., Villen-Altamirano, J.: RESTART: A method for accelerating rare event simulations. In: Proceedings of the 13th International Teletraffic Congress, Queueing, Performance and Control in ATM, Elsevier, Amsterdam, pp. 71–76 (1991)
Villen-Altamirano, M., Villen-Altamirano, J.: RESTART: a straightforward method for fast simulation of rare events. In: Proceedings of the 1994 Winter Simulation Conference, pp. 282–289 (1994)
Villen-Altamirano, M., Villen-Altamirano, J.: Analysis of RESTART simulation: theoretical basis and sensitivity study. Eur. Trans. Telecommun. 13, 373–385 (2002)
Acknowledgments
We would like to thank the referee for a number of suggestions that improved the paper. The study of Yi Cai was supported in part by the National Science Foundation (DMS-0706003) and the Air Force Office of Scientific Research (FA9550-09-1-0378). The study of Paul Dupuis was supported in part by the National Science Foundation (DMS-0706003 and DMS-1008331), the Army Research Office (W911NF-09-1-0155), and the Air Force Office of Scientific Research (FA9550-09-1-0378).
Appendix
Proof of Lemma 5.2
Let \(t\in [0,1]\) be given, and let \(\theta _{i}=\theta _{i}(t)\) be iid with distribution \(\mu (\cdot |t)\) (for simplicity we omit the \(t\)-dependence of \(\theta _{i}\)). For \(v\in [0,1)\), define
Then with the understanding that \(v+\sum _{k=1}^{\ell }\theta _{k}<1\) for all \(\ell <\infty \) means \(R_{v}=\Delta \), the distribution of \(R_{v}\) is \( \gamma (\cdot |v,t).\) Thus for a bounded and continuous function \(f:\mathbb{R }\rightarrow \mathbb{R }\)
Suppose that \(v\) is not an element of \((0,1]\cap \left[ \mathbb{Z }/q\right] .\) Then for each \(i\) the indicator functions in the last display are continuous at \(v\) w.p.1, and so by the Lebesgue Dominated Convergence Theorem, if \(v_{j}\rightarrow v\) as \(j\rightarrow \infty \) then
The analogous argument applies when \(v\in (0,1]\cap \left[ \mathbb{Z }/q \right] \) if we restrict to \(v_{j}\downarrow v\), which proves the right continuity. \(\square \)
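The one-sided continuity in this proof can be checked by hand on a toy case. The following is a minimal sketch, not the construction of the paper: it assumes that \(R_{v}\) is the value of \(v+\sum _{k=1}^{\ell }\theta _{k}\) at the first \(\ell \) for which this sum is at least 1, and it uses a hypothetical lattice law with \(q=4\) (increments 1/4 and 1/2, each with probability 1/2). The names `stopped_law`, `expect`, `STEPS` and `f` are illustrative only.

```python
from fractions import Fraction


def stopped_law(v, steps, level=Fraction(1)):
    """Exact law of v + theta_1 + ... + theta_L, where L is the first index
    at which the partial sum is >= level (an assumed reading of R_v).

    `steps` maps each strictly positive increment to its probability.
    """
    law = {}
    stack = [(v, Fraction(1))]  # (current position, probability of the path)
    while stack:
        pos, prob = stack.pop()
        for step, p in steps.items():
            nxt = pos + step
            if nxt >= level:
                # The path stops here; accumulate its probability.
                law[nxt] = law.get(nxt, Fraction(0)) + prob * p
            else:
                # The path has not yet reached the level; keep extending it.
                stack.append((nxt, prob * p))
    return law


def expect(f, law):
    """Integral of f against a finitely supported law."""
    return sum(f(x) * p for x, p in law.items())


# Hypothetical lattice increments with q = 4: 1/4 or 1/2, each w.p. 1/2.
STEPS = {Fraction(1, 4): Fraction(1, 2), Fraction(1, 2): Fraction(1, 2)}


def f(x):
    """A bounded continuous test function."""
    return min(x, Fraction(2))


if __name__ == "__main__":
    eps = Fraction(1, 10**6)
    for v in (Fraction(1, 2) - eps, Fraction(1, 2), Fraction(1, 2) + eps):
        print(float(v), float(expect(f, stopped_law(v, STEPS))))
```

In this toy case the expectation equals 17/16 at the lattice point \(v=1/2\) and moves by only \(\varepsilon \) when \(v\) is shifted to \(1/2+\varepsilon \), whereas approaching from the left gives values near 43/32. This is the behaviour the lemma describes: continuity from the right everywhere, with a possible jump from the left at points of the form \(k/q\).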
Proof of Lemma 5.3
To simplify notation we omit the \(t\) dependence. We need to show for bounded and continuous \(g\) and \(h\) that
Now \(f(v)=\int h(y)\gamma (\text{ d}y|v)\) is continuous from the right for all \(v\) and from the left save (possibly) those \(v\) of the form \(k/q.\) Let \(f^{\delta }(v)\) be continuous, have the same uniform bound as \(f\), and be equal to \(f\) outside \(\cup _{k=1}^{q}([k/q]-\delta ,[k/q]).\) Then
Since \(\mu _{n}\Rightarrow \mu \) implies \(\mu (\cup _{k=1}^{q}([k/q]-\delta ,[k/q]))\le \varepsilon \),
and since \(f^{\delta }(x)\) is continuous, \(\mu _{n}\Rightarrow \mu \) implies
The result now follows since \(\varepsilon >0\) is arbitrary. \(\square \)
Proof of Lemma 7.7
Part 1. For fixed \(T<\infty \), consider a modification of \(\nu _{j,i}^{m}\), such that if \(\bar{\sigma }_{j}^{m}<T,\) then we redefine \(\nu _{j,i}^{m}=\mu \) for \(\bar{\sigma }_{j}^{m}<i\le T-1.\) With this modified definition of the control, which does not change the distribution of the hitting time on the set \(\{\bar{\sigma }_{j}^{m}<T\}\), we have
where \(\mathbb{Q }^{T}\) denotes the modified joint measure on the space of increments, \(\mathbb{Q }^{T}|_{\left\{ 0,\ldots ,T-1\right\} }\) denotes the restriction to the first \(T\) coordinates in the underlying product space, and \(\left. \mathbb{P }\right| _{\left\{ 0,\ldots ,T-1\right\} }\) denotes product measure with marginal \(\mu .\)
We now consider the disjoint, finite partition of the space \(\mathbb{R }^{T}=A_{j}^{T}\cup B_{j}^{T}\cup D_{j}^{T}\), where
Using the approximation property of relative entropy via sums over finite measurable partitions [8, Lemma 1.4.3(g)], we obtain
where \(\tau _{j}^{m,T},\alpha _{j}^{m,T}\in \mathcal P \left( \left\{ (-\infty ,-1]\cup \left[ j+1,\infty \right) ,\Sigma \right\} \right) \) are the measures induced by \(\mathbb{Q }^{T}|_{\left\{ 0,\ldots ,T-1\right\} }\) and \(\mathbb{P }|_{\left\{ 0,\ldots ,T-1\right\} }\), and \(\Sigma \) corresponds to the event \(\sigma _{j}^{m}\ge T.\)
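The partition step rests on the fact that replacing two measures by the measures they induce on a finite measurable partition can only decrease relative entropy. A minimal numerical sketch on a four-point space follows; the distributions and the partition are hypothetical choices, and `relative_entropy` and `coarsen` are illustrative helper names rather than anything defined in the paper.

```python
import math


def relative_entropy(tau, alpha):
    """R(tau || alpha) for probability vectors on a common finite set."""
    total = 0.0
    for t, a in zip(tau, alpha):
        if t == 0.0:
            continue  # convention 0 * log 0 = 0
        if a == 0.0:
            return math.inf  # tau is not absolutely continuous w.r.t. alpha
        total += t * math.log(t / a)
    return total


def coarsen(p, partition):
    """Measure induced by p on the cells of a partition of the index set."""
    return [sum(p[i] for i in cell) for cell in partition]


# Hypothetical example data.
TAU = [0.1, 0.2, 0.3, 0.4]
ALPHA = [0.25, 0.25, 0.25, 0.25]
PARTITION = [[0, 1], [2], [3]]

if __name__ == "__main__":
    full = relative_entropy(TAU, ALPHA)
    coarse = relative_entropy(coarsen(TAU, PARTITION), coarsen(ALPHA, PARTITION))
    print(coarse, "<=", full)
```

The same helper returns `math.inf` as soon as the first argument charges a cell to which the second assigns zero mass, which is the mechanism behind the dichotomy on \(\bar{\tau }_{j}^{m}(\Sigma )\) used later in Part 1.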
We know that \(\alpha _{j}^{m,T}(\Sigma )\rightarrow 0\) as \(T\rightarrow \infty .\) Let \(\bar{\alpha }_{j}^{m}\) denote the extension of \(\alpha _{j}^{m} \) to \(\left\{ (-\infty ,-1]\cup \left[ j+1,\infty \right) ,\Sigma \right\} \) with \(\bar{\alpha }_{j}^{m}(\Sigma )=0.\) Let \(\bar{\tau }_{j}^{m}\) denote the limit of \(\tau _{j}^{m,T}\), which must exist by monotonicity. By the lower semi-continuity of relative entropy,
If \(\bar{\tau }_{j}^{m}(\Sigma )>0\) then \(R(\bar{\tau }_{j}^{m}{\parallel }\bar{\alpha }_{j}^{m})=\infty \), and if \(\bar{\tau }_{j}^{m}(\Sigma )=0\) then \(R(\bar{\tau }_{j}^{m}{\parallel }\bar{\alpha } _{j}^{m})=R(\tau _{j}^{m}{\parallel }\alpha _{j}^{m}).\) The last sentence and (9.1), (9.2) and (9.3) imply
Part 2. Define
Since by assumption \(U(x;y)\) is bounded from above, according to [8, Proposition 4.5.1]
Since \(R(\tau {\parallel }\alpha _{j}^{m})<\infty \), by the definition of \(U\)
achieves the infimum in the variational formula, i.e.,
We next consider the analogous relations on the space of increments of the process. We have
where \(\bar{\sigma }_{j}^{m}\) denotes the first hitting time of the process and \(\mathbb{P }\) is the underlying product probability with marginal \(\mu .\) The definition of \(\alpha _{j}^{m}\) implies that for any integrable function \(K\)
Also, the minimum is achieved at
Since the first hitting probability \(\alpha _{j}^{m}\) is induced by \(\mathbb{P }\), (9.4) and (9.5) imply
Hence, by the chain rule
where \(\{\nu _{j,i}^{m}\}\) are the controls defined by factoring \(\mathbb{Q }.\)
Combining this with the bound proved in part 1, we have constructed controls and a controlled process \(\{\bar{Z}_{j,i}^{m}\}\) satisfying
where \(\tau \) is the distribution induced by the stopped, controlled process. Although here we have constructed the control on the canonical space \(\mathbb{R }^{\infty }\), it can be applied in the general setting by making the obvious identifications. \(\square \)
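Part 2 relies on a variational formula for relative entropy together with an explicit minimizer. The sketch below checks the standard finite-state version, \(-\log \sum _{i}\alpha _{i}e^{-U_{i}}=\min _{\tau }[R(\tau {\parallel }\alpha )+\sum _{i}\tau _{i}U_{i}]\), with minimizer \(\tau _{i}^{*}\propto \alpha _{i}e^{-U_{i}}\). It is offered as an illustration under the assumption that this is the type of formula invoked; the particular \(U\) of the proof is not reproduced, and `ALPHA`, `U`, `cost` and `gibbs_minimizer` are hypothetical names and data.

```python
import math
import random


def relative_entropy(tau, alpha):
    """R(tau || alpha) for probability vectors on a common finite set."""
    total = 0.0
    for t, a in zip(tau, alpha):
        if t == 0.0:
            continue
        if a == 0.0:
            return math.inf
        total += t * math.log(t / a)
    return total


def cost(tau, alpha, u):
    """Objective R(tau || alpha) + integral of u against tau."""
    return relative_entropy(tau, alpha) + sum(t * x for t, x in zip(tau, u))


def gibbs_minimizer(alpha, u):
    """Candidate minimizer tau* proportional to alpha * exp(-u), and the
    value -log sum(alpha * exp(-u)) that it should attain."""
    weights = [a * math.exp(-x) for a, x in zip(alpha, u)]
    z = sum(weights)
    return [w / z for w in weights], -math.log(z)


# Hypothetical example data.
ALPHA = [0.5, 0.3, 0.2]
U = [0.0, 1.0, -2.0]

if __name__ == "__main__":
    tau_star, value = gibbs_minimizer(ALPHA, U)
    print("value of the formula:", value)
    print("cost at the minimizer:", cost(tau_star, ALPHA, U))
    rng = random.Random(0)
    raw = [rng.random() for _ in ALPHA]
    other = [r / sum(raw) for r in raw]
    print("cost at a random tau:", cost(other, ALPHA, U))
```

Evaluating the objective at the tilted measure reproduces the left-hand side, and any other probability vector gives a cost at least as large.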
Proof of Proposition 7.10
Since \(\{{\hat{\mathbf{X }}}_{j}^{m}\}\) is tight, given \(\eta >0\), there is a compact set \(S_{1}\subset \mathcal T _{1}\) such that \(\mathbb{P }\{{\hat{\mathbf{X }}}_{j}^{m}\notin S_{1}\}\le \eta \) for all \(j\le m\), including the index that first exceeds \(i/\kappa .\) Using standard results from stochastic stability, there is a compact set \(S_{2}\subset \mathcal T _{1}\) such that, after starting in \(S_{1}\), the probability of escaping from \(S_{2}\) is also less than \(\eta .\)
To simplify the presentation, we denote the initial value of \(j\in [i/\kappa ,(i+1)/\kappa )\) by zero and the final value by \(m\), and write \(\hat{J}^{m}\) for \(\hat{J}_{i}^{m}.\) By the geometric ergodicity of \(\bar{p}_{i}^{\kappa }\) and the compactness of \(S_{2}\)
uniformly in \(\mathbf x \in S_{2}\), where \(\bar{p}_{i}^{\kappa ,(k)}\) is the \(k\)-fold convolution of \(\bar{p}_{i}^{\kappa }.\) Let \(\delta >0\), and assume that \(k\) is large enough that for such \(\mathbf x \)
Let \(\hat{p}_{j,\ell }^{m}\) denote the convolution of the transition kernels \(\hat{p}_{jk+\ell }^{m},\hat{p}_{jk+\ell +1}^{m},\ldots ,\hat{p}_{(j+1)k+\ell -1}^{m}\), so that if \({\hat{\mathbf{X }}}_{kj+\ell }^{m}=\mathbf x \) then the distribution of \({\hat{\mathbf{X }}}_{k(j+1)+\ell }^{m}\) is given by \(\hat{p} _{j,\ell }^{m}(\mathbf x ,\cdot ).\)
Let \(\zeta >0\) be given. For \(m\) sufficiently large we know that
Iterating (9.7) gives
for all \(\ell \) and all \(\mathbf x \in S_{2}\), and thus
for all \(\ell \) and all \(\mathbf x \in S_{2}.\)
Let \(k\in \mathbb N \) be given. We will break the empirical measure \(\hat{J} ^{m}\) into \(k\) sums, and it follows from the fact that \(k\) will be fixed as \( m\rightarrow \infty \) that we can consider just those \(m\) of the form \( rk,r\in \mathbb N .\) We then write
where the last equality defines the \(\hat{J}_{\ell }^{m}.\)
Now fix \(\ell \in \left\{ 0,1,\ldots ,k-1\right\} \) and consider any bounded measurable function \(f:\mathcal T _{1}\rightarrow \mathbb R \) with \( \left\| {f}\right\| _{\infty }\le 1.\) As usual, we have that
is a martingale difference sequence, and thus by (9.8)
where \(|\varepsilon _{1,j}^{m}|\le k\zeta +\delta \) and \(\varepsilon _{2,j}^{m}\) has conditional mean zero and uniformly (in \(m\) and \(j\)) bounded second moment. It follows using a standard calculation that
Letting first \(m\rightarrow \infty \) shows that weak limits of the \(\hat{J} _{\ell }^{m}\) are all within \(2\left[ k\zeta +\delta \right] \) of \(\bar{ \lambda }_{i}^{\kappa }\) in total variation norm save on a set of probability no more than \(2\eta .\) We now send \(\zeta \downarrow 0\), then \(\delta \downarrow 0\), and finally \(\eta \downarrow 0\) to complete the proof. \(\square \)
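The role of geometric ergodicity in this argument, namely that the \(k\)-fold convolution of the kernel is uniformly close to the invariant distribution once \(k\) is large, can be illustrated on a finite state space. The sketch below uses a hypothetical three-state transition matrix `P` (not taken from the paper) and measures distance with the unnormalized total variation norm \(\sum _{y}|\mu (y)-\nu (y)|\); all function names are illustrative.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][r] * b[r][j] for r in range(n)) for j in range(n)]
            for i in range(n)]


def k_step(p, k):
    """k-fold convolution (k-th matrix power) of the transition matrix p."""
    n = len(p)
    out = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        out = mat_mul(out, p)
    return out


def tv_norm(mu, nu):
    """Unnormalized total variation norm of mu - nu."""
    return sum(abs(m - n) for m, n in zip(mu, nu))


def stationary(p, iters=2000):
    """Invariant distribution by power iteration from the uniform law."""
    n = len(p)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * p[i][j] for i in range(n)) for j in range(n)]
    return pi


def worst_case_distance(p, k):
    """Supremum over starting states of the distance between the k-step
    law and the invariant distribution."""
    pi = stationary(p)
    return max(tv_norm(row, pi) for row in k_step(p, k))


# Hypothetical ergodic transition matrix.
P = [[0.9, 0.1, 0.0],
     [0.2, 0.7, 0.1],
     [0.1, 0.3, 0.6]]

if __name__ == "__main__":
    for k in (1, 5, 20, 100):
        print(k, worst_case_distance(P, k))
```

Fixing \(k\) so that this worst-case distance is below a given \(\delta \), and only then letting \(m\rightarrow \infty \), mirrors the order of limits used in the proof.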
Proof of Part 3 of Lemma 7.6
Part 3. Let \(h(\mathbf{x},\mathbf{y})=(\text{ d}\bar{\Lambda }/\text{ d}\Lambda )(\mathbf{x},\mathbf{y})\), where the Radon-Nikodym derivative is in \(\mathbf y .\) For \(k\in \mathbb N \) define
where \(h^{k}(\mathbf{x},\mathbf{y})=k\wedge h(\mathbf{x},\mathbf{y}).\) Also, let
for any Borel set \(A\) of \(\mathcal T _{2}.\) Obviously, \((\bar{\xi }^{k},\bar{\eta })\) is also admissible and the associated transition kernel is equivalent to \(\bar{p}(\mathbf x,\cdot ).\) Hence this transition kernel is also geometrically ergodic. Furthermore, the relative entropy
is uniformly bounded as a function of \(\mathbf x .\) Since \(S^{k}\left( \mathbf x \right) \rightarrow 0\) as \(k\rightarrow \infty \), the Dominated Convergence Theorem implies (7.7).
It is easy to check that the construction implies \(\Vert \bar{\Lambda } ^{k}(\cdot |\mathbf x )-\bar{\Lambda }(\cdot |\mathbf x )\Vert _{v}\rightarrow 0.\) It then follows from Proposition 7.10, which is stated and proved later in this section, that the first marginal of \(\bar{\xi }^{k}\) converges to \(\bar{\xi }_{1}\) in total variation, i.e., \(\Vert \bar{\xi }_{1}^{k}-\bar{\xi }_{1}\Vert _{v}\rightarrow 0.\) \(\square \)
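A truncation of this kind can be tried numerically on a finite space. The sketch below is an assumption-laden illustration and not the construction of the proof: it caps the density \(h\) at \(k\), defines the lost mass as a hypothetical deficit \(1-\int (k\wedge h)\,\text{d}\Lambda \), and returns that mass by adding it to the density uniformly with respect to \(\Lambda \). The resulting density is at most \(k+1\), and the total variation distance to \(\bar{\Lambda }\) is at most twice the deficit. The names `truncate_kernel`, `tv_norm`, `LAM` and `LAM_BAR` and the data are illustrative.

```python
def tv_norm(mu, nu):
    """Unnormalized total variation norm of mu - nu."""
    return sum(abs(m - n) for m, n in zip(mu, nu))


def truncate_kernel(lam, lam_bar, k):
    """Cap the density h = d(lam_bar)/d(lam) at k and add the lost mass
    back as a constant in the density (an assumed normalization).

    Requires lam[i] > 0 for all i. Returns the truncated probability
    vector and the deficit 1 - sum(lam * min(k, h)).
    """
    h = [b / a for a, b in zip(lam, lam_bar)]
    h_k = [min(k, x) for x in h]
    deficit = 1.0 - sum(a * x for a, x in zip(lam, h_k))
    density = [x + deficit for x in h_k]  # bounded above by k + 1
    truncated = [a * g for a, g in zip(lam, density)]
    return truncated, deficit


# Hypothetical reference and target distributions for one fixed x.
LAM = [0.5, 0.3, 0.15, 0.05]
LAM_BAR = [0.05, 0.05, 0.1, 0.8]

if __name__ == "__main__":
    for k in (1, 2, 4, 8, 16, 32):
        truncated, deficit = truncate_kernel(LAM, LAM_BAR, k)
        print(k, deficit, tv_norm(truncated, LAM_BAR))
```

As \(k\) grows the deficit shrinks to zero and the truncated measure converges to \(\bar{\Lambda }\) in total variation, while for each fixed \(k\) the bounded density keeps the relative entropy with respect to \(\Lambda \) finite.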
Cai, Y., Dupuis, P. Analysis of an interacting particle method for rare event estimation. Queueing Syst 73, 345–406 (2013). https://doi.org/10.1007/s11134-013-9344-z
DOI: https://doi.org/10.1007/s11134-013-9344-z