Abstract
The adaptive last particle method is a simple and interesting alternative in the class of general splitting algorithms for estimating tail distributions. We consider this algorithm in the space of trajectories and for general reaction coordinates. Using a combinatorial approach in discrete state spaces, we establish two new results. First, we give the exact expression of the distribution of the number of iterations in a perfect version of the algorithm where trajectories are i.i.d. This result improves on previously known results when the cumulative distribution function has discontinuities. Second, we show that an effective computational version of the algorithm, where trajectories are no longer i.i.d., follows the same statistics as the idealized version when the reaction coordinate is the committor function.







References
Au, S.K., Beck, J.L.: Estimation of small failure probabilities in high dimensions by subset simulation. Probab. Eng. Mech. 16(4), 263–277 (2001)
Au, S.K., Beck, J.L.: Subset simulation and its application to seismic risk based on dynamic analysis. J. Eng. Mech. 129(8), 901–917 (2003)
Botev, Z.I., L'Ecuyer, P., Rubino, G., Simard, R., Tuffin, B.: Static network reliability estimation via generalized splitting. INFORMS J. Comput. (2012). doi:10.1287/ijoc.1110.0493
Botev, Z.I., L’Ecuyer, P., Tuffin, B.: Dependent failures in highly-reliable static networks. In: Proceedings of the 2012 Winter Simulation Conference, IEEE Press, paper con250, pp. 430–441 (2012)
Botev, Z.I., Kroese, D.P.: An efficient algorithm for rare-event probability estimation, combinatorial optimization, and counting. Methodol. Comput. Appl. Probab. 10(4), 471–505 (2008)
Botev, Z.I., Kroese, D.P.: Efficient Monte Carlo simulation via the generalized splitting method. Stat. Comput. 22(5), 1031–1040 (2011)
Bouchet, F., Laurie, J., Zaboronski, O.: Langevin dynamics, large deviations and instantons for the quasi-geostrophic model and two-dimensional Euler equations. J. Stat. Phys. (2014) (in press)
Cérou, F., Del Moral, P., Le Gland, F., Lezaud, P.: Genetic genealogical models in rare event analysis. Alea 1, 181–203 (2006)
Cérou, F., Guyader, A.: Adaptive multilevel splitting for rare events analysis. Stoch. Anal. Appl. 25, 417–443 (2007)
Cérou, F., Del Moral, P., Furon, T., Guyader, A.: Sequential Monte Carlo for rare event estimation. Stat. Comput. 22, 795–808 (2012)
Cerquetti, A.: A simple proof of a generalization of the Chu–Vandermonde identity. arXiv:1012.1243 [math.PR], 1–4 (2010)
Del Moral, P.: Feynman–Kac Formulae: Genealogical and Interacting Particle Systems with Applications. Probability and its Applications. Springer-Verlag, New York (2004)
Favaro, S., Prünster, I., Walker, S.G.: On a generalized Chu–Vandermonde identity. Methodol. Comput. Appl. Probab. 14, 253–262 (2010)
Garvels, M.J.J.: The splitting method in rare event simulation. Ph.D. thesis, University of Twente (2000). http://www.math.utwente.nl/dos/sor/AiOs/Garvels/Thesis
Glasserman, P., Heidelberger, P., Shahabuddin, P., Zajic, T.: Multilevel splitting for estimating rare event probabilities. Oper. Res. 47(4), 585–600 (1999)
Guyader, A., Hengartner, N., Matzner-Løber, E.: Simulation of extreme quantiles and extreme probabilities. Appl. Math. Optim. 64, 171–196 (2011)
Kahn, H., Harris, T.: Estimation of particle transmission by random sampling. Natl. Bur. Stand., Appl. Math. Ser. 12, 27–30 (1951)
Maragliano, L., Cottone, G., Ciccotti, G., Vanden-Eijnden, E.: Mapping the network of pathways of CO diffusion in myoglobin. J. Am. Chem. Soc. 132(3), 1010–1017 (2010)
Rosenbluth, M., Rosenbluth, A.: Monte Carlo calculation of the average extension of molecular chains. J. Chem. Phys. 23(2), 356–359 (1955)
Rubinstein, R.: The Gibbs cloner for combinatorial optimization, counting and sampling. Methodol. Comput. Appl. Probab. 11(4), 491–549 (2009)
Vanden-Eijnden, E., Venturoli, M., Ciccotti, G., Elber, R.: On the assumptions underlying milestoning. J. Chem. Phys. 129, 174102 (2008)
Villén-Altamirano, M., Villén-Altamirano, J.: RESTART: a straightforward method for fast simulation of rare events. In: Tew, J.D., Manivannan, M.S., Sadowski, D.A., Seila, A.F. (eds.) Proceedings of the 1994 Winter Simulation Conference, Orlando, pp. 282–289 (1994)
Appendix
1.1 Preliminary notations
We recall here the classical definitions of the negative binomial laws of the first and second kind, as well as the multinomial law; a small numerical sketch follows the list:
-
\(X \sim \mathrm{NegBin}_1(N,p)\):
$$\begin{aligned} \Pr (X = k) = \left( {\begin{array}{c}k+N-1\\ k\end{array}}\right) (1-p)^N p^k,~k \ge 0. \end{aligned}$$
Assume one has a sequence of (Bernoulli) trials with \(p\) the probability of success and \(1-p\) the probability of failure. The number of successes observed after \(N\) failures follows \(\mathrm{NegBin}_1(N,p)\).
-
\(X \sim \mathrm{NegBin}_2(N,p)\):
$$\begin{aligned} \Pr (X = k) = \left( {\begin{array}{c}k-1\\ k-N\end{array}}\right) (1-p)^N p^{k-N},~k \ge N. \end{aligned}$$
In this case, one is looking for the number of Bernoulli trials (instead of successes) needed to obtain \(N\) failures.
-
\((X_1,\cdots, X_n) \sim \mathcal{M}(N,(p_k)_{1 \le k \le n})\), with \(p_1 + \cdots + p_n = 1\) and \(k_1 + \cdots + k_n = N\):
$$\begin{aligned} \Pr ((X_1,\cdots, X_n) = (k_1,\cdots, k_n)) = \frac{N!}{k_1!\cdots k_n!} p_1^{k_1} \cdots p_n^{k_n}. \end{aligned}$$
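For concreteness, these definitions can be checked by a quick simulation. The following is a minimal sketch in Python (the values of \(N\) and \(p\) are arbitrary illustration choices); it verifies the trial-counting interpretation of \(\mathrm{NegBin}_1\) and the shift relation between the two kinds, since the number of trials equals the number of successes plus the \(N\) failures.

import numpy as np

rng = np.random.default_rng(0)
N, p, n_rep = 5, 0.3, 100_000   # arbitrary illustration values

def negbin1_draw():
    # Count successes (probability p) observed before the N-th failure.
    successes, failures = 0, 0
    while failures < N:
        if rng.random() < p:
            successes += 1
        else:
            failures += 1
    return successes

draws = np.array([negbin1_draw() for _ in range(n_rep)])
print(draws.mean(), N * p / (1 - p))    # NegBin_1(N,p): empirical vs. exact mean
print((draws + N).mean(), N / (1 - p))  # NegBin_2(N,p): trials = successes + N failures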
For positive integers satisfying \(\sum _{j=1}^k a_j \le a\), we denote
Note that this differs from the notation with \(a_1,\cdots , a_{k+1}\) and \(\sum _j a_j = a\).
1.2 Lemma 1
Proof
We proceed by induction. The relation is trivial for \(n=0\). Assume the identity holds at level \(n\), consider some arbitrary \(p \ge n+1\), and examine the expression at level \(n+1\). Denoting \(S_{n+1} := {\displaystyle \sum \nolimits _{j=0}^{n+1} \beta _{pj}} = S_n + \beta _{p,n+1}\), we can write:
We apply the level \(n\) identity to the inner sum so that one has
We now use the general fact that
Plugging this expression into the general formula gives
which concludes the proof using the classical Chu–Vandermonde identity \({\displaystyle \sum \nolimits _{k} \left( {\begin{array}{c}m\\ k\end{array}}\right) \left( {\begin{array}{c}n\\ p-k\end{array}}\right) = \left( {\begin{array}{c}m+n\\ p\end{array}}\right) }\). Note that the proof can also be obtained in a more direct way using Pochhammer symbols [see e.g. Cerquetti (2010)].\(\square \)
Proof of Theorem 1 Let us consider a matrix description of the problem, with a triangular matrix \(T\) of random (not independent) variables of size \((n+2) \times (n+2)\)
and a corresponding matrix of discrete probabilities
with
and \((\mathcal{P}_k)_{n+1} = \frac{\textstyle F_{n+1,k-1}}{\textstyle F_{k,k-1}}\). We use the convention that \(f_{j,-1} = f_{j0}\), \(F_{j,-1} = F_{j,0}\) and \(\mathcal{N}_{-1} = N\). We also use the notation \(|\mathcal{B}_k| = \sum _{j=0}^{n+1} (\mathcal{B}_k)_j = \sum _{j=k}^{n+1} B_{jk}\); moreover, \(\mathcal{B}_k^\star \) denotes the \(k^\mathrm{th}\) column of \(^t\!T\), the transpose of \(T\), and \(|\mathcal{B}_k^\star | = \sum _{j = 0}^k B_{kj}\). For easier reading, we write the matrix \(T\) explicitly together with the row and column sums \(|\mathcal{B}_k|\) and \(|\mathcal{B}^\star _k|\), respectively:
We recall that the first column represents the number of trajectory maxima on each level set after the initialization step; there is a total of \(N\) trajectories. The second column represents the number of trajectory maxima on each level set after \(K_0\) iterations; the total number of redistributed trajectories is \(B_{00} = \mathcal{N}_0\). The third column represents the number of trajectory maxima on each level set after \(K_0+K_1\) iterations; the total number of redistributed trajectories is \(B_{10}+B_{11}\). We also give the associated probability matrix:

The algorithm is such that for all \(0 \le k \le n+1\)
The consequence is that, by replacing the terms \( B_{n+1,k} = \mathcal{N}_{k-1} - \sum _{j=k}^n B_{jk}\), one has \(\mathcal{N}_{n+1} = N\).
The joint distribution of the matrix \(T\) is
where \(\mathcal{M}_{\mathcal{N}}(\mathcal{B})\) is a multinomial distribution with associated probability \(\mathcal{P}_k\) and parameter \(\mathcal{N}\). In more detail, it is
Note that the last term \(\mathcal{M}_{\mathcal{N}_n} = \delta _{|\mathcal{B}_{n+1}| - \mathcal{N}_n}\) since
\((\mathcal{P}_{n+1})_{n+1} = 1\) and \(|\mathcal{B}_{n+1}| = B_{n+1,n+1} = \mathcal{N}_n\).
The aim of the demonstration is to compute the joint probability
The marginal \({\displaystyle \sum \nolimits _{\mathcal{B}_{n+1}^\star } \Pr (\mathcal{B}_0, \cdots \mathcal{B}_{n+1})}\) is easily found: it amounts to replacing the Dirac terms by one and taking \({\displaystyle B_{n+1,k} = \mathcal{N}_{k-1} - \sum \nolimits _{j = k}^n B_{jk}}\). Using the fact that \(\frac{f_{j,k-1}}{F_{j,k-1}} = \frac{f_{j0}}{F_{j0}}\), the joint distribution reads
One takes the logarithm of this expression:
Therefore, replacing the terms \(|\mathcal{B}_k|\) and \(|\mathcal{B}_j^\star |\) and going back to \(\Pr \),
The joint probability (39) thus takes the form
We now apply the combinatorial lemma to the different sums. The first is
by identifying \(\alpha _k\) with \(\mathcal{N}_{k-1}\), \(\gamma \) with \(\mathcal{N}_n\) and noting that
We thus obtain recursively
Now, we have \(1-\frac{\textstyle f_{k0}}{\textstyle F_{k0}} = \frac{\textstyle F_{k+1,0}}{\textstyle F_{k0}}\) and \(\frac{f_{k0}^{\mathcal{N}_k}}{F_{k0}^{\mathcal{N}_{k-1}}} = \left( \frac{f_{k0}}{F_{k0}} \right) ^{\mathcal{N}_k} F_{k0}^{\mathcal{N}_k-\mathcal{N}_{k-1}}\); together with \(\mathcal{N}_{n+1} = N\), the product becomes
and using \(F_{n+1,0} = \frac{F_{n+1,0}}{F_{n0}} \cdots \frac{F_{10}}{F_{00}}\) we obtain the final result:
\(\square \)
Proof of Theorem 2 We now focus on the iterations \(K_p \sim \mathrm{NegBin}_2(\mathcal{N}_p,F_{p+1,p})\), for \(0 \le p \le i_b-1\). Since the \(\mathcal{N}_p\) are independent and each \(K_p\) depends only on \(\mathcal{N}_p\), the iterations \(K_p\) are independent as well, i.e.
where \(\Pr (K_p|\mathcal{N}_p) = \left( {\begin{array}{c}K_p-1\\ K_p-\mathcal{N}_p\end{array}}\right) F_{p+1,p}^{\mathcal{N}_p} (1-F_{p+1,p})^{K_p-\mathcal{N}_p}\), \(K_p \ge \mathcal{N}_p\), is the negative binomial law, and \(\Pr (\mathcal{N}_p) = \left( {\begin{array}{c}N\\ \mathcal{N}_p\end{array}}\right) (1-F_{p+1,p})^{\mathcal{N}_p} F_{p+1,p}^{N-\mathcal{N}_p}\).
Note that the distribution obtained follows a Pascal law, i.e. a negative binomial law of the first kind with parameters \(N\) and \(F_{p+1,p}\), and behaves as the number \(K_p\) of times the trajectories have their maxima on the (worst) level set \(\varSigma _p\) (failures) before getting \(N\) successes, i.e. before all trajectory maxima are strictly above this level. The probability of the sum is easily found using the characteristic function of the Pascal law:
Finally, the total sum \({\displaystyle \mathcal{K} = \sum _{p=0}^{i_b-1} K_p}\) is
\(\square \)
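The marginalization carried out above can also be checked by direct simulation. Below is a minimal sketch (the values of \(N\) and \(F\), a stand-in for \(F_{p+1,p}\), are arbitrary): one draws \(\mathcal{N}_p \sim \mathrm{Bin}(N,1-F_{p+1,p})\), then \(K_p\) from the conditional negative binomial law, and compares the empirical marginal of \(K_p\) with the Pascal probability mass function \(\left( {\begin{array}{c}k+N-1\\ k\end{array}}\right) F^N (1-F)^k\).

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
N, F, n_rep = 8, 0.6, 200_000   # arbitrary stand-ins for N and F_{p+1,p}

# N_p ~ Bin(N, 1-F); given N_p, K_p is the number of trials needed to
# collect N_p events of probability F (the conditional law in the proof).
M = rng.binomial(N, 1 - F, size=n_rep)
K = np.where(M > 0, M + rng.negative_binomial(np.maximum(M, 1), F), 0)

# Marginally, K_p should follow the Pascal law binom(k+N-1, k) F^N (1-F)^k.
for k in range(4):
    print(k, round((K == k).mean(), 4), round(stats.nbinom.pmf(k, N, F), 4))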
Proof of Theorem 5 Theorem 3 shows that \(n \sim \mathrm{Bin}(N,1-\varDelta )\) and \(\mathcal{K} - \mathcal{K}_0 \sim \mathrm{Poisson}(-N \log \frac{\textstyle p_B}{\textstyle \varDelta })\). The moments are therefore
with \(\lambda = -N \log \frac{\textstyle p_B}{\textstyle \varDelta }\). For \(m=1\), it is
$$\begin{aligned} \sum _n \left( 1-\frac{n}{N}\right) \left( {\begin{array}{c}N\\ n\end{array}}\right) (1-\varDelta )^n \varDelta ^{N-n} \, \mathrm{e}^{-\lambda } \mathrm{e}^{\lambda (1-\frac{1}{N})} - p_B = p_B \left( \varDelta ^{-1}\left( 1-N(1-\varDelta )/N\right) -1 \right) = 0. \end{aligned}$$
The variance follows from the same line of calculation. \(\square \)
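The Poisson generating-function step used above, \(\mathrm{e}^{-\lambda }\mathrm{e}^{\lambda (1-1/N)} = \mathrm{e}^{-\lambda /N} = p_B/\varDelta \), can be checked with a short simulation; the following is a minimal sketch, with arbitrary values of \(N\), \(p_B\) and \(\varDelta \).

import numpy as np

rng = np.random.default_rng(2)
N, p_B, Delta = 10, 1e-3, 0.4        # arbitrary values with p_B < Delta
lam = -N * np.log(p_B / Delta)

# For K ~ Poisson(lam), E[(1-1/N)^K] = exp(-lam/N) = p_B / Delta.
K = rng.poisson(lam, size=500_000)
print(((1 - 1 / N) ** K).mean(), p_B / Delta)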
Proof of Lemma 2 Let \(X_k^x\) be the (discrete) Markov process in state space \(\mathcal{E}\), starting from \(x\) at time \(k=0\). Let \(\mathbf{L}_k\) be an arbitrary path of length \(k\), \(\mathbf{L}_k \equiv [x_1 \cdots x_{k}] \in \mathcal{E}^k\). We define the set \(F_q^n\) of paths of length \(n+1\) starting from a fixed \(x_0\) and hitting \(\varSigma _{\ge q}\) at time \(n\) before returning to \(\varSigma _A\) (with, say, observable strictly less than 0). This set is
and
We now use the law of total probability and the (strong) Markov property, namely
where \(k^\star \ge 0\) is such that \(\varPhi (X_{n+k^\star }^{x_0}) = \sup _{k \in [0,T_0]} \varPhi (X_k^{x_0})\). In terms of the relative (marginal) hitting probability of \(\varSigma _{\ge q}\):
with \(\sum _{x \in \mathcal{E}} P_q(x_0,x) = 1\). One thus obtains
Using the previous notations, this can be written as
where \(P_q\) is defined in (42) and \(f_{pq}(x_0) = \Pr (Q(x_0) = \sigma _p|Q(x_0) \ge \sigma _q)\). \(\square \)
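In the discrete setting of the lemma, such hitting probabilities can be computed exactly by linear algebra. The following minimal sketch uses a toy five-state chain with a hypothetical transition matrix (state 0 plays the role of \(\varSigma _A\) and state 4 the role of \(\varSigma _{\ge q}\), both made absorbing); it computes \(\sum _x P_q(x_0,x)\), the committor-type probability of hitting \(\varSigma _{\ge q}\) before returning to \(\varSigma _A\), for each interior starting state.

import numpy as np

# Hypothetical transition matrix; rows 0 and 4 are absorbing.
P = np.array([
    [1.0, 0.0, 0.0, 0.0, 0.0],
    [0.5, 0.2, 0.3, 0.0, 0.0],
    [0.1, 0.3, 0.3, 0.3, 0.0],
    [0.0, 0.0, 0.4, 0.3, 0.3],
    [0.0, 0.0, 0.0, 0.0, 1.0],
])
interior = [1, 2, 3]
Q, r = P[np.ix_(interior, interior)], P[interior, 4]

# Committor h(x) = Pr(hit state 4 before state 0 | X_0 = x): (I - Q) h = r.
h = np.linalg.solve(np.eye(len(interior)) - Q, r)
print(dict(zip(interior, h)))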
Proof of Theorem 6 The condition (32) is sufficient since \(f_{pq}^N = f_{pq}\). It is also necessary. The proof of Theorem 1 uses two conditions:
The first is a separability condition on \(f_{pq}^N\) in terms of \(p\) and \(q\); it implies in particular that \(\frac{\textstyle f_{pq}^N}{\textstyle F_{pq}^N} = \frac{\textstyle f_{p0}^N}{\textstyle F_{p0}^N}\). The second means that \(f_{pq}^N\) is a probability. These two conditions are essential for using Lemma 1.
One sets \(u = \frac{\textstyle h_q^N G_{pq}}{\textstyle f_{pq}} \) with \(h_q^N \equiv \frac{1}{N-1} (1-N \epsilon ^{N-1} \frac{1-\epsilon }{1-\epsilon ^N})\), \(G_{pq} \equiv \displaystyle \int _{\varSigma _{\ge q}} \mathcal{F}_p \mathcal{H}_q ~dx\). Using \(\partial _p \partial _q \log f_{pq} = 0\) (see Property 1), this translates into \((1+u) \frac{\textstyle \partial ^2 u}{\textstyle \partial p\partial q} = \frac{\textstyle \partial u}{\textstyle \partial p} \frac{\textstyle \partial u}{\textstyle \partial q}\); since this equation is equivalent to \(\partial _p \partial _q \log (1+u) = 0\), its general solution must be of the form \(u(N,p,q) = -1 + u_1(N,p) u_2(N,q)\), with the constraints
We have \(\partial _N \partial _p \left\{ \frac{\textstyle -1 + u_1(N,p) u_2(N,q)}{\textstyle h(N,q)} \right\} = 0\), and thus \( \partial _p u_1(N,p) = c(p,q) \frac{\textstyle h(N,q)}{\textstyle u_2(N,q)}\). Suppose \(c \ne 0\); differentiating the last expression w.r.t. \(q\), one obtains that \(u_2\) must have the following form
Plugging this expression into the expression for \(\partial _p u_1\) yields \(c(p,q) = G_1'(p) G_2(q)\) i.e.
One obtains a contradiction by remarking that necessarily \(G_2(q) = G_2\) is constant due to the second constraint, so that the first constraint cannot be satisfied. One has therefore \(c = 0\), implying \(u_1(N,p) = \kappa _1(N)\). The first and second constraints then impose that \(u_2\) has the following form:
that is, \(u(N,p,q) = G_2(q) h(N,q)\). Going back to the expressions of \(f_{pq}\) and \(f^N_{pq}\), one therefore has
The conclusion is that \(f_{pq}^N = f_{pq} \left( 1 + h_q^N G_2(q) \right) \); summing over \(p\) yields that necessarily \(1 + h_q^N G_2(q) = 1\), i.e. \(G_2(q) = 0\). \(\square \)
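As a sanity check on the separable form used in the proof, a short symbolic computation (a minimal sketch using sympy) verifies that any \(u = -1 + u_1(p)\,u_2(q)\) satisfies \((1+u)\,\partial _p \partial _q u = \partial _p u \, \partial _q u\):

import sympy as sp

p, q = sp.symbols('p q')
u1, u2 = sp.Function('u1'), sp.Function('u2')

# Separable candidate from the proof: u = -1 + u1(p) * u2(q)
u = -1 + u1(p) * u2(q)
lhs = (1 + u) * sp.diff(u, p, q)
rhs = sp.diff(u, p) * sp.diff(u, q)
print(sp.simplify(lhs - rhs))   # prints 0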
Proof of Theorem 7 The first part of the proof is immediate: if the statistics do not depend on the initial condition on a given level set, then the term \(\mathcal{F}_p(x)\) factorizes out, leaving the term \(\int _{\varSigma _q} \mathcal{H}_q(x_0,x)~dx\), which is always zero. The second part of the proof is more technical: we would like to show that this condition is necessary as well. We introduce the following quantities, namely
and
where \(\mathrm{Hit}(\varSigma )\) is the hitting state of \(\varSigma \) and \(F_k^{(j)}(x) = \Pr (Q(x) = \sigma _k|x),~x \in \varSigma _j\) is another notation for \(\mathcal{F}_k\). We note that the integral over \(x\) of \(A_k^{(j)}(x)\) is independent of \(\varSigma _j\), i.e.
Translating the Markov condition, the normalization constraint and condition (32) in terms of these quantities, one can write
It simply amounts to writing (32) as
that is, the hitting probability is now conditioned on \(Q_0 = \sigma _q\) instead of \(Q_0 \ge \sigma _q\). We consider the partition \(\varSigma _{A} \cup \varSigma _0 \cup \varSigma _1 \cup \varSigma _2\) with some arbitrary number of states \(x_i \in \varSigma _0\), \(y_i \in \varSigma _1\). We focus on the nontrivial case \(q=1\), with \(M(x,y)\) the Markov transition matrix. One has
and
where \(\mathcal{M}(y)\) is the decoupled term in parentheses and \(\sum _y \mathcal{M}(y) = \Pr (Q_0 \ge \sigma _1) = \overline{A_1} + \overline{A_2} = 1-\overline{A_0}\). Set \(a_i = A_1^{(1)}(y_i)\), \(b_i = A_2^{(1)}(y_i)\) and \(m_i = \mathcal{M}(y_i)\). The condition (43) translates into \(B_{11}^1 = (1-\overline{A_0})^{-1}(\overline{A_1})^2\), \(B_{12}^1 = B_{21}^1 = (1-\overline{A_0})^{-1} \overline{A_1} ~\overline{A_2}\) and \(B_{22}^1 = (1-\overline{A_0})^{-1}(\overline{A_2})^2\).
Using the fact that \(1-\overline{A_0} = \sum _i m_i\), the first equality can be written as a sum of squares
and similarly for the \(B_{22}^1\) expression. Since a sum of squares vanishes only if each term vanishes, the unique solution is thus
the coupled equality being automatically fulfilled. Going back to the notations, this means that the terms \(F_k^{(1)}(y)\) must not depend on \(y\). A simple recursive step shows that this holds at each level, which concludes the proof. \(\square \)