Nonequilibrium Sampling
These exmaples involve sampling from a nonequilbrium distribution. They are particularly focused on computing the mean firest passage time (MFPT) via the Hill relation [6]; see, also, https://statisticalbiophysicsblog.org/?p=275.
The Hill Relation
Consider the case of a discrete in time process, $X_t$, and two sets, $A$ and $B$. We wish to compute the mean first passage time for the process, starting in $A$ to reach $B$. One way of doing this would be sample initial conditions in $A$ (this distribution, $\mu_0$, must be specified), and run them until they reach $B$. But if the MFPT is long, it will be computationally prohibitive to observe enough transitions to obtain a statistically meaningful estimate.
Now, we study an augmented processs, $Y_t$, where $Y_t$ obeys the same dynamics as $X_t$ until it reaches $B$, at which point it resets in $A$. Denote the steady state distribution of the $Y_t$ process by $\nu$ The the Hill relation then says
\[\frac{1}{\text{MFPT for $X_\cdot$}} = \mathbb{P}_\nu(B)\]
Consequently, large MFPT's correspond to low probability events; some form of importance sampling will be needed. This is precisely the regime where WE can pay off. We will simulate the augmented process and estimate $\mathbb{P}_\nu(B)$ using the we with the observable $1_B(y)$.
To connect this to the continuous time case, frequently of interest, let us assume that $X_t$ follows overdamped Langevin dynamics
\[dX_t = -\nabla V(X_t)dt + \sqrt{2\beta^{-1}}dW_t,\]
The MFPT for this problem is computed first by solving the Poisson equation,
\[L\tau_B = 0, \quad \tau_B|_{\partial B} = 0,\]
and then computing
\[\text{MFPT $A\to B$} = \mathbb{E}_{\mu_0}[\tau_B(X)].\]
In the event that $A=\{x_0\}$, as it will be in the computational example, this is just $\tau_B(x_0)$. As this Poisson problem will typically be in high dimension, solving it via classical methods is not feasable.
The associated $Y_t$, obeys this the overdamped Langevin equation until it hits $\partial B$, at which point it instantaneously retarts in $A$ acoording ot the initial distribution $\mu_0(dx) = \gamma_0(x)dx$. The modified process obeys the nonlocal Fokker-Planck equation:
\[\partial_t \rho = L^\ast ρ + \gamma_0(x)\int_{\partial B}J_{\rho}\cdot n dS.\]
In the above expression, $\int_{\partial B}J_{\rho}\cdot n dS$ is the integrated probability flux into the set $B$. This has an associated noequilibrium steady state distribution (NESS).
To set this up to work with WE, we time discretize our process, $\tilde{X}_{k}\approx X_{t_k}$ in some fashion and work with the associated augmented process (recylcing upon reaching $B$), $\tilde{Y}_k$. The estimate of the MFPT of the original process is then
\[\text{MFPT}\approx \frac{\Delta t}{\tilde{\nu}(B)}.\]
Indeed, this is approximate even without statistical error because we have lost information on the sub $\Delta t$ time scale. But this error will typically be small in comparison to the aforementioned statistical error.
Doublewell Example
- ODE Results
- Defining the Mutation Step
- Defining the Bins
- Run WE with Uniform Selection
- Comparison with Direct Computation
Following the equilibrium example, let us first compute, via ODE methods, the NESS along with the MFPT. Here, we will take $B = [b,\infty)$, with b=0.5
and $\mu_0 =\delta_{x_0}$, with x0 = -1
. Thus, we are estimating the MFPT for a trajecotry, starting in the left basin to ge into the "heart" of the right basin. As we do not compute on an infinite domain, we impose a Neumann (reflecting) boundary condition at $x=a<x_0$ with a=-2
in the example. For low enough temperature, this is a good approximation.
ODE Results
For later comparison, we illustrate the results that are obtained usine ODE methods. The NESS density can be obtained by integration:
using QuadGK
using TestLandscapes
using Plots
β = 10.0;
V_ode(x) = SymmetricDoubleWell(x);
x0 = -1.0;
a = -2;
b = 0.5;
C0 = quadgk(t -> exp(β * V_ode(t)), x0, b)[1];
function f_(x)
if (x < x0)
return C0 * exp(-β * V_ode(x))
else
return quadgk(t -> exp(β * V_ode(t)), x, b)[1] * exp(-β * V_ode(x))
end
end
Z = quadgk(f_, a, b)[1];
xx = LinRange(a, b, 200);
ρ = f_.(xx) / Z;
plot(xx, ρ, label="Density",lw=2)
xlabel!("x")
title!("NESS")
For the MFPT, we use BVPProblem
from BoundaryValueDiffEq
:
using BoundaryValueDiffEq
using ForwardDiff
∇V_ode = x->ForwardDiff.derivative(V_ode, x);
function τ_rhs!(du, u, p, t)
du[1] = u[2]
du[2] = β * (∇V_ode(t) * u[2] - 1.0)
du
end
function τ_bc!(res, u, p, t)
res[1] = u[1][2];
res[2] = u[end][1];
res
end
tspan = (a, b)
τ_bvp = BVProblem(τ_rhs!, τ_bc!, [1., 0.], tspan)
τ_soln = solve(τ_bvp, MIRK4(), dt = 0.05);
MFPT = τ_soln(x0[1])[1];
@show MFPT;
25473.945860592437
Defining the Mutation Step
In this example we use an Euler-Maruyama integrator with the recycling condition, with the BasicMD
package:
using BasicMD
V(x) = SymmetricDoubleWell(x[1])
∇V! = (gradV, X) -> ForwardDiff.gradient!(gradV, V, X);
Δt = 1e-2; # time step
Δt_recycle = 1e-2
nΔt_recycle = Int(Δt_recycle / Δt); # number of time steps before applying recycler
nΔt_coarse = 1 * nΔt_recycle # number of time steps in a coarse step
sampler = EM(∇V!, β, Δt);
mutation_opts = MDOptions(n_iters=nΔt_coarse, n_save_iters=nΔt_coarse);
function restartA!(state::BasicMD.EMState)
if (state.x[1] > b)
@. state.x = [x0]
∇V!(state.∇V, [x0])
end
state
end
constraints = Constraints(restartA!, trivial_constraint!, nΔt_recycle, nΔt_coarse);
mutation! = x -> sample_trajectory!(x, sampler,constraints, options=mutation_opts); nothing
Defining the Bins
Bins will be defined via Voronoi cells:
using WeightedEnsemble
Δb = 0.2
x_voronoi = [[x - 0.5 * Δb] for x in -1.5:Δb:0.5+Δb]
@show n_bins = length(x_voronoi);
B0, bin_id, rebin! = setup_Voronoi_bins(x_voronoi); nothing
n_bins = length(x_voronoi) = 12
Run WE with Uniform Selection
As we will see, uniform selection will be sufficient to obtain relatively high quality results:
using Random
n_particles = 10^3;
E0 = Ensemble([[x_] for x_ in LinRange(-1.5, 0.5, n_particles)])
rebin!(E0, B0, 0);
uni_we_sampler = WEsampler(mutation!, uniform_selection!, rebin!);
n_we_steps = 10^3;
Random.seed!(100);
E_uni_trajectory, B_uni_trajectory = run_we(E0, B0, uni_we_sampler, n_we_steps); nothing
First, we verify convergence to the target NESS distribution:
using Printf
hist_bins = LinRange(-2.0, 1.5, 41);
we_anim = @animate for (t, E) in enumerate(E_uni_trajectory)
histogram([ξ_[1] for ξ_ in E.ξ], bins=hist_bins,
weights=E.ω, norm=:pdf, alpha=0.5, label="Weighted Ensemble",
yaxis=(:log10, [1e-6, 1e1]), legend=:topright)
plot!(xx, ρ, lw=2, label="ODE")
xlabel!("x")
ylabel!("Density")
xlims!(-1.5, 1)
title!(@sprintf("t = %d", t))
end
gif(we_anim, fps =30)
Next, we take a look at our quantity of interest, the estimate of $\mathbb{P}_{\tilde{\nu}}(B)$. Using the Hill relation, this should correspond to an estimate of $\Delta t/\text{MFPT}$, the value we already computed by ODE methods:
using LinearAlgebra
f_uni_trajectory = zeros(n_we_steps);
fB = X -> Float64(X[1] > b); # define observable
for (j,E) in enumerate(E_uni_trajectory)
f_uni_trajectory[j] = (E.ω ⋅ fB.(E.ξ));
end
plot(1:n_we_steps, f_uni_trajectory, lw=2,yaxis=(:log10, [1e-8, :auto]),label="WE Est.")
plot!(1:n_we_steps, cumsum(f_uni_trajectory) ./(1:n_we_steps),lw=2, label="WE Time Avg.")
plot!(1:n_we_steps, Δt / MFPT * ones(n_we_steps),
lw=2,label="Δt/MFPT", color=:black, ls=:dash)
xlabel!("Iterate")
We have a good estimate after 500 iterations. Note that the MFPT of the original continuous time process is $\approx 25473$, but we simulated our WE process out to continuous time $10$ ($10^3$ WE steps with $\Delta t = 10^{-2}$).
Comparison with Direct Computation
As in the equilibrium case, an attempt to estimate the MFPT via the Hill relation, without using WE, will result in very poor results:
Random.seed!(200)
trivial_we_sampler = WEsampler(mutation!, (E, B, t)->trivial_selection!(E), rebin!);
f_direct_trajectory = run_we_observables(E0, B0, trivial_we_sampler, n_we_steps, (fB,))[:];
plot(1:n_we_steps, f_uni_trajectory, lw=2,yaxis=(:log10, [1e-8, :auto]), label="WE Est.")
plot!(1:n_we_steps, cumsum(f_uni_trajectory) ./(1:n_we_steps),lw=2, label="WE Time Avg.")
plot!(1:n_we_steps, f_direct_trajectory, lw=2, label="Direct Est.")
plot!(1:n_we_steps, cumsum(f_direct_trajectory) ./(1:n_we_steps),lw=2, label="Direct Time Avg.")
plot!(1:n_we_steps, Δt / MFPT * ones(n_we_steps),
lw=2,label="Δt/MFPT", color=:black, ls=:dash)
xlabel!("Iterate")