Fixing the flux with Molly.jl

PASC 2024 Minisymposium: Composable Julia Software in Atomistic Materials Modeling

Noé Blassel (joint work with Gabriel Stoltz)

École des Ponts ParisTech, Inria Paris

2024-03-06

Transport coefficients: a computational challenge

  • Sensitivities of fluxes in response to nonequilibrium forcings.
  • Fourier’s law: J \propto \Delta T
  • Fourier’s law: \(J = \kappa \Delta T\) (\(\kappa=\) thermal conductivity).
  • Dynamical properties of molecular systems: thermal transport, diffusion, shear viscosity…
  • Crucial to parametrize macroscopic models (e.g. evolution PDEs like Navier–Stokes).
  • Standard methods require long simulations to get useful confidence intervals.
  • Need for innovative variance reduction techniques.
  • The package Molly.jl narrows the gap between methods development and applications, because of its low implementation overhead and affordable speed.

Setting: linear response for stochastic dynamics

  • Perturbed stochastic differential equation: \[ \mathrm{d}X^\eta_t = b(X^\eta_t)\,\mathrm{d}t+\sigma(X^\eta_t)\,\mathrm{d}W_t +\eta F(X^\eta_t)\,\mathrm{d}t \] \(X_t\in\mathcal X\) (state space) \(b\) (equilibrium drift), \(F\) (nonequilibrium forcing) vector fields on \(\mathcal X\), \(\sigma\) (diffusion matrix), \(W\) Wiener process.
  • Strength of the perturbation: \(\eta\). Equilibrium when \(\eta = 0\).
  • Nonequilibrium ensemble: \(\psi_\eta\). Probability measure (steady-state) on \(\mathcal X\), not explicit for \(\eta\neq 0\).
  • Define a response observable of flux \(R:\mathcal X \to\mathbb R\), with \(\mathbb E_{\psi_0} R = 0\)
  • Linear response is the derivative of the expected flux: \[ \begin{equation}\rho(F,R) = \underset{\eta \to 0}{\lim}\, \frac1\eta\mathbb E_{\psi_\eta} R \end{equation} \]

Standard estimators of the linear response

  • Finite-difference ergodic estimator (NEMD): \[ \begin{equation}\widehat\rho_T^{\mathrm{NEMD}}(F,R) = \frac1{T\eta}\int_0^T R(X_t^\eta)\,\mathrm{d}t,\quad \eta>0. \end{equation} \]
  • Variance scales like \(T^{-1}\eta^{-2}\): simulation times of order \(\eta^{-2}\varepsilon^{-2}\) for confidence intervals of width \(\mathcal{O}(\varepsilon)\).
  • Equilibrium methods based on the Green–Kubo formula: \[ \begin{equation} \widehat\rho_T^{\mathrm{GK}}(F,R) = \int_0^{T} \mathbb E_{\psi_0}[R(X_0^0)S(X_t^0)]\,\mathrm{d}t, \end{equation} \] where \(S\) is the so-called conjugate response, which is explicit in terms of \(F\).

Dual approach: fixing the flux

  • Nonequilibrium dynamics with a stochastic perturbation strength: \[ \mathrm{d}Y^r_t = b(Y^r_t)\,\mathrm{d}t+\sigma(Y^r_t)\,\mathrm{d}W_t + F(Y^r_t)\mathrm{d}\Lambda_t,\quad \mathrm{d}\Lambda_t = \lambda(Y_t^r)\,\mathrm{d}t + \widetilde{\lambda}(Y_t^r)\,\mathrm{d}W_t. \]
  • The perturbation magnitude is chosen to enforce the constant flux condition: \(R(Y_t^{r}) = r \in {\mathbb{{R}}}\) for all \(t>0\).
  • Fully explicit expressions for \(\lambda,\widetilde{\lambda}\) can be derived.
  • Stochastic generalization of the “Norton method” of Evans & Morriss (1980s).

Reciprocal estimators for the linear response

  • Constant-flux ensemble: steady-state for the constrained dynamics, \(\phi_r\). Probability measure on the submanifold \(\{R = r\}\).
  • Dual linear response: \[ \rho^{\mathrm{DUAL}}(F,R) = \underset{r \to 0}{\lim}\, \frac{r}{\mathbb E_{\phi_r}\left[\lambda\right]}. \]
  • Measure inverse of resistance instead of conductance. \(\rho(F,R) \overset{?}{=} \rho^{\mathrm{DUAL}}(F,R)\) not true in general (sufficient conditions still lacking).
  • Finite difference estimator: \[ \widehat{\rho}_T^{\mathrm{DUAL}} = \frac{Tr}{\int_0^T \lambda(Y_t^r)\,\mathrm{d}t} \]
  • Paper(Blassel and Stoltz 2024): in-depth presentation of the method, generalizations, time-discretization schemes, special case of underdamped Langevin dynamics, realistic numerical examples.

Molly.jl

  • Pure Julia implementation of MD, part of JuliaMolSim.
  • Creator and main contributor: Joe Greener (LMB, Cambridge). Small (but growing) community of contributors.
  • Standard features: deterministic/stochastic integrators, fast neighbor lists with CellListMap.jl, flexible logging interface, common bonded/non-bonded interactions…
  • Implements AtomsBase.jl interface.
  • Compatibilities: units with Unitful.jl, standard file formats with Chemfiles.jl, atomic cluster expansions with ACEPotentials.jl, NN potentials with Flux.jl, ab-initio forces with DFTK.jl
  • Big selling point: aims for full support of differentiable simulation (currently with Zygote.jl). Paper (Greener 2024).
  • ~13k lines of code. (LAMMPS: ~730k)
  • Some important features not implemented yet. (currently in version 0.21.0)

Example: solvated protein

using Molly

ff = OpenMMForceField("ff99SBildn.xml", "tip3p_standard.xml")
sys = System(
    "6mrr_equil.pdb",
    ff;
    loggers = (
        energy=TotalEnergyLogger(10),
        writer=StructureWriter(10, "traj_6mrr_5ps.pdb", ["HOH"])
    ),   
)

minimizer = SteepesetDescentMinizer()
simulate!(sys,minimizer; run_loggers = false)

random_velocities!(sys, 300.0u"K")
simulator = LangevinSplitting(
    dt = 0.001u"ps",
    temperature = 300.0u"K",
    friction = 1.0u"g * mol^-1 * ps^-1",
    splitting = "BAOAB"
)

simulate!(sys, simulator, 5000; threads = Threads.nthreads())

Five picoseconds of Foldit1 (PDB ID 6MRR), thanks to Joe Greener.

Transport coefficients in Molly.jl

  • Standard methods now work (almost) out of the box.
  • Finite difference estimator (NEMD): define a custom interaction type NEMDForcing and a corresponding method for AtomsCalculators.forces.
  • Record history of flux R with Molly.GeneralObservableLogger.
  • Green-Kubo method: record time-lagged correlations using Molly.TimeCorrelationLogger with flux R and conjugate flux S.

Implementation of the dual method in Molly.jl

  • Modular design allows customizable behavior: user-defined loggers, interactions and simulators.
  • Dual approach uses a user-defined simulator.
  • Type NortonSplitting representing the integrator.
  • Method for Molly.simulate! using custom type.
struct NortonSplitting{S,E,K,F,W,TF,TG}
    dt::S
    r::E
    T::K
    γ::F
    splitting::W
    F::TF
    G::TG
end

function Molly.simulate!(sys::System, sim::NortonSplitting, n_steps; n_threads::Integer=Threads.nthreads(), rng=Random.GLOBAL_RNG)
    α_eff = exp(-sim.γ * sim.dt / count('O', sim.splitting))
    σ_eff = sqrt(1 - α_eff^2)

    effective_dts=[sim.dt / count(c,sim.splitting) for c in sim.splitting]

    forces_known = true
    force_computation_steps = Bool[]

    occursin(r"^.*B[^B]*A[^B]*$", sim.splitting) && (forces_known = false)

    for op in sim.splitting
        if op == 'O'
            push!(force_computation_steps, false)
        elseif op == 'A'
            push!(force_computation_steps, false)
            forces_known = false
        elseif op == 'B'
            if forces_known
                push!(force_computation_steps, false)
            else
                push!(force_computation_steps, true)
                forces_known = true
            end
        end
    end

    sys.coords .= wrap_coords.(sys.coords, (sys.boundary,))
    neighbors = find_neighbors(sys, sys.neighbor_finder; n_threads=n_threads)
    run_loggers!(sys, neighbors, 0; n_threads=n_threads)

    accels = accelerations(sys, neighbors; n_threads=n_threads)
    
    velocities_array = reinterpret(reshape, Float64, sys.velocities)
    coords_array = reinterpret(reshape, Float64, sys.coords)

    v_x = view(velocities_array, 1, :)
    q_y = view(coords_array, 2, :)

    F_y = sim.F.(q_y)
    G_y = sim.G.(q_y)

    FdotG = dot(F_y, G_y)

    λ = (sim.r-dot(G_y,v_x))/FdotG
    v_x .+= λ .* F_y

    λ_hist=Float64[]

    for step_n=1:n_steps
        λ_A = λ_B = λ_O = 0.0

        for (i,op)=enumerate(sim.splitting)
            if op=='A'
                sys.coords .+= sys.velocities .* effective_dts[i]
                sys.coords .= Molly.wrap_coords.(sys.coords, (sys.boundary,))

                F_y .= sim.F.(q_y)
                G_y .= sim.G.(q_y)
                FdotG = dot(F_y,G_y)
                λ = (r - dot(G_y, v_x))/ FdotG
                v_x .+= λ .* F_y
                λ_A += λ

            elseif op=='B'
                ( force_computation_steps[i] ) &&  ( accels .= accelerations(sys,neighbors, n_threads=n_threads) )
                sys.velocities .+= accels .* effective_dts[i]
                λ = (r- dot(G_y, v_x)) / FdotG
                v_x .+= λ .* F_y
                λ_B += λ
                
            elseif op=='O'
                sys.velocities .= α_eff .* sys.velocities .+ σ_eff .* random_velocities(sys,sim.T; rng=rng)
                λ = (r- dot(G_y, v_x)) / FdotG
                v_x .+= λ .* F_y
                λ_O += (1-α_eff)*sim.r / FdotG

            end
        end

        λ_est= (λ_A + λ_B + λ_O) / sim.dt
        push!(λ_hist, λ_est)

        run_loggers!(sys,neighbors,step_n)

        if step_n != n_steps
            neighbors = find_neighbors(sys, sys.neighbor_finder, neighbors, step_n ; n_threads=n_threads)
        end
    end
    return λ_hist
end

Results: consistency

  • Lennard-Jones fluid with 1000 atoms (liquid argon).
  • Forcing: transverse sinusoidal forcing profile in the \(x\)-direction.
  • Flux: principal Fourier coefficient of the profile in the \(x\)-component of velocity.
  • Linear response is related to shear viscosity of the fluid.
  • \(\color{red}+\): standard method, \(\color{green}\times\): dual method.
  • Agreement in and beyond the linear response regime.

Results: fluctuations in the thermodynamic limit

  • Compare \(\mathop{\mathrm{\mathrm{Var}}}_{\psi_0}R\) and \(\mathop{\mathrm{\mathrm{Var}}}_{\phi_0}\lambda\).
  • Concentration in the thermodynamic limit \(N\to\infty\).
  • Scaling in the constant-forcing ensemble: \(\propto N^{-1}\) (standard spatial CLT).
  • Faster scaling in the constant-flux ensemble: \(\propto N^{-5/3}\).
  • Variance is still higher for \(\lambda\) for our range of system sizes.
  • But time-series are correlated: variance is only half the picture.

Results: correlation times

  • Correlation times are smaller for the constant-flux dynamics (right) than for the constant-response dynamics (left).
  • So effective sample size is larger for dual approach given a fixed simulation time.
  • Combined with scaling of variance, shows dual approach is arbitrarily more efficient (in this example) than standard approach in the limit \(N\to\infty\).
  • Here, dual approach overtakes standard approach for \(N\gtrapprox 800\).

Perspectives

  • Many theoretical open questions (sufficient conditions for equivalence of nonequilibrium ensembles, likewise for scaling of the variance property, existence of the steady-state and ergodicity).
  • Application of the dual method to dissipative particle dynamics (DPD). Work in progress by X. Shang and X. Wu (Birmingham).
  • Alternative approaches for variance reduction: coupling methods (in preparation, S. Darshan and G. Stoltz), transient substraction method (in preparation, R. Spacek and G.Stoltz)
  • Forward-mode AD with Molly.jl \[ \frac1{T\eta}\left(\int_0^T R(X_t^\eta)\,\mathrm{d}t - \int_0^T R(X_t^0)\,\mathrm{d}t\right) \xrightarrow{\eta\to 0} \frac1T\int_0^T \partial_\eta R(X_t^\eta) |_{\eta=0}\,\mathrm{d}t. \]
  • The tangent dynamics \(T_t = \partial_\eta R(X_t^\eta) |_{\eta=0}\) can be tracked through the equilibrium trajectory using dual numbers (ForwardDiff.jl).

References

Blassel, N., and G. Stoltz. 2024. “Fixing the Flux: A Dual Approach to Computing Transport Coefficients.” Journal of Statistical Physics 191 (17). https://doi.org/10.1007/s10955-024-03230-x.
Ciccotti, G., and G. Jacucci. 1975. “Direct Computation of Dynamical Response by Molecular Dynamics: The Mobility of a Charged Lennard-Jones Particle.” Phys. Rev. Lett. 35: 789–92. https://doi.org/10.1103/PhysRevLett.35.789.
Evans, D. J. 1993. “The Equivalence of Norton and Thévenin Ensembles.” Molecular Physics 80: 221–24. https://doi.org/10.1080/00268979300102221.
Evans, D. J., W. G. Hoover, B. H. Failor, B. Moran, and A. J. C. Ladd. 1983. “Nonequilibrium Molecular Dynamics via Gauss’s Principle of Least Constraint.” Physical Review A 28 (2): 1016–21. https://doi.org/10.1103/PhysRevA.28.1016.
Evans, D. J., and G. Morriss. 2008. Statistical Mechanics of Nonequilibrium Liquids. Cambridge University Press. https://doi.org/10.1017/CBO9780511535307.
Gosling, E. M., I. R. McDonald, and K. Singer. 1973. “On the Calculation by Molecular Dynamics of the Shear Viscosity of a Simple Fluid.” Molecular Physics 26: 1475–84. https://doi.org/10.1080/00268977300102631.
Greener, Joe G. 2024. Differentiable simulation to develop molecular dynamics force fields for disordered proteins.” Chemical Science 15: 4897–4909. https://doi.org/10.1039/D3SC05230C.