Shape-optimization of metastable timescales

Presentation to MatMat team, EPFL (Lausanne)

Noé Blassel (joint work with Tony Lelièvre and Gabriel Stoltz)

CERMICS lab, École des Ponts ParisTech, MATHERIALS team, Inria Paris. Funding from ERC EMC2 and ANR SINEQ.

2025-04-22

Molecular Dynamics

Molecular Dynamics (MD) probes the behavior of molecular systems at the scale of classical motion of atoms.

Five picoseconds of Foldit1 (PDB ID 6MRR) simulated with Molly.jl, thanks to Joe Greener.

  • Can replace physical experiments in extreme or expensive conditions with simulation.
  • Long trajectories are necessary to access dynamical properties, and parametrize models on larger scales.
  • Many algorithmic tricks are needed to simulate long trajectories reliably and efficiently, and sample the whole configuration space.
  • Length of sampled trajectories is hindered by metastability/stiffness/separation of timescales, the manifestation of free-energetic barriers larger than typical thermal fluctuations
  • Configurational ensemble has many modes, corresponding to metastable states

The timescale problem in MD

Accessible systems for direct MD Picture by Danny Perez (LANL)

  • Parallelization in space is “easy”, parallelization in time is much harder.
  • Accelerated dynamics methods: (Hyperdynamics (Voter 1997), Temperature Accelerated Dynamics (Sørensen and Voter 2000)) can help.
  • Metastable dynamics jump rarely and suddenly between metastable states.
  • Accelerated MD aims at predicting the sequence of states and transition times correctly by sacrificing resolution of the dynamics inside each state.
  • In the family of spatial coarse-graining methods.
  • We focus on Parallel Replica methods (Voter 1998).

The Parallel Replica Algorithm

  • Exploits local metastability inside a state \(\Omega\subset \mathcal X\) to accelerate sampling of the transition (exit event from \(\Omega\)).
  • Suppose dynamics \(X_0\) has been in \(\Omega\) for a time \(\gg 1\). Then it is in local equilibrium inside \(\Omega\). The remaining time \(T_{\mathrm{exit}}(\Omega)\) before exiting is an exponential random variable.
  • The exit point and exit time are independent: \(X_{T_{\mathrm{exit}}(\Omega)} \perp T_{\mathrm{exit}}(\Omega)\). The exit event \((T_{\mathrm{exit}}(\Omega),X_{T_{\mathrm{exit}}(\Omega)})\) can then be sampled in parallel.
  • Assume that a decorrelation timescale \(T_{\mathrm{corr}}(\Omega)\) is given.

ParRep: decorrelation

Decorrelation step: when the dynamics \(X_t\) enters a metastable state \(\Omega\) at time \(t=0\), evolve during \(T_{\mathrm{corr}}(\Omega)\).

  • If \(X_t\in\Omega\) for all \(0\leqslant t \leqslant T_{\mathrm{corr}}(\Omega)\), \(X_{T_{\mathrm{corr}}}(\Omega)\) is (almost) in local equilibrium \(\nu_\Omega \in \mathcal M_1(\Omega)\).
  • If \(X_t\) exits at a time \(T_{\mathrm{exit}}(\Omega) < T_{\mathrm{corr}}(\Omega)\), record exit event and continue with sequential dynamics until entering some other state \(\Omega'\).

ParRep: dephasing

Dephasing step: spawn \(N_{\mathrm{proc}}\) independent replicas, and evolve for \(T_{\mathrm{corr}}(\Omega)\), in parallel.

  • During this step, the physical clock is paused.
  • The replicas are asymptotically independent in the limit \(T_{\mathrm{corr}}(\Omega)\to\infty\).

ParRep: conditioning

Rejection/conditioning step: discard replicas which exited the state in the dephasing step.

  • The remaining \(N_{\mathrm{par}}\) replicas (almost) i.i.d. in local equilibrium.
  • Other variants (Flemming-Viot) exist to achieve this conditioning.

ParRep: parallel sampling

Parallel sampling step: let \(X^1_{T_{\mathrm{par}}}\) be the state of the first replica to exit \(\Omega\), after an additional time \(T_{\mathrm{par}}\).

  • Physical clock is sped up by a factor \(N_{\mathrm{par}}\).
  • Exit point is unbiased with respect to the sequential dynamics.

Efficiency of ParRep and motivating question

  • Algorithm is efficient if \(\mathbb E_{\nu_\Omega}[T_{\mathrm{exit}}(\Omega)] \gg T_{\mathrm{corr}}(\Omega)\).
  • Speedup in wall-clock time is governed by separation of timescales \(N^*(\Omega) = |\log\,\varepsilon_{\mathrm{corr}}|\mathbb{E}_{\nu_\Omega}[T_{\mathrm{exit}}(\Omega)]/T_{\mathrm{corr}}(\Omega)\), number of processors \(N_{\mathrm{proc}}\) (where \(0<\varepsilon_{\mathrm{corr}}<1\) is some bias tolerance parameter): \[ \frac{N^*(\Omega)-\log\varepsilon_{\mathrm{corr}}}{(N^*(\Omega)/N_{\mathrm{proc}})\mathrm{e}^{-(\log\varepsilon_{\mathrm{corr}})/N^*(\Omega)}-2\log\varepsilon_{\mathrm{corr}}} \]
  • Question: how to maximize wall-clock time speedup with respect to \(\Omega\) i.e. maximize \(N^*(\Omega)\) ? In more abstract terms: make \(\Omega\) as locally metastable as possible.
  • Other theoretical reasons to seek domains satisfying this property (Aristoff, Johnson, and Perez (2023)).

Efficiency landscape for ParRep Log-speedup in wall-clock time between ParRep and sequential MD

Dynamical setting

Molecular dynamics given by solutions to a stochastic differential equation: \[ \mathrm{d}X_t = -a(X_t)\nabla V(X_t)\,\mathrm{d}t + \frac1\beta\mathrm{div}\,a(X_t)\,\mathrm{d}t + \sqrt{\frac{2}{\beta}}a(X_t)^{1/2}\,\mathrm{d}W_t, \]

  • \(V:{\mathbb{{R}}}^d\to {\mathbb{{R}}}\) potential energy function, \(a:{\mathbb{{R}}}^{d\times d}\) s.p.d. diffusion coefficient, \(W\) standard Brownian motion in \({\mathbb{{R}}}^d\).

  • Ergodic and reversible with respect to the standard Gibbs measure at temperature \((k\beta)^{-1}\) \[ \pi(\mathrm{d}x) = \mathrm{e}^{-\beta V(x)}\,\mathrm{d}x/Z_\beta. \]

  • Continuous-time preconditionned Langevin Monte-Carlo / diffusion on a Riemannian manifold \(\mathcal M=({\mathbb{{R}}}^d,g)\) \[ \mathrm{d}X_t = -\nabla_g \left(V(X_t)+\frac1\beta\log\,\det\, g(X_t)\right)+\sqrt{\frac{2}{\beta}}\,\mathrm{d}W_t^g \] where \(g=a^{-1}\) and \(W^g\) is a Brownian motion on \(\mathcal M\).

  • Evolution semigroup generator is symmetric on \(L^2({\mathbb{{R}}}^d,\pi)\): \[ \mathcal{L}_\beta \varphi = \frac1\beta\mathrm{e}^{\beta V}\mathrm{div}\,\left(\mathrm{e}^{-\beta V}a\nabla\varphi\right). \]

  • We assume everywhere that \(V,a\in\mathcal W^{2,\infty}_{\mathrm{loc}}\), and \(a\) is locally elliptic: \(a^{-1}\in L^\infty_{\mathrm{loc}}\).

Local approach to metastability

  • For \(\Omega\subset{\mathbb{{R}}}^d\), local equilibrium is a quasi-stationary distribution (QSD) in \(\Omega\) i.e. a probability \(\nu_\Omega\in \mathcal M_1(\Omega)\) for which \[ \forall\,A\subset \Omega,\qquad \mathbb P_{\nu_\Omega}\left(X_t \in A\,\middle|\,t<T_{\mathrm{exit}}(\Omega)\right) = \nu_\Omega(A). \]
  • Under mild conditions on \(V\) and \(\Omega\), the QSD is unique, and related to the principal Dirichlet eigenvector: \[ \nu_\Omega(\mathrm{d}x) = \frac{u_{1,\Omega}(x)\mathrm{e}^{-\beta V(x)}\,\mathrm{d}x}{\int_\Omega u_{1,\Omega}\mathrm{e}^{-\beta V}},\qquad -\mathcal{L}_\beta u_{1,\Omega}(x) = \lambda_{1,\Omega}u_{1,\Omega}(x), \]
  • Here, we consider the generator \(\mathcal{L}_\beta\) with absorbing Dirichlet boundary conditions on \(\partial \Omega\): \[ \mathcal D(\mathcal{L}_\beta) = H_0^1(\Omega) \cap \{u\in L^2(\Omega):\,\mathcal{L}_\beta u\in L^2(\Omega)\text{ in }\mathcal D'({\mathbb{{R}}}^d)\}. \] It is self-adjoint with compact resolvent on \(L^2_\beta(\Omega):=L^2(\Omega,\mathrm{e}^{-\beta V(x)}\,\mathrm{d}x)\) for bounded \(\Omega\).

Spectral characterization of timescales

Denote \(\mu_t = \mathrm{Law}\left(X_t \middle| t<T_{\mathrm{exit}}(\Omega)\right)\) the law of the process conditioned on remaining trapped.

  • Theorem (Le Bris et al. (2012))

    \[ \mu_0=\nu_\Omega \implies T_{\mathrm{exit}}(\Omega)\sim \mathcal E(\color{blue}{\lambda_{1,\Omega}}),\qquad T_{\mathrm{exit}}(\Omega) \perp\!\!\!\!\perp X_{T_{\mathrm{exit}(\Omega)}}. \]

  • This result justifies the use of the Parallel Replica algorithm.

  • The exit rate is given by the principal Dirichlet eigenvalue \(\lambda_{1,\Omega} = 1/\mathbb E_{\nu_\Omega}[T_{\mathrm{exit}}(\Omega)]\).

  • Theorem (Le Bris et al. (2012))

    Convergence to the QSD and bias on the exit event are exponentially decaying. There exist \(C_1,C_2>0\) depending on \(\mu_0\) such that \[ \left\|\mu_t-\nu_\Omega\right\|_{\mathrm{TV}} \leqslant C_1 \mathrm{e}^{-(\color{red}{\lambda_{2,\Omega}-\lambda_{1,\Omega}})t}, \] \[ \left\|\mathrm{Law}_{\mu_t}(T_{\mathrm{exit}(\Omega)}-t,X_{T_{\mathrm{exit}}(\Omega)}) - \mathrm{Law}_{\nu_\Omega}(T_{\mathrm{exit}(\Omega)},X_{T_{\mathrm{exit}}(\Omega)})\right\|_{\mathrm{TV}}, \leqslant C_2 \mathrm{e}^{-(\color{red}{\lambda_{2,\Omega}-\lambda_{1,\Omega}})t}. \]

  • The decorrelation rate is given by the spectral gap of the Dirichlet generator \(\lambda_{2,\Omega}-\lambda_{1,\Omega}\).

  • Set \(T_{\mathrm{corr}}(\Omega) = -\log\,\varepsilon_{\mathrm{corr}}/(\lambda_{2,\Omega}-\lambda_{1,\Omega})\) in ParRep, for some fixed tolerance parameter \(0<\varepsilon_{\mathrm{corr}}<1\).

  • In practice, raises the need for quantitative or statistical estimates of \(\lambda_{2,\Omega}-\lambda_{1,\Omega}\) (and also \(C_1,C_2\)).

A spectral shape optimization problem

  • The goal becomes to maximize the efficiency of ParRep by maximizing the separation of timescales: \[ \underset{\Omega \subset {\mathbb{{R}}}^d\text{ open, Lipschitz, bounded}}{\max}\, \left(\color{red}{\lambda_{2,\Omega}-\lambda_{1,\Omega}}\right)/\color{blue}{\lambda_{1,\Omega}}. \]
  • Make the exit rate as small as possible compared to the decorrelation rate. Quantitative measure of local metastability / timescale separation inside \(\Omega\).
  • Standard choice takes \(\Omega=\mathcal A(z_0)\), the basin of attraction of some local minimum \(z_0\) for \(V\) under the steepest-descent dynamics \(\dot X(t) = -a(X(t))\nabla V(X(t))\).
  • The case of the Dirichlet Laplacian \(a=\mathrm{Id}\), \(V=0\) is a famous question in spectral geometry (Payne-Polya-Weinberger conjecture).
  • Question has been identified by MD practitioners, e.g. in Perez, Uberuaga, and Voter (2015), but not studied in depth.
  • For confining potentials, the problem is ill-posed. We therefore focus on local optimisation strategies around single energy wells.

First approach: ascent algorithm

  • Need expressions for shape-variations of the Dirichlet eigenvalues, using Hadamard calculus.
  • Given a space of perturbations: \(\theta\in\mathcal W^{1,\infty}({\mathbb{{R}}}^d,{\mathbb{{R}}}^d)\), consider the transported domain \[ \Omega_\theta = \{x+\theta(x),\,x\in\Omega\}. \]
  • Study properties of the eigenvalue perturbation maps \[ \lambda_k:\theta\mapsto \lambda_{k,\Omega_\theta}. \]
  • The spectrum (of \(-\mathcal{L}_\beta\)) is enumerated with multiplicity in order of increasing magnitude \[ 0<\lambda_{1,\Omega}<\lambda_{2,\Omega}\leqslant\lambda_{3,\Omega}\leqslant\cdots \] with an associated basis of eigenvectors \[ \int_{\Omega}u_{i,\Omega}u_{j,\Omega}\mathrm{e}^{-\beta V} = \delta_{ij},\qquad \forall\, i,j\geqslant 1. \]

A domain \(\Omega\) and its perturbation \(\Omega_\theta\).

Shape perturbation results

Assume \(\Omega\subset{\mathbb{{R}}}^d\) is a bounded \(\mathcal C^{1,1}\) (or convex Lipschitz) domain and \(\lambda_{k,\Omega}\) has multiplicity \(m_k\geqslant 1\): \[ \lambda_{k-1,\Omega}<\lambda_{k,\Omega}=\cdots=\lambda_{k+m_k-1,\Omega}<\lambda_{k+m_k,\Omega}. \]

Theorem (Blassel, Lelièvre, and Stoltz (2025), in preparation)

If \(m_k=1\), \(\lambda_k\) is \(\mathcal C^1(\mathcal W^{1,\infty})\) in a neighborhood of \(\theta=0\), with Fréchet derivative \[ D\lambda_k \theta = -\frac1\beta \int_{\partial\Omega}\left(\frac{\partial u_{k,\Omega}}{\partial\mathrm{n}}\right)^2\mathrm{n}^\top a\mathrm{n}\theta^\top\mathrm{n}\mathrm{e}^{-\beta V}. \]

If \(m_k>1\), \(\lambda_k\) is only Gateaux semi-differentiable: \(t\mapsto\lambda_{k,\Omega_{t\theta}}\) is right-differentiable at \(t=0\). Its derivative is the bottom eigenvalue of the matrix \[ M^{\Omega,k}_{ij}(\theta) = -\frac1\beta\int_{\partial\Omega}\frac{\partial u_{k+i-1,\Omega}}{\partial\mathrm{n}}\frac{\partial u_{k+j-1,\Omega}}{\partial\mathrm{n}}\mathrm{n}^\top a\mathrm{n}\theta^\top\mathrm{n}\mathrm{e}^{-\beta V},\qquad\forall\, 1\leqslant i,j\leqslant m_k. \]

Idea of proof

Transport eigenproblem on \(\Omega_{\theta} = \Phi_\theta\Omega\) back to \(\Omega\) via inverse homeomorphism \(\Phi_\theta^{-1} = (\mathrm{Id}+\theta)^{-1}\). One gets the eigenvalue problem on \(\Omega\): \[ -\frac1\beta \mathrm{div}\left(\left|\det \nabla \Phi_\theta\right|\mathrm{e}^{-\beta V\circ \Phi_\theta} \nabla \Phi_\theta^{-\top}a\circ \Phi_\theta \nabla \Phi_\theta^{-1}\nabla u_{k,\Omega_\theta}\circ\Phi_\theta\right) = \lambda_{k,\Omega_\theta}u_{k,\Omega}\circ\Phi_\theta. \] The results can then be obtained via perturbation theory (developed in Kato (2013)).

Perturbation of the eigenvalue \(\lambda_{k,\Omega}\) in the direction \(\theta\) (here \(m_k=3\))

Slopes of the right-tangents are the eigenvalues of \(M^{\Omega,k}(\theta)\) (with multiplicity).

Choice of ascent direction : case of simple eigenvalues

Shape gradient of objective function

Assume \(\lambda_{1,\Omega},\lambda_{2,\Omega}\) are simple, write \(J(\Omega) = \lambda_{2,\Omega}/\lambda_{1,\Omega}\). From the chain rule, \(\theta\mapsto J(\Omega_\theta)\) is Fréchet-differentiable around \(\theta=0\), and \[ DJ(\Omega)\cdot\theta = \int_{\partial\Omega}\phi_J(\Omega)\theta^\top\mathrm{n} \] where \[ \phi_J(\Omega) = -\frac1\beta\left[\lambda_{1,\Omega}\left(\frac{\partial u_{2,\Omega}}{\partial\mathrm{n}}\right)^2-\frac{\lambda_{2,\Omega}}{\lambda_{1,\Omega}^2}\left(\frac{\partial u_{1,\Omega}}{\partial\mathrm{n}}\right)^2\right]\mathrm{n}^\top a\mathrm{n}\mathrm{e}^{-\beta V} \] defined on \(\partial\Omega\) is the shape-gadient of \(J\).

Any perturbation of steepest ascent \(\theta^*\) satisfies \[ \theta^*\in \underset{\|\theta\|_{L^2(\partial\Omega)}=1}{\mathrm{Argmax}} DJ(\Omega)\cdot\theta \implies \theta|_{\partial\Omega}\propto \phi_J(\Omega)\mathrm{n}. \]

Discretization with finite elements: \(\Omega\) is a simplicial mesh, \(u_{k,\Omega}\) is a Rayleigh-Ritz eigenvector for some finite-dimensional space \(V_h(\Omega)\subset L^2(\Omega)\).

Numerical issues:

  • \(\theta \propto \phi_J(\Omega)\mathrm{n}\) is a poor choice because \(\mathrm{n}\) is highly irregular on \(\partial\Omega\).
  • \(\theta\) needs to be defined on \(\Omega\) and not just \(\partial\Omega\).

Extension-regularization procedure

Given a regularization parameter \(\varepsilon_{\mathrm{reg}}>0\), find \(\theta\in H^1(\Omega)^d\) such that \[ \int_{\Omega}\left(\varepsilon^2_{\mathrm{reg}} \nabla\theta:\nabla\psi +\theta^\top\psi\right) = \int_{\partial\Omega}\phi_J(\Omega)\psi^\top\mathrm{n},\qquad\forall\,\psi\in H^1(\Omega)^d. \] such a \(\theta\) is a guaranteed ascent direction.

Formally, Euler-Lagrange equation for the energy functional \[ \mathcal{E}(\theta) = \frac{\varepsilon^2_{\mathrm{reg}}}{2}\int_{\Omega}|\nabla\theta|_{\mathrm{F}}^2 + \frac12\int_{\Omega}|\theta-\phi_J(\Omega)\mathrm{n}\boldsymbol{1}_{\partial\Omega}|^2. \]

Weak form of the regularizing vectorial Neumann problem \[ \left\{\begin{aligned} -\varepsilon^2_{\mathrm{reg}}\Delta \theta + \theta &= 0\text{ in $\Omega$},\\ \varepsilon^2_{\mathrm{reg}}\nabla\theta\mathrm{n}&= \phi_J(\Omega)\mathrm{n}\text{ on $\partial\Omega$}. \end{aligned}\right. \]

Write \(\theta = R_{\varepsilon_{\mathrm{reg}}}\phi_J(\Omega)\) for the (FEM) solution operator associated with this problem.

  • Other choices of regularizations, multi-mesh approaches, etc…
  • Discretization of ascent direction \(\neq\) ascent direction for discretization. Standard line search conditions (Armijo/Wolfe) fail.
  • We normalize \(\theta\) in \(L^2(\partial\Omega)\), and select \(\eta>0\) s.t. \(\Omega' = (\mathrm{Id}+\eta\theta)\Omega\) is a valid mesh.

Case of degenerate eigenvalues

Objective function is non-smooth at \(\Omega\) if \(\lambda_{2,\Omega}\) has multiplicity \(m_2>1\) (say \(m_2=2\)).

Write \[ J(\Omega) = \lambda_{2,\Omega}/\lambda_{1,\Omega},\qquad \tilde K(\Omega) = \lambda_{3,\Omega}/\lambda_{1,\Omega} \] these two shape functionals “cross” at \(\Omega\).

Define the natural shape gradients \(\phi_{J}(\Omega)\), \(\phi_{K}(\Omega)\) as in the simple case.

Extended-regularized Clarke derivative

By computations for the Clarke generalized derivative of operator eigenvalues (Cox (1995)), direction of steepest ascent \(\theta^*\) is proportional to an element of \[ \partial J(\Omega) = \mathrm{conv}\left\{\phi_{J}(\Omega)\mathrm{n},\phi_{K}(\Omega)\mathrm{n}\right\}. \] Apply the extension-regularization operator \(\varepsilon_{\mathrm{reg}}\) from above to define \[ \partial_{\varepsilon_{\mathrm{reg}}} J(\Omega) := R_{\varepsilon_{\mathrm{reg}}} \partial J(\Omega) = \mathrm{conv}\left\{R_{\varepsilon_{\mathrm{reg}}}\phi_{J}(\Omega),R_{\varepsilon_{\mathrm{reg}}}\phi_{K}(\Omega)\right\}. \]

  • In practice, fix \(\varepsilon_{\mathrm{degen}}>0\) and decide that \(m_2=2\) when \((\lambda_{3,\Omega}-\lambda_{2,\Omega})/\lambda_{2,\Omega}\leqslant\varepsilon_{\mathrm{degen}}<(\lambda_{4,\Omega}-\lambda_{2,\Omega})/\lambda_{2,\Omega}\).
  • Replaces true objective with surrogate objective based on assumption of exact degeneracy of eigenvalues.

Choice of ascent direction

Solve the inner optimisation problem \[ \theta \in \mathrm{Argmax}\,\left\{\mu_1(\psi):\,\psi\in\partial_{\varepsilon_{\mathrm{reg}}} J(\Omega),\,\|\psi\|_{L^2(\partial\Omega)}=1\right\}, \] where \(\mu_1(\psi)\) is the Gateaux right-derivative of \(J\) in the direction \(\theta\).

In the end, evaluating this only involves computing the bottom eigenvalue of a \(m_2\times m_2\) matrix which depends linearly on \(\psi\).

Akin to gradient-sampling method.

Effect of nearly-degenerate eigenvalues

Galerkin objective for high-dimensional systems

For systems of interest, \(d\gg 1\). In MD, common to use a collective variable (CV) \(\xi:{\mathbb{{R}}}^d\to{\mathbb{{R}}}^m\) with \(m\ll d\) for interpretation and biasing.

It is natural to consider states defined in CV space: \(\Omega = \xi^{-1}(\Omega^\xi)\) for some \(\Omega^\xi\subset{\mathbb{{R}}}^m\)

Coarse-grained spectrum

Consider the test space \[ \mathcal V_\xi(\Omega^\xi) = \left\{\varphi\circ\xi,\,\varphi\in H_0^1(\Omega_\xi)\right\} \subset H_0^1(\xi^{-1}(\Omega_\xi)), \] introduce the Rayleigh-Ritz (or “coarse-grained”) eigenpairs: \[ Q_\xi(\varphi,\psi;\Omega_\xi)=Q(\varphi\circ\xi,\psi\circ\xi;\xi^{-1}(\Omega_\xi)),\qquad Q_\xi(u_{k,\Omega_\xi}^\xi,u_{\ell,\Omega_\xi}^\xi) = \lambda_{k,\Omega_\xi}^\xi\delta_{k\ell}, \] where \(Q\) is the Rayleigh quotient for \(-\mathcal{L}_\beta\) \[ Q(u,v;\Omega)=\frac1\beta\frac{\int_\Omega \nabla u^\top a \nabla v \mathrm{e}^{-\beta V}}{\int_\Omega uv\mathrm{e}^{-\beta V}}. \]

Proposition (Blassel, Lelièvre, and Stoltz (2025))

\[ \forall\,\varphi,\psi\in\mathcal V_\xi(\Omega^\xi),\qquad Q_\xi(\varphi,\psi;\Omega_\xi) = \frac1\beta\frac{\int_{\Omega_\xi}\nabla\varphi^\top a_\xi \nabla\psi \mathrm{e}^{-\beta F_\xi}}{\int_{\Omega_\xi}\varphi\psi\mathrm{e}^{-\beta F_\xi}}, \] where \(F_\xi,a_\xi\) are the free-energy and effective diffusion through \(\xi\): \[ F_\xi(z) = -\frac1\beta\log\,\int_{\xi^{-1}(z)}\mathrm{e}^{-\beta V}(\det [\nabla\xi^\top\nabla\xi])^{-1/2}\mathrm{d}\sigma,\qquad a_\xi(z) = \int_{\xi^{-1}(z)}\nabla\xi^\top a\nabla\xi \mathrm{d}\mu_z, \] where \(\mu_z\) is the conditional measure of \(\mu\) on the level-set \(\xi^{-1}(z)\).

Error estimate for all \(k\geqslant 1\): \[ \lambda_{k,\xi^{-1}(\Omega_\xi)} - \lambda^\xi_{k,\Omega^\xi} \leqslant\frac1\beta\int_{\xi^{-1}(\Omega)}\nabla\left(u_{k,\xi^{-1}(\Omega_\xi)}-u_{k,\Omega_\xi}^\xi\circ\xi\right)^\top a \nabla\left(u_{k,\xi^{-1}(\Omega_\xi)}-u_{k,\Omega_\xi}^\xi\circ\xi\right)\,\mathrm{e}^{-\beta V} \]

  • Under regularity assumptions on \(\xi\), the shape perturbations formulas for \(\lambda_{k,\Omega}\) also apply to \(\lambda_{k,\Omega_\xi}^\xi\) verbatim!
  • Tractable objective: maximize \(\lambda_{2,\Omega_\xi}^\xi/\lambda_{1,\Omega_\xi}^\xi\).
  • Dynamical interpretation of rates \(\lambda_{1,\Omega_\xi}^\xi\) is dubious, but we are optimizing a ratio.

Example: solvated Alanine dipeptide

Consider alanine-dipeptide in solution \((d=1857)\).

CV \(\xi=(\phi,\psi)\) standard dihedral angles.

Associated free-energy (\(F_\xi\)) landscape at \(T=300K\).

Effective diffusion tensor (\(a_\xi\)).

As \(F_\xi,a_\xi\) are thermodynamic quantities, can be computed efficiently with enhanced sampling (no need for long trajectories). Here, umbrella sampling in Tinker-HP.

Shape-optimisation results

Shape-gradient ascent algorithm implemented in FreeFem++.

Convergence of the Galerkin objective.

Initial and optimized domains.

Back to the original objective

Goal: quantify the gain in timescale separation with respect to free-energy basin (standard state definition).

Simulate process conditioned on not exiting the state using a Fleming-Viot process (in a modded Tinker-HP) with \(N_{\mathrm{proc}}=50\) replicas.

\(\xi\)-marginal of the QSD for the free-energy basin around minimum 5.

\(\xi\)-marginal of the QSD for the optimized around minimum 5.

Decorrelation timescales

For various initial conditions, estimate the law of the process conditioned on not exiting the state by an empirical occupation measure for the Fleming-Viot: \[ \mu_t \approx \frac{1}{N_{\mathrm{proc}}} \sum_{i=1}^{N_\mathrm{proc}} \int_{t-\Delta}^{t+\Delta} \delta_{X_s^{(i)}},\,\mathrm{d}s. \]

Convergence to the QSD: free-energy basin.

Convergence to the QSD: optimized state.

Comparison

\(\Omega\) Exit rate from QSD \((,\,\pm 1.96\sigma\))$ Decorrelation rate Ratio
Free-energy basin \(2.0\times 10^{-3}\pm 1.2\times 10^{-4} \mathrm{ps}\) \(8.8\times10^{-2}\,\mathrm{ps}^{-1}\) \(44.0\pm2.6\)
Optimized state \(3.3\times 10^{-4}\pm 4.7\times10^{-5}\,\mathrm{ps}^{-1}\) \(7.7\times10^{-2}\,\mathrm{ps}^{-1}\) \(233.0\pm33.0\)

Comparison of free-energy basin and optimized state. Shape optimization led to a \(5.3\pm0.8\) improvement in the original objective value.

Some perspectives

  • Methodology can be extended to any reversible dynamics in \(\xi\). Data-driven approaches to obtain better effective dynamics are interesting.
  • Other more theoretical approach: rigorous analytical estimates in the low-temperature/high barrier regime (done in Blassel, Lelièvre, and Stoltz (2024)), for low-dimensional parametrizations of the state. Quantitative guarantees on timescales using semiclassical methods.

Thank you! (Some bibliography on the next slide).

Bibliography

Aristoff, D., M. Johnson, and D. Perez. 2023. “Arbitrarily Accurate, Nonparametric Coarse Graining with Markov Renewal Processes and the Mori–Zwanzig Formulation.” AIP Advances 13 (9): 095131-1 - 11.
Blassel, N., T. Lelièvre, and G. Stoltz. 2024. “Quantitative Low-Temperature Spectral Asymptotics for Reversible Diffusions in Temperature-Dependent Domains.” ArXiv Preprint: 2501.16082.
———. 2025. “Shape Optimization of Metastable States.”
Cox, S. J. 1995. “The Generalized Gradient at a Multiple Eigenvalue.” Journal of Functional Analysis 133 (1): 30–40.
Kato, T. 2013. Perturbation Theory for Linear Operators. Vol. 132. Springer Science & Business Media.
Le Bris, C., T. Lelievre, M. Luskin, and D. Perez. 2012. “A Mathematical Formalization of the Parallel Replica Dynamics.” Monte–Carlo Methods and Applications 18: 119–46.
Perez, D., B. P. Uberuaga, and A. F. Voter. 2015. “The Parallel Replica Dynamics Method – Coming of Age.” Computational Materials Science, Special Issue on Advanced Simulation Methods, 100: 90–103.
Sørensen, M.., and A. F. Voter. 2000. “Temperature-Accelerated Dynamics for Simulation of Infrequent Events.” The Journal of Chemical Physics 112 (21): 9599–9606.
Voter, A. F. 1997. “Hyperdynamics: Accelerated Molecular Dynamics of Infrequent Events.” Physical Review Letters 78 (20): 3908–11.
———. 1998. “Parallel Replica Method for Dynamics of Infrequent Events.” Physical Review B 57 (22): 9599–9606.