Shape-optimization of metastable timescales

Workshop: QSD and related fields

Noé Blassel (joint work with Tony Lelièvre and Gabriel Stoltz)

CERMICS lab, École des Ponts ParisTech, MATHERIALS team, Inria Paris. Funding from ERC EMC2 and ANR SINEQ.

2025-04-22

Molecular Dynamics

Molecular Dynamics (MD) probes the behavior of molecular systems at the scale of classical motion of atoms.

Five picoseconds of Foldit1 (PDB ID 6MRR) simulated with Molly.jl, thanks to Joe Greener.

  • Can replace physical experiments in extreme or expensive conditions with simulation.
  • Long trajectories are necessary to access dynamical properties, and parametrize models on larger scales.
  • Many algorithmic tricks are needed to simulate long trajectories reliably and efficiently, and sample the whole configuration space.
  • Length of sampled trajectories is hindered by metastability/stiffness/separation of timescales, the manifestation of free-energetic barriers larger than typical thermal fluctuations
  • Configurational ensemble has many modes, corresponding to metastable states

The timescale problem in MD

Accessible systems for direct MD Picture by Danny Perez (LANL)

  • Parallelization in space is “easy”, parallelization in time is much harder.
  • Accelerated dynamics methods: (Hyperdynamics (Voter 1997), Temperature Accelerated Dynamics (Sørensen and Voter 2000)) can help.
  • Metastable dynamics jump rarely and suddenly between metastable states.
  • Accelerated MD aims at predicting the sequence of states and transition times correctly by sacrificing resolution of the dynamics inside each state.
  • In the family of spatial coarse-graining methods.
  • We focus on Parallel Replica methods (Voter 1998).

The Parallel Replica Algorithm

  • Exploits local metastability inside a state \(\Omega\subset \mathcal X\) to accelerate sampling of the transition (exit event from \(\Omega\)).
  • Suppose dynamics \(X_0\) has been in \(\Omega\) for a time \(\gg 1\). Then it is in local equilibrium inside \(\Omega\). The remaining time \(T_{\mathrm{exit}}(\Omega)\) before exiting is an exponential random variable.
  • The exit point and exit time are independent: \(X_{T_{\mathrm{exit}}(\Omega)} \perp T_{\mathrm{exit}}(\Omega)\). The exit event \((T_{\mathrm{exit}}(\Omega),X_{T_{\mathrm{exit}}(\Omega)})\) can then be sampled in parallel.
  • Assume that a decorrelation timescale \(T_{\mathrm{corr}}(\Omega)\) is given.

ParRep: decorrelation

Decorrelation step: when the dynamics \(X_t\) enters a metastable state \(\Omega\) at time \(t=0\), evolve during \(T_{\mathrm{corr}}(\Omega)\).

  • If \(X_t\in\Omega\) for all \(0\leqslant t \leqslant T_{\mathrm{corr}}(\Omega)\), \(X_{T_{\mathrm{corr}}}(\Omega)\) is approximately \(\sim\nu_\Omega \in \mathcal M_1(\Omega)\) (the QSD).
  • If \(X_t\) exits at a time \(T_{\mathrm{exit}}(\Omega) < T_{\mathrm{corr}}(\Omega)\), record exit event and continue with sequential dynamics until entering some other state \(\Omega'\).

ParRep: dephasing

Dephasing step: spawn \(N_{\mathrm{proc}}\) independent replicas, and evolve for \(T_{\mathrm{corr}}(\Omega)\), in parallel.

  • During this step, the physical clock is paused.
  • The replicas are asymptotically independent in the limit \(T_{\mathrm{corr}}(\Omega)\to\infty\).

ParRep: conditioning

Rejection/conditioning step: discard replicas which exited during the dephasing step (with or without branching).

  • The remaining \(N_{\mathrm{par}}\) replicas (almost) i.i.d. in local equilibrium.

ParRep: parallel sampling

Parallel sampling step: let \(X^1_{T_{\mathrm{par}}}\) be the state of the first replica to exit \(\Omega\), after an additional time \(T_{\mathrm{par}}\).

  • Physical clock is sped up by a factor \(N_{\mathrm{par}}\).
  • Exit point is unbiased with respect to the sequential dynamics.

Efficiency of ParRep and motivating question

  • Algorithm is efficient if \(\mathbb E_{\nu_\Omega}[T_{\mathrm{exit}}(\Omega)] \gg T_{\mathrm{corr}}(\Omega)\).
  • Natural to set \(T_{\mathrm{corr}}(\Omega) = -\log{\varepsilon_{\mathrm{corr}}}/(\lambda_{2,\Omega}-\lambda_{1,\Omega})\) back to this later.
  • With this choice, speedup in wall-clock time is governed by separation of timescales \(N^*(\Omega) = (\lambda_{2,\Omega}-\lambda_{1,\Omega})/\lambda_{1,\Omega}\) and number of processors \(N_{\mathrm{proc}}\): \[ \frac{N^*(\Omega)-\log\varepsilon_{\mathrm{corr}}}{(N^*(\Omega)/N_{\mathrm{proc}})\mathrm{e}^{-(\log\varepsilon_{\mathrm{corr}})/N^*(\Omega)}-2\log\varepsilon_\mathrm{corr}} \]
  • How to maximize wall-clock time speedup with respect to \(\Omega\)/ maximize \(N^*(\Omega)\) / make \(\Omega\) as locally metastable as possible ?
  • Various theoretical motivations to seek domains satisfying \(N^*(\Omega)\gg 1\) (Aristoff, Johnson, and Perez (2023)).

Efficiency landscape for ParRep Log-speedup in wall-clock time between ParRep and sequential MD

Dynamical setting

Molecular dynamics given by solutions to a stochastic differential equation: \[ \mathrm{d}X_t = -a(X_t)\nabla V(X_t)\,\mathrm{d}t + \frac1\beta\mathrm{div}\,a(X_t)\,\mathrm{d}t + \sqrt{\frac{2}{\beta}}a(X_t)^{1/2}\,\mathrm{d}W_t, \]

  • \(V:{\mathbb{{R}}}^d\to {\mathbb{{R}}}\) potential energy function, \(a:{\mathbb{{R}}}^{d\times d}\) s.p.d. diffusion coefficient, \(W\) standard Brownian motion in \({\mathbb{{R}}}^d\).

  • Ergodic and reversible with respect to the standard Gibbs measure at temperature \((k\beta)^{-1}\) \[ \pi(\mathrm{d}x) = \mathrm{e}^{-\beta V(x)}\,\mathrm{d}x/Z_\beta. \]

  • Continuous-time preconditionned Langevin Monte-Carlo / diffusion on the Riemannian manifold \(\mathcal M=({\mathbb{{R}}}^d,a^{-1})\)

  • Evolution semigroup generator is symmetric on \(L^2({\mathbb{{R}}}^d,\pi)\): \[ \mathcal{L}_\beta \varphi = \frac1\beta\mathrm{e}^{\beta V}\mathrm{div}\,\left(\mathrm{e}^{-\beta V}a\nabla\varphi\right). \]

  • We assume everywhere that \(V,a\in\mathcal W^{2,\infty}_{\mathrm{loc}}\), and \(a\) is locally elliptic: \(a^{-1}\in L^\infty_{\mathrm{loc}}\).

Local approach to metastability

  • For \(\Omega\subset{\mathbb{{R}}}^d\), local equilibrium is given by a quasi-stationary distribution (QSD) in \(\Omega\) i.e. a probability \(\nu_\Omega\in \mathcal M_1(\Omega)\) for which \[ \forall\,A\subset \Omega,\qquad \mathbb P_{\nu_\Omega}\left(X_t \in A\,\middle|\,t<T_{\mathrm{exit}}(\Omega)\right) = \nu_\Omega(A). \]
  • Since \(V\), \(\Omega\) are bounded and \(a\) is uniformly elliptic on \(\Omega\), the QSD is unique, and related to the principal Dirichlet eigenvector: \[ \nu_\Omega(\mathrm{d}x) = \frac{u_{1,\Omega}(x)\mathrm{e}^{-\beta V(x)}\,\mathrm{d}x}{\int_\Omega u_{1,\Omega}\mathrm{e}^{-\beta V}},\qquad -\mathcal{L}_\beta u_{1,\Omega}(x) = \lambda_{1,\Omega}u_{1,\Omega}(x), \]
  • Here, we consider the generator \(\mathcal{L}_\beta\) with absorbing Dirichlet boundary conditions on \(\partial \Omega\): \[ \mathcal D(\mathcal{L}_\beta) = H_0^1(\Omega) \cap \{u\in L^2(\Omega):\,\mathcal{L}_\beta u\in L^2(\Omega)\subset\mathcal D'({\mathbb{{R}}}^d)\}. \] It is self-adjoint with compact resolvent on \(L^2_\beta(\Omega):=L^2(\Omega,\mathrm{e}^{-\beta V(x)}\,\mathrm{d}x)\) for bounded \(\Omega\).

Spectral characterization of timescales

Denote \(\mu_t = \mathrm{Law}\left(X_t \middle| t<T_{\mathrm{exit}}(\Omega)\right)\) the law of the process conditioned on remaining trapped.

  • Theorem (Le Bris et al. (2012))

    \[ \mu_0=\nu_\Omega \implies T_{\mathrm{exit}}(\Omega)\sim \mathcal E(\color{blue}{\lambda_{1,\Omega}}),\qquad T_{\mathrm{exit}}(\Omega) \perp\!\!\!\!\perp X_{T_{\mathrm{exit}(\Omega)}}. \]

  • This result justifies the use of the Parallel Replica algorithm.

  • The exit rate is given by the principal Dirichlet eigenvalue \(\lambda_{1,\Omega} = 1/\mathbb E_{\nu_\Omega}[T_{\mathrm{exit}}(\Omega)]\).

  • Theorem (Le Bris et al. (2012))

    Convergence to the QSD and bias on the exit event are exponentially decaying. There exist \(C_1,C_2>0\) depending on \(\mu_0\) such that \[ \left\|\mu_t-\nu_\Omega\right\|_{\mathrm{TV}} \leqslant C_1 \mathrm{e}^{-(\color{red}{\lambda_{2,\Omega}-\lambda_{1,\Omega}})t}, \] \[ \left\|\mathrm{Law}_{\mu_t}(T_{\mathrm{exit}(\Omega)}-t,X_{T_{\mathrm{exit}}(\Omega)}) - \mathrm{Law}_{\nu_\Omega}(T_{\mathrm{exit}(\Omega)},X_{T_{\mathrm{exit}}(\Omega)})\right\|_{\mathrm{TV}}, \leqslant C_2 \mathrm{e}^{-(\color{red}{\lambda_{2,\Omega}-\lambda_{1,\Omega}})t}. \]

  • The decorrelation rate is given by the spectral gap of the Dirichlet generator \(\lambda_{2,\Omega}-\lambda_{1,\Omega}\).

  • Set \(T_{\mathrm{corr}}(\Omega) = -\log\,\varepsilon_{\mathrm{corr}}/(\lambda_{2,\Omega}-\lambda_{1,\Omega})\) in ParRep, for some fixed tolerance parameter \(0<\varepsilon_{\mathrm{corr}}<1\).

  • In practice, raises the need for quantitative or statistical estimates of \(\lambda_{2,\Omega}-\lambda_{1,\Omega}\) (and also \(C_1,C_2\)).

A spectral shape optimization problem

  • The goal becomes to maximize the efficiency of ParRep by maximizing the separation of timescales: \[ \underset{\Omega \subset {\mathbb{{R}}}^d\text{ open, Lipschitz, bounded}}{\max}\, \left(\color{red}{\lambda_{2,\Omega}-\lambda_{1,\Omega}}\right)/\color{blue}{\lambda_{1,\Omega}}. \]
  • Make the exit rate as small as possible compared to the decorrelation rate. Quantitative measure of local metastability / timescale separation inside \(\Omega\).
  • Standard choice takes \(\Omega=\mathcal A(z_0)\), the basin of attraction of some local minimum \(z_0\) for \(V\) under the steepest-descent dynamics \(\dot X(t) = -a(X(t))\nabla V(X(t))\).
  • The case of the Dirichlet Laplacian \(a=\mathrm{Id}\), \(V=0\) is a famous question in spectral geometry (Payne-Polya-Weinberger conjecture).
  • Question has been identified by MD practitioners, e.g. in Perez, Uberuaga, and Voter (2015), but not studied in depth.
  • Problem is ill-posed, e.g., for \(V(x) = |x|^2/2 + \delta V(x)\), \(\delta V\in \mathcal C^\infty_c({\mathbb{{R}}}^d)\): \[ \lambda_{2,B(0,R)}/\lambda_{1,B(0,R)}\xrightarrow{R\to +\infty} +\infty. \]
  • Focus on local optimization around single energy wells.

Ascent algorithm

  • Need expressions for shape-variations of the Dirichlet eigenvalues, using Hadamard calculus.
  • Given a space of perturbations: \(\theta\in\mathcal W^{1,\infty}({\mathbb{{R}}}^d,{\mathbb{{R}}}^d)\), consider the transported domain \[ \Omega_\theta = \{x+\theta(x),\,x\in\Omega\}. \]
  • Study properties of the eigenvalue perturbation maps \[ \lambda_k:\theta\mapsto \lambda_{k,\Omega_\theta}. \]
  • The spectrum (of \(-\mathcal{L}_\beta\) with Dirichlet b.c. on \(\Omega\)) is enumerated with multiplicity in order of increasing magnitude \[ 0<\lambda_{1,\Omega}<\lambda_{2,\Omega}\leqslant\lambda_{3,\Omega}\leqslant\cdots \] with an associated basis of eigenvectors \[ \int_{\Omega}u_{i,\Omega}u_{j,\Omega}\mathrm{e}^{-\beta V} = \delta_{ij},\qquad \forall\, i,j\geqslant 1. \]

A domain \(\Omega\) and its perturbation \(\Omega_\theta\).

Shape perturbation results

Assume \(\Omega\subset{\mathbb{{R}}}^d\) is a bounded \(\mathcal C^{1,1}\) (or convex Lipschitz) domain and \(\lambda_{k,\Omega}\) has multiplicity \(m_k\geqslant 1\): \[ \lambda_{k-1,\Omega}<\lambda_{k,\Omega}=\cdots=\lambda_{k+m_k-1,\Omega}<\lambda_{k+m_k,\Omega}. \]

Theorem (Blassel, Lelièvre, and Stoltz (2025), in preparation)

If \(m_k=1\), \(\lambda_k\) is \(\mathcal C^1(\mathcal W^{1,\infty})\) in a neighborhood of \(\theta=0\), with Fréchet derivative \[ D\lambda_k \theta = -\frac1\beta \int_{\partial\Omega}\left(\frac{\partial u_{k,\Omega}}{\partial\mathrm{n}}\right)^2\mathrm{n}^\top a\mathrm{n}\theta^\top\mathrm{n}\mathrm{e}^{-\beta V}. \]

If \(m_k>1\), \(\lambda_k\) is only Gateaux semi-differentiable: \(t\mapsto\lambda_{k,\Omega_{t\theta}}\) is right-differentiable at \(t=0\). Its derivative is the bottom eigenvalue of the matrix \[ M^{\Omega,k}_{ij}(\theta) = -\frac1\beta\int_{\partial\Omega}\frac{\partial u_{k+i-1,\Omega}}{\partial\mathrm{n}}\frac{\partial u_{k+j-1,\Omega}}{\partial\mathrm{n}}\mathrm{n}^\top a\mathrm{n}\theta^\top\mathrm{n}\mathrm{e}^{-\beta V},\qquad\forall\, 1\leqslant i,j\leqslant m_k. \]

Idea of proof

Transport variational formulation of eigenvalue PDE on \(\Omega_{\theta} = \Phi_\theta\Omega\) back to \(\Omega\).

Now, \(\theta\)-dependent coefficients on a fixed domain.

Regularity results follow from perturbation theory (developed in Kato (2013)), adapting methods from Haug and Rousselet (1980), Rousselet (1983) for problems in structural mechanics.

Perturbation of the eigenvalue \(\lambda_{k,\Omega}\) in the direction \(\theta\) (here \(m_k=3\))

Slopes of the right-tangents are the eigenvalues of \(M^{\Omega,k}(\theta)\) (with multiplicity).

Sketch of proof

Transport to fixed domain

For \(\|\theta\|_{\mathcal W^{1,\infty}({\mathbb{{R}}}^d)}\) small, \(\Phi_\theta = \mathrm{Id}+\theta\) is a bi-Lipschitz homeomorphism. From weak form of the eigenvalue problem \[ \frac1\beta\int_{\Phi_\theta\Omega} \nabla u_{k,\Omega_\theta}^\top a \nabla v\,\mathrm{e}^{-\beta V} = \lambda_{k,\Omega_\theta}\int_{\Phi_\theta\Omega}u_{k,\Omega_\theta}v\,\mathrm{e}^{-\beta V},\,\forall v\in H_0^1(\Omega_\theta) \] deduce that \(v_\theta:=u_{k,\Omega_\theta}\circ \Phi_\theta\) satisfies \(\forall v\in H_0^1(\Omega)\): \[ \frac1\beta\int_{\Omega}\nabla v_\theta \nabla \Phi_\theta^{-\top} a\circ\Phi_\theta \nabla\Phi^{-1}\nabla v \mathrm{e}^{-\beta V\circ\Phi_\theta}|\det\,\nabla \Phi_\theta| =\lambda_{k,\Omega_\theta}\int_{\Omega}v_\theta v \mathrm{e}^{-\beta V\circ\Phi_\theta}|\det \nabla \Phi_\theta|. \]

Perturbation estimates

In abstract form, write \[ \langle A_\theta v_\theta,v\rangle_{L^2(\Omega)} = \lambda_{k,\Omega_\theta}\langle B_\theta v_\theta,v\rangle_{L^2(\Omega)},\,\forall v\in H_0^1(\Omega), \] with \[ A_\theta = A_0^{1/2}\left(\mathrm{Id} + D^{(1)}A_0\theta + R_{A_0}(\theta)\right)A_0^{1/2},\quad B_\theta = B_0^{1/2}(\mathrm{Id} +D^{(1)}B_0\theta + R_{B_0}(\theta))B_0^{1/2}, \] with \(A_0\), \(B_0\) self-adjoint on \(L^2(\Omega)\) and relative bounds: \[ \|D^{(1)}A_0\theta\|_{A_0},\|D^{(1)}B_0\theta\|_{L^2(\Omega)} = \mathcal O(\|\theta\|_{\mathcal W^{1,\infty}}),\, \|R_{A_0}(\theta)\|_{A_0},\|R_{B_0}(\theta)\|_{L^2(\Omega)}=\mathcal O(\|\theta\|^2_{\mathcal W^{1,\infty}}). \] Taylor expansions on bilinear forms, and representation theorems for (relatively bounded) quadratic forms.

Regularity of auxiliary operators

From these estimates, deduce \(W^{1,\infty}({\mathbb{{R}}}^d)-\mathcal B(L^2_\beta(\Omega))\) Fréchet differentiability for:

The (compact) inverse operators \[ S(\theta) = A_\theta^{-1}B_\theta, \] the Riesz projectors \[ \Pi_\theta = -\frac1{2 i\pi}\oint_{\Gamma}(S(\theta)-\zeta)\,\mathrm{d}\zeta \] for the eigenvalues \(1/\lambda_{k,\Omega_\theta},\dots,1/\lambda_{k+m_k-1,\Omega_\theta}\), and the operator \[ U(\theta) = (\Pi_0\Pi_\theta + (\mathrm{Id}-\Pi_0)(\mathrm{Id}-\Pi_\theta))(\mathrm{Id}-(\Pi_\theta-\Pi_0)^2)^{-1/2}:\Pi_\theta L^2_\beta(\Omega)\to \Pi_0 L^2_\beta(\Omega), \] which satisfies \(DU(0)\cdot\theta=0\).

Computation of Gateaux-derivatives

Apply finite-dimensional perturbation theory to the conjugated operator \[ U(\theta)S(\theta)U(\theta)^{-1}:\Pi_0 L^2_\beta(\Omega)\to \Pi_0 L^2_\beta(\Omega) \] to express the Gateaux right-derivatives of \(\theta\mapsto\lambda_{k+\ell,\Omega_\theta}\) for \(0\leqslant\ell\leqslant m_k -1\) as the eigenvalues of an explicit \(m_k\times m_k\) matrix \(M^{k,\Omega}(\theta)\)

\[ \begin{aligned} M^{\Omega,k}_{ij}(\theta) = &\frac1\beta\int_\Omega \nabla u_{k,\Omega}^{(i)\top} \left(\nabla a^\top \theta -a\nabla \theta -\nabla\theta^\top a\right)\nabla u_{k,\Omega}^{(j)}\mathrm{e}^{-\beta V} \\&+ \int_\Omega \left(\frac1\beta\nabla u_{k,\Omega}^{(i)\top} a \nabla u_{k,\Omega}^{(j)}-\lambda_{k,\Omega}u_{k,\Omega}^{(i)}u_{k,\Omega}^{(j)}\right)\mathrm{div}\left(\theta\mathrm{e}^{-\beta V}\right) \end{aligned} \]

Choice of ascent direction : case of simple eigenvalues

Shape gradient of objective function

Assume \(\lambda_{1,\Omega},\lambda_{2,\Omega}\) are simple, write \(J(\Omega) = \lambda_{2,\Omega}/\lambda_{1,\Omega}\). From the chain rule, \(\theta\mapsto J(\Omega_\theta)\) is Fréchet-differentiable around \(\theta=0\), and \[ DJ(\Omega)\cdot\theta = \int_{\partial\Omega}\phi_J(\Omega)\theta^\top\mathrm{n} \] where \[ \phi_J(\Omega) = -\frac1\beta\left[\lambda_{1,\Omega}\left(\frac{\partial u_{2,\Omega}}{\partial\mathrm{n}}\right)^2-\frac{\lambda_{2,\Omega}}{\lambda_{1,\Omega}^2}\left(\frac{\partial u_{1,\Omega}}{\partial\mathrm{n}}\right)^2\right]\mathrm{n}^\top a\mathrm{n}\mathrm{e}^{-\beta V} \] defined on \(\partial\Omega\) is the shape-gadient of \(J\).

Any perturbation of steepest ascent \(\theta^*\) satisfies \[ \theta^*\in \underset{\|\theta\|_{L^2(\partial\Omega)}=1}{\mathrm{Argmax}} DJ(\Omega)\cdot\theta \implies \theta|_{\partial\Omega}\propto \phi_J(\Omega)\mathrm{n}. \]

Discretization with finite elements: \(\Omega\) is a simplicial mesh, \(u_{k,\Omega}\) is a Rayleigh-Ritz eigenvector for some finite-dimensional space \(V_h(\Omega)\subset L^2(\Omega)\).

Numerical issues:

  • \(\theta \propto \phi_J(\Omega)\mathrm{n}\) is a poor choice because \(\mathrm{n}\) is highly irregular on \(\partial\Omega\).
  • \(\theta\) needs to be defined on \(\Omega\) and not just \(\partial\Omega\).

Extension-regularization procedure

Given a regularization parameter \(\varepsilon_{\mathrm{reg}}>0\), find \(\theta\in H^1(\Omega)^d\) such that \[ \int_{\Omega}\left(\varepsilon^2_{\mathrm{reg}} \nabla\theta:\nabla\psi +\theta^\top\psi\right) = \int_{\partial\Omega}\phi_J(\Omega)\psi^\top\mathrm{n},\qquad\forall\,\psi\in H^1(\Omega)^d. \] such a \(\theta\) is a guaranteed ascent direction.

Formally, Euler-Lagrange equation for the energy functional \[ \mathcal{E}(\theta) = \frac{\varepsilon^2_{\mathrm{reg}}}{2}\int_{\Omega}|\nabla\theta|_{\mathrm{F}}^2 + \frac12\int_{\Omega}|\theta-\phi_J(\Omega)\mathrm{n}\boldsymbol{1}_{\partial\Omega}|^2. \]

Weak form of the regularizing vectorial Neumann problem \[ \left\{\begin{aligned} -\varepsilon^2_{\mathrm{reg}}\Delta \theta + \theta &= 0\text{ in $\Omega$},\\ \varepsilon^2_{\mathrm{reg}}\nabla\theta\mathrm{n}&= \phi_J(\Omega)\mathrm{n}\text{ on $\partial\Omega$}. \end{aligned}\right. \]

Write \(\theta = R_{\varepsilon_{\mathrm{reg}}}\phi_J(\Omega)\) for the (FEM) solution operator associated with this problem.

  • Other choices of regularizations, multi-mesh approaches, etc…
  • Solve then discretize approach: ascent direction \(\neq\) ascent direction for discretization.
  • Normalize \(\theta\) in \(L^2(\partial\Omega)\), and select \(\eta>0\) s.t. \(\Omega' = (\mathrm{Id}+\eta\theta)\Omega\) is a valid mesh.

Case of degenerate eigenvalues

Objective function is non-smooth at \(\Omega\) if \(\lambda_{2,\Omega}\) has multiplicity \(m_2>1\) (say \(m_2=2\)).

Write \[ J(\Omega) = \lambda_{2,\Omega}/\lambda_{1,\Omega},\qquad \tilde K(\Omega) = \lambda_{3,\Omega}/\lambda_{1,\Omega} \] these two shape functionals “cross” at \(\Omega\).

Define the natural shape gradients \(\phi_{J}(\Omega)\), \(\phi_{K}(\Omega)\) as in the simple case.

Extended-regularized Clarke derivative

Clarke generalized subgradient for multiple eigenvalues (Cox (1995)): \[ \partial \lambda_k(A(0)) = \mathrm{conv}\,\left\{\theta\mapsto \left\langle DA\cdot\theta u,u\right\rangle_{\mathcal H},\,A(0)u=\lambda_k(A(0))u,\, \|u\|_{\mathcal H}=1 \right\} \] Look for \(\theta^*\) proportional to an element of the simplex \[ \partial J(\Omega) = \mathrm{conv}\left\{\phi_{J}(\Omega)\mathrm{n},\phi_{K}(\Omega)\mathrm{n}\right\}. \] Apply the extension-regularization operator \(\varepsilon_{\mathrm{reg}}\) from above to define \[ \partial_{\varepsilon_{\mathrm{reg}}} J(\Omega) := R_{\varepsilon_{\mathrm{reg}}} \partial J(\Omega) = \mathrm{conv}\left\{R_{\varepsilon_{\mathrm{reg}}}\phi_{J}(\Omega),R_{\varepsilon_{\mathrm{reg}}}\phi_{K}(\Omega)\right\}. \]

  • In practice, fix \(\varepsilon_{\mathrm{degen}}>0\) and decide that \(m_2=2\) when \((\lambda_{3,\Omega}-\lambda_{2,\Omega})/\lambda_{2,\Omega}\leqslant\varepsilon_{\mathrm{degen}}<(\lambda_{4,\Omega}-\lambda_{2,\Omega})/\lambda_{2,\Omega}\).
  • Replaces true objective with surrogate objective based on assumption of exact degeneracy of eigenvalues.

Choice of ascent direction

Solve the inner optimisation problem \[ \theta \in \mathrm{Argmax}\,\left\{\mu_1(\psi):\,\psi\in\partial_{\varepsilon_{\mathrm{reg}}} J(\Omega),\,\|\psi\|_{L^2(\partial\Omega)}=1\right\}, \] where \(\mu_1(\psi)\) is the Gateaux right-derivative of \(J\) in the direction \(\psi\).

In the end, evaluating this only involves computing the bottom eigenvalue of a \(m_2\times m_2\) matrix which depends linearly on \(\psi\).

Related to gradient-sampling.

Effect of nearly-degenerate eigenvalues

Galerkin objective for high-dimensional systems

For systems of interest, \(d\gg 1\). In MD, common to use a collective variable (CV) \(\xi:{\mathbb{{R}}}^d\to{\mathbb{{R}}}^m\) with \(m\ll d\) for interpretation and biasing.

It is natural to consider states defined in CV space: \(\Omega = \xi^{-1}(\Omega^\xi)\) for some \(\Omega^\xi\subset{\mathbb{{R}}}^m\)

Coarse-grained spectrum

Consider the test space \[ \mathcal V_\xi(\Omega^\xi) = \left\{\varphi\circ\xi,\,\varphi\in H_0^1(\Omega_\xi)\right\} \subset H_0^1(\xi^{-1}(\Omega_\xi)), \] introduce the Rayleigh-Ritz (or “coarse-grained”) eigenpairs: \[ Q_\xi(\varphi,\psi;\Omega_\xi)=Q(\varphi\circ\xi,\psi\circ\xi;\xi^{-1}(\Omega_\xi)),\qquad Q_\xi(u_{k,\Omega_\xi}^\xi,u_{\ell,\Omega_\xi}^\xi) = \lambda_{k,\Omega_\xi}^\xi\delta_{k\ell}, \] where \(Q\) is the Rayleigh quotient for \(-\mathcal{L}_\beta\) \[ Q(u,v;\Omega)=\frac1\beta\frac{\int_\Omega \nabla u^\top a \nabla v \mathrm{e}^{-\beta V}}{\int_\Omega uv\mathrm{e}^{-\beta V}}. \]

Co-area formula gives (Blassel, Lelièvre, and Stoltz (2025))

\[ \forall\,\varphi,\psi\in\mathcal V_\xi(\Omega^\xi),\qquad Q_\xi(\varphi,\psi;\Omega_\xi) = \frac1\beta\frac{\int_{\Omega_\xi}\nabla\varphi^\top a_\xi \nabla\psi \mathrm{e}^{-\beta F_\xi}}{\int_{\Omega_\xi}\varphi\psi\mathrm{e}^{-\beta F_\xi}}, \] where \(F_\xi,a_\xi\) are the free-energy and effective diffusion through \(\xi\): \[ F_\xi(z) = -\frac1\beta\log\,\int_{\xi^{-1}(z)}\mathrm{e}^{-\beta V}(\det [\nabla\xi^\top\nabla\xi])^{-1/2}\mathrm{d}\sigma,\quad a_\xi(z) = \int_{\xi^{-1}(z)}\nabla\xi^\top a\nabla\xi \mathrm{d}\mu_z, \] where \(\mu_z\) is the conditional measure of \(\mu\) on the level-set \(\xi^{-1}(z)\).

Error estimate for all \(k\geqslant 1\): \[ 0 \leqslant\lambda^\xi_k-\lambda_k \leqslant\frac1\beta\int_{\xi^{-1}(\Omega_\xi)}\nabla\left[u_j-u_j^\xi\circ\xi\right]^\top a\nabla\left[u_j-u_j^\xi\circ\xi\right]\mathrm{e}^{-\beta V}. \]

Dynamical interpretation

Rates \(\lambda_{k,\Omega_\xi}^\xi\) correspond to rates for effective dynamics: \[ \mathrm{d}Z_t^\xi = \left(-a_\xi\nabla F_\xi + \frac1\beta\mathrm{div}\,a_\xi\right)(Z_t^\xi)\,\mathrm{d}t + \sqrt{\frac{2}{\beta}}a_\xi(Z_t^\xi)^{1/2}\,\mathrm{d}B_t. \] Shape perturbation results apply verbatim for \(\xi\in\mathcal W^{3,\infty}({\mathbb{{R}}}^d,{\mathbb{{R}}}^m)\).

Example: solvated Alanine dipeptide

Consider alanine-dipeptide in solution \((d=1857)\).

CV \(\xi=(\phi,\psi)\) standard dihedral angles.

Associated free-energy (\(F_\xi\)) landscape at \(T=300K\).

Effective diffusion tensor (\(a_\xi\)).

As \(F_\xi,a_\xi\) are thermodynamic quantities, can be computed efficiently with enhanced sampling methods. Here, umbrella sampling using Tinker-HP and Colvars.

Shape-optimisation results

Shape-gradient ascent with FreeFem++.

Convergence of the Galerkin objective.

Behavior of eigenvalues.

Shape-optimisation results

Shape-gradient ascent algorithm implemented in FreeFem++.

Initial and optimized domains, associated effective QSDs.

Back to the original objective

Quantify the gain in timescale separation with respect to reference state definition, in a practical dynamical setting.

Simulate Fleming-Viot process (in a modded Tinker-HP with 50 replicas), sampling initial configurations in transition regions.

\(\xi\)-marginal of the QSD for \(\mathcal A(z_5)\).

\(\xi\)-marginal for \(\Omega_\xi^*\) around \(z_5\).

Inference of decorrelation timescale

For the various ensembles of initial conditions, infer decorrelation rate at the level of histogram convergence.

Free-energy basin (\(\gamma=2\,\mathrm{ps}^{-1}\)).

Optimized state (\(\gamma=2\,\mathrm{ps}^{-1}\)).

Comparison of free-energy basin and optimized state

State Exit rate \(\pm 1.96\sigma\) (\(\mathrm{ps}^{-1}\)) Decorrelation rate (\(\mathrm{ps}^{-1}\)) Ratio \(\pm 1.96\sigma\) Mixing time (\(\mathrm{ps}\), \(\mathrm{tol}=0.05\))
\(\mathcal{A}(z_5)\ (\gamma = 1\,\mathrm{ps}^{-1})\) \(5.6 \times 10^{-3} \pm 1.6 \times 10^{-4}\) \(0.49\) \(88.1 \pm 2.5\) \(5.8\)
\(\Omega_\xi^*\ (\gamma = 1\,\mathrm{ps}^{-1})\) \(1.8 \times 10^{-3} \pm 8 \times 10^{-5}\) \(0.33\) \(177.5 \pm 7.7\) \(7.8\)
\(\mathcal{A}(z_5)\ (\gamma = 2\,\mathrm{ps}^{-1})\) \(5.4 \times 10^{-3} \pm 1.4 \times 10^{-4}\) \(0.46\) \(85.3 \pm 2.3\) \(7.1\)
\(\Omega_\xi^*\ (\gamma = 2\,\mathrm{ps}^{-1})\) \(1.8 \times 10^{-3} \pm 8.5 \times 10^{-5}\) \(0.33\) \(187.0 \pm 9.0\) \(8.4\)
\(\mathcal{A}(z_5)\ (\gamma = 5\,\mathrm{ps}^{-1})\) \(5.1 \times 10^{-3} \pm 1.4 \times 10^{-4}\) \(0.39\) \(76.9 \pm 2.2\) \(9.0\)
\(\Omega_\xi^*\ (\gamma = 5\,\mathrm{ps}^{-1})\) \(1.5 \times 10^{-3} \pm 8 \times 10^{-5}\) \(0.34\) \(233.0 \pm 13.0\) \(10.8\)
\(\mathcal{A}(z_5)\ (\gamma = 50\,\mathrm{ps}^{-1})\) \(2.0 \times 10^{-3} \pm 1.2 \times 10^{-4}\) \(0.088\) \(44.0 \pm 2.6\) \(41.2\)
\(\Omega_\xi^*\ (\gamma = 50\,\mathrm{ps}^{-1})\) \(3.3 \times 10^{-4} \pm 4.7 \times 10^{-5}\) \(0.077\) \(233.0 \pm 33.0\) \(55.0\)

Use of the optimized state leads to \(2.0\pm 0.1, 2.19\pm 0.12, 3.03\pm 0.19 \times\) improvement in timescale separation with respect to reference state.

Some perspectives

  • Methodology can be extended to any reversible dynamics in \(\xi\). Data-driven approaches to obtain better effective dynamics are interesting.
  • Rigorous analytical estimates in the low-temperature/high barrier regime (done in Blassel, Lelièvre, and Stoltz (2024)), for low-dimensional parametrizations of the state. Quantitative guarantees, using semiclassical methods.

Thank you! (Some bibliography on the next slide).

Bibliography

Aristoff, D., M. Johnson, and D. Perez. 2023. “Arbitrarily Accurate, Nonparametric Coarse Graining with Markov Renewal Processes and the Mori–Zwanzig Formulation.” AIP Advances 13 (9): 095131-1 - 11.
Blassel, N., T. Lelièvre, and G. Stoltz. 2024. “Quantitative Low-Temperature Spectral Asymptotics for Reversible Diffusions in Temperature-Dependent Domains.” ArXiv Preprint: 2501.16082.
———. 2025. “Shape Optimization of Metastable States.”
Cox, S. J. 1995. “The Generalized Gradient at a Multiple Eigenvalue.” Journal of Functional Analysis 133 (1): 30–40.
Haug, E. J., and B. Rousselet. 1980. “Design Sensitivity Analysis in Structural Mechanics. II. Eigenvalue Variations.” Journal of Structural Mechanics 8 (2): 161–86.
Kato, T. 2013. Perturbation Theory for Linear Operators. Vol. 132. Springer Science & Business Media.
Le Bris, C., T. Lelievre, M. Luskin, and D. Perez. 2012. “A Mathematical Formalization of the Parallel Replica Dynamics.” Monte–Carlo Methods and Applications 18: 119–46.
Perez, D., B. P. Uberuaga, and A. F. Voter. 2015. “The Parallel Replica Dynamics Method – Coming of Age.” Computational Materials Science, Special Issue on Advanced Simulation Methods, 100: 90–103.
Rousselet, B. 1983. “Shape Design Sensitivity of a Membrane.” Journal of Optimization Theory and Applications 40: 595–623.
Sørensen, M.., and A. F. Voter. 2000. “Temperature-Accelerated Dynamics for Simulation of Infrequent Events.” The Journal of Chemical Physics 112 (21): 9599–9606.
Voter, A. F. 1997. “Hyperdynamics: Accelerated Molecular Dynamics of Infrequent Events.” Physical Review Letters 78 (20): 3908–11.
———. 1998. “Parallel Replica Method for Dynamics of Infrequent Events.” Physical Review B 57 (22): 9599–9606.