# Fluctuations in the heterogeneous multiscale methods for fast–slow systems

David Kelly and Eric Vanden-Eijnden

**4**:23

https://doi.org/10.1186/s40687-017-0112-2

© The Author(s) 2017

**Received:** 11 April 2016

**Accepted:** 22 May 2017

**Published:** 1 December 2017

## Abstract

How heterogeneous multiscale methods (HMM) handle fluctuations acting on the slow variables in fast–slow systems is investigated. In particular, it is shown via analysis of the central limit theorem (CLT) and the large deviation principle (LDP) that the standard version of HMM artificially amplifies these fluctuations. A simple modification of HMM, termed parallel HMM, is introduced and is shown to remedy this problem, capturing fluctuations correctly both at the level of the CLT and the LDP. All results in this article assume the HMM speedup factor \(\lambda \) to be constant and in particular independent of the scale parameter \(\varepsilon \). Similar arguments can also be used to show that the \(\tau \)-leaping method used in the context of Gillespie’s stochastic simulation algorithm for Markov jump processes captures the right CLT and LDP for these processes.

## 1 Background

*x* variable. This averaging principle is akin to the law of large numbers (LLN) in the present context, and it suggests simulating the evolution of the slow variables using (1.2) rather than (1.1) when \(\varepsilon \) is small. This requires estimating *F*(*x*), which typically has to be done on-the-fly given the current value of the slow variables. To this end, note that if Euler’s method with time step \(\Delta t\) is used as integrator for the slow variables in (1.1), we can approximate \(X^\varepsilon (n\Delta t)\) by \(x_n\) satisfying the recurrence

*x*. If \(\varepsilon \) is small enough that \(\Delta t / \varepsilon \) is larger than the mixing time of \(Y_x^\varepsilon \), the Birkhoff integral in (1.4) is in fact close to the averaged coefficient in (1.2), in the sense that

*F*(*x*) using only a fraction of the macro-time step. In particular, we expect that

The approximations in (1.5) or (1.7) are perfectly reasonable if we are only interested in staying faithful to the averaged equation (1.2); that is to say, HMM-type approximations will have the correct law of large numbers (LLN) behavior. However, the fluctuations about that average will be enhanced: their variance is inflated by a factor of \(\lambda \). This is quite clear from the interpretation (1.7), since in the original model (1.1), the local fluctuations about the average are of order \(\sqrt{\varepsilon }\) and in (1.7) they are of order \(\sqrt{\varepsilon \lambda }\). The large fluctuations about the average caused by rare events are similarly inflated by a factor of \(\lambda \). This can be an issue, for example, in metastable fast–slow systems, where the large fluctuations about the average determine the waiting times for transitions between metastable states. In particular, we shall see that an HMM-type scheme drastically decreases these waiting times due to the enhanced fluctuations.
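The variance inflation can be illustrated with a toy calculation that is not taken from the article: a single random variable multiplied by \(\lambda \) (one short burst reused \(\lambda \) times) is compared with a sum of \(\lambda \) independent copies. The standard normal samples and the function name `compare_variances` are illustrative assumptions; the sketch only shows that the first construction has variance of order \(\lambda ^2\) and the second of order \(\lambda \), so their ratio is \(\lambda \).

```python
import numpy as np

def compare_variances(lam, n_samples=200_000, seed=0):
    """Compare Var(lam * xi_1) with Var(xi_1 + ... + xi_lam) for i.i.d. xi_j.

    The xi_j are standard normal here purely for illustration (an assumption);
    they stand in for the contribution of one short burst of the fast variables.
    """
    rng = np.random.default_rng(seed)
    xi = rng.standard_normal((n_samples, lam))
    reused_burst = lam * xi[:, 0]         # one burst multiplied by lam
    independent_bursts = xi.sum(axis=1)   # lam independent bursts added up
    return reused_burst.var(), independent_bursts.var()

var_reused, var_independent = compare_variances(lam=8)
# Expected to be near lam**2 = 64 and lam = 8, respectively.
print(var_reused, var_independent, var_reused / var_independent)
```

The ratio of the two variances is close to \(\lambda \), which matches the \(\sqrt{\varepsilon \lambda }\) versus \(\sqrt{\varepsilon }\) scaling of the standard deviations quoted above.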

*essentially* replacing a sum of \(\lambda \) weakly correlated random variables with one random variable, multiplied by \(\lambda \). This introduces correlations that should not be there and in particular results in enhanced fluctuations. In (1.8), we instead replace the sum of \(\lambda \) weakly correlated random variables with a sum of \(\lambda \) independent random variables. This is a much more reasonable approximation to make, since these random variables become less and less correlated as \(\varepsilon \) gets smaller. Since the terms appearing on the right-hand side are independent of each other, they can be computed in parallel. Thus, if one has \(\lambda \) CPUs available, the wall-clock time of the computation is identical to that of HMM. For this reason, we call the modification the parallelized HMM (PHMM). Note that, in analogy to (1.7), one can interpret PHMM as approximating (1.1) by the system

*O*(1) timescales), we do investigate numerically what happens on much larger timescales and find that the PHMM performs quite well in the particular examples we study. We stress, however, that such an extension will not be possible in general; see the discussion for further details.

We also stress that the theoretical results of this article are all obtained under the approximation scenario above, namely that we have discretized the slow variables and worked with fast variables \(Y_x^\varepsilon \) that solve the exact evolution equations, but with frozen *x* variables. In practice, one must in general discretize the fast variables, which adds another layer of complexity to the analysis of fluctuations. We restrict ourselves to the simpler theoretical setting for the sake of simplicity and to ensure ‘proof of concept.’ In the numerical investigations, we find that the theoretical results derived in the above scenario are robust even with relatively crude fast integrators. It has been shown [20] that numerical schemes do not always inherit the mixing properties of the underlying evolution equation. Thus, in some situations, it is advisable to use sophisticated methods that are known to capture longtime statistics [5]. Due to the relative simplicity of the numerical models studied in this article, we do not encounter this problem and hence can employ simple methods.

It is important to note that the averaging approximation in (1.5) still holds for a class of \(\lambda \) that is \(\varepsilon \)-dependent, provided that \(\varepsilon \lambda \rightarrow 0\) as \(\varepsilon \rightarrow 0\). This is clearly a computational benefit, with greater timescale separation leading to greater computational speedup. In this article, we will assume for simplicity that \(\lambda \) does not depend on \(\varepsilon \), and will comment on the results in the \(\varepsilon \)-dependent case in Appendix 1.

The outline of the remainder of this article is as follows. In Sect. 2, we recall the averaging principle for stochastic fast–slow systems and describe how to characterize the fluctuations about this average, including local Gaussian fluctuations and large deviation principles. In Sect. 3, we recall the HMM-type methods. In Sect. 4, we show that they lead to enhanced fluctuations. In Sect. 5, we introduce the PHMM modification, and in Sect. 6, we show that this approximation yields the correct fluctuations, both in terms of local Gaussian fluctuations and large deviations. In Sect. 7, we test PHMM for a variety of simple models and conclude in Sect. 8 with a discussion.

## 2 Average and fluctuations in fast–slow systems

*W* is a standard Wiener process in \(\mathbb {R}^e\). We assume that for every \(x \in \mathbb {R}^d\), the Markov process described by the SDE

In this section, we briefly recall the averaging principle for stochastic fast–slow systems and discuss two results that characterize the fluctuations about the average, the central limit theorem (CLT) and the large deviation principle (LDP).

### 2.1 Averaging principle

*x* and almost surely every initial condition \(Y^\varepsilon (0)\) (a.s. with respect to \(\mu _x\)) as well as every realization of the Brownian paths driving the fast variables. Details of this convergence result in the setting above are given in (for instance) [15, Chapter 7.2].

### 2.2 Small fluctuations: CLT

*small* fluctuations of \(X^\varepsilon \) about the averaged system \(\bar{X}\) can be understood by characterizing the limiting behavior of

*Z* defined by the SDE

*V* is a standard Wiener process, \(B_0 := B_1 + B_2\) with

*y*. By the Fredholm alternative, the \(O(\varepsilon ^{-1/2})\) identity has a solution \(u_1\) which has the Feynman–Kac representation

*O*(1) identity against the invariant measure corresponding to \(\mathscr {L}_0\), we obtain

### 2.3 Large fluctuations: LDP

A large deviation principle (LDP) for the fast–slow system (2.1) quantifies probabilities of *O*(1) fluctuations of \(X^\varepsilon \) away from the averaged trajectory \(\bar{X}\). The probability of such events vanishes exponentially quickly and as a consequence is not accounted for by the CLT fluctuations; hence, an LDP accounts for the *rare events*.

*large deviation principle* (LDP) with action functional \(\mathscr {S}_{[0,T]}\) if for any set \(\Gamma \subset \{ \gamma \in C([0,T], \mathbb {R}^d) : \gamma (0) = x \}\) we have

*O*(1) fluctuations that occur on large timescales, such as the probability of transition from one metastable set to another. For example, suppose that \(X^\varepsilon \) is known to satisfy an LDP with action functional \(\mathscr {S}_{[0,T]}\). Let *D* be an open domain in \(\mathbb {R}^d\) with smooth boundary \(\partial D\) and let \(x^* \in D\) be an asymptotically stable equilibrium for the averaged system \(\dot{\bar{X}} = F(\bar{X})\). When \(\varepsilon \ll 1\), we expect that a trajectory of \(X^\varepsilon \) that starts in *D* will tend toward the equilibrium \(x^*\) and exhibit \(O(\sqrt{\varepsilon })\) fluctuations about the equilibrium; these fluctuations are described by the CLT. On very large timescales, these small fluctuations have a chance to ‘pile up’ into an *O*(1) fluctuation, producing behavior of the trajectory that would be considered impossible for the averaged system. Such fluctuations are not accurately described by the CLT and require the LDP instead. For example, the asymptotic behavior of escape time from the domain *D*,

*quasi-potential* defined by

*Varadhan’s lemma* states that if a process \(X^\varepsilon \) is known to satisfy an LDP with some associated Hamiltonian, then for any \(\varphi : \mathbb {R}^d \rightarrow \mathbb {R}\) we have the generalized Laplace method-type result

(*x*, *t*) and a suitable class of \(\varphi \), then the inverse Varadhan’s lemma states that \(X^\varepsilon \) satisfies an LDP with action functional given by (2.8), (2.9). Hence, we can use (2.11) to determine the action functional for a given process.

In the next few sections, we will exploit both sides of Varadhan’s lemma when investigating the large fluctuations of the HMM and related schemes. More complete discussions on Varadhan’s lemma can be found in [13, Chapters 4.3, 4.4].

## 3 HMM for fast–slow systems

When applied to the stochastic fast–slow system (2.1), HMM-type schemes rely on the fact that the slow \(X^\varepsilon \) variables, and the coefficients that govern them, converge to a set of reduced variables as \(\varepsilon \) tends to zero. We will describe the simplest version of the method below, which is more convenient to deal with mathematically.

Before proceeding, we digress briefly on notation. When referring to continuous time variables, we will always use uppercase symbols (\(X^\varepsilon ,Y^\varepsilon \), etc.), and when referring to discrete time approximations, we will always use lowercase symbols (\(x^\varepsilon _n\), \(y^\varepsilon _n\), etc.). We will also encounter continuous time variables whose definition depends on the integer *n* for which we have \(t \in [n\Delta t, (n+1)\Delta t)\). We will see below that such continuous time variables are used to define discrete time approximations. In this situation, we will use uppercase symbols with a subscript *n* (e.g., \(X^\varepsilon _n\)).

- 1. (Micro-step) Integrate the fast variables over the interval \(I_{n,\Delta t}\), with the slow variable frozen at \(X^\varepsilon = x^\varepsilon _n\). That is, the fast variables are approximated by

  $$\begin{aligned} Y^\varepsilon _{n}(t) = Y^\varepsilon _n(n\Delta t) + \frac{1}{\varepsilon } \int _{n\Delta t}^t g(x^\varepsilon _n,Y_{n}^\varepsilon (s))\text {d}s + {\frac{1}{\sqrt{\varepsilon }}}\int _{n\Delta t}^t \sigma (x^\varepsilon _n, Y^{\varepsilon }_n(s)) \text {d}W(s) \end{aligned}\tag{3.1}$$

  for \(n\Delta t \le t \le (n+ 1/\lambda )\Delta t \) with some \(\lambda \ge 1\) (that is, we do not necessarily integrate the \(Y_n^\varepsilon \) variables over the whole time window). Due to the ergodicity of \(Y_x\), the initialization of \(Y^\varepsilon _n\) is not crucial to the performance of the algorithm. It is, however, convenient to use \(Y^\varepsilon _{n+1}((n+1)\Delta t) = Y^\varepsilon _n((n+ 1/\lambda )\Delta t)\), since this reinitialization leads to the interpretation of the HMM scheme given in (3.5) below.
- 2. (Macro-step) Use the time series from the micro-step to update \(x^\varepsilon _n\) to \(x^\varepsilon _{n+1}\) via

  $$\begin{aligned} x^{\varepsilon }_{n+1} = x^\varepsilon _n + \lambda \int _{n\Delta t}^{(n+1/\lambda )\Delta t} f(x^\varepsilon _n,Y^\varepsilon _n(s)) \text {d}s. \end{aligned}\tag{3.2}$$

  Note that we do not require \(Y^\varepsilon _n\) over the whole \(\Delta t\) time step, but only a fraction of the step large enough for \(Y^\varepsilon _n\) to mix. Indeed, if \(\varepsilon \) is small enough, we have the approximate equality

  $$\begin{aligned} \frac{\lambda }{ \Delta t}\int _{n\Delta t}^{(n+1/\lambda )\Delta t} f(x^\varepsilon _n,Y^\varepsilon _n(s)) \text {d}s \approx \frac{1}{\Delta t}\int _{n\Delta t}^{(n+1)\Delta t} f(x^\varepsilon _n,Y^\varepsilon _n(s)) \text {d}s \end{aligned}$$

  since both sides are close to the ergodic mean \(\int f(x^\varepsilon _n , y) d\mu _{x^\varepsilon _n}(y )\).

*speedup factor* of HMM. Note that \(\lambda \) can only take moderate values for the above method to be justifiable; in particular, we require that \(\varepsilon \lambda \ll 1\).
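The micro-step and macro-step above can be sketched in code as follows. This is a minimal illustration, not the authors' implementation: the fast variables are advanced with Euler–Maruyama at a micro-step `dt_micro`, the time integral in (3.2) is replaced by a left-point Riemann sum, and the scalar toy model (\(f(x,y) = -y\), \(g(x,y) = -(y-x)\), \(\sigma = 1\), whose averaged equation is \(\dot{\bar{X}} = -\bar{X}\)) is an assumption chosen only so that the example runs. The function name `hmm_step` is hypothetical.

```python
import numpy as np

def hmm_step(x, y, f, g, sigma, eps, dt_macro, dt_micro, lam, rng):
    """One HMM macro-step for a scalar slow variable x and scalar fast variable y.

    The fast variable is integrated over a window of length dt_macro / lam
    with x frozen (micro-step); x is then updated with the window integral
    of f multiplied by lam (macro-step), mimicking (3.2).
    """
    n_micro = int(round(dt_macro / (lam * dt_micro)))
    integral = 0.0
    for _ in range(n_micro):
        integral += f(x, y) * dt_micro                 # Riemann sum for the integral of f
        dW = np.sqrt(dt_micro) * rng.standard_normal()
        y = y + g(x, y) * dt_micro / eps + sigma(x, y) * dW / np.sqrt(eps)
    return x + lam * integral, y                       # final y re-used as next initial value

# Toy model (illustrative assumption, not from the article).
def f(x, y):
    return -y

def g(x, y):
    return -(y - x)

def sigma(x, y):
    return 1.0

rng = np.random.default_rng(1)
x, y = 1.0, 1.0
for n in range(20):                                    # integrate up to time 20 * 0.05 = 1
    x, y = hmm_step(x, y, f, g, sigma, eps=1e-3,
                    dt_macro=0.05, dt_micro=1e-4, lam=5, rng=rng)
x_hmm = x
print(x_hmm, np.exp(-1.0))                             # HMM value next to the averaged solution
```

With these parameters each micro-window spans ten correlation times of the fast process, so the slow variable tracks the averaged solution up to fluctuations; the size of those fluctuations is the subject of the next section.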

## 4 Average and fluctuations in HMM methods

In this section, we investigate whether the limit theorems discussed in Sect. 2, i.e., the averaging principle, the CLT fluctuations and the LDP fluctuations, are also valid in the HMM approximation for a fast–slow system. We will see that the averaging principle is the only property that holds, and that both types of fluctuations are *inflated* by the HMM method. It is important to note that the theory developed in this section (and likewise in Sect. 6) is to understand the averaging and fluctuation properties of the HMM approximation (3.2), where the slow variables have been discretized, but the fast variables have not. We do not make any theoretical claims about the fully discretized case. We also note that the LLN, CLT and LDP results derived for (3.2) can be used to derive the same results for the non-discretized system (3.5). In particular, the CLT and LDP of (3.5) are not the same as those of the original fast–slow system (2.1).

### 4.1 Averaging

By construction, HMM-type schemes capture the correct averaging principle. More precisely, if we take \(\varepsilon \rightarrow 0\), then the sequence \(x^\varepsilon _n\) converges to some \(\bar{x}_n\), where \(\bar{x}_n\) is a numerical approximation of the true averaged system \(\bar{X}\). If this numerical approximation is well posed, the limits \(\varepsilon \rightarrow 0\) and \(\Delta t \rightarrow 0\) commute with one another. Hence, the HMM approximation \(x^\varepsilon _n\) is consistent, in that it features approximately the same averaging behavior as the original fast–slow system.

Introducing an integrator into the micro-step will make things more complicated, as the invariant measures appearing will be those of the discretized fast variables. In [20], it is shown that discretizations of SDEs often do not possess the ergodic properties of the original system. For those situations where no such issues arise, rigorous arguments concerning this scenario, including rates of convergence for the schemes, are given in [25].

### 4.2 Small fluctuations

As above, by consistency we mean that when we take \(\varepsilon \rightarrow 0\), the sequence \(\{z^\varepsilon _n\}_{n\ge 0}\) converges to some well-posed discretization of the SDE (4.1). Since \(Z(0) = 0\), it is easy to see that the solution to this equation is simply \(\sqrt{\lambda }\) times the solution of (2.4). Hence, the fluctuations of the HMM-type scheme are inflated by a factor of \(\sqrt{\lambda }\).

*x* as a variable. In doing so, we obtain \(\widehat{Z}^\varepsilon _n \rightarrow \widehat{Z}_n\) (in distribution) as \(\varepsilon \rightarrow 0\), where

### 4.3 Large fluctuations

*O*(1) fluctuations of HMM were consistent with those of the fast–slow system, we would expect \(u_{\lambda ,\Delta t}\) to converge to the solution of (2.10) as \(\Delta t \rightarrow 0\). Instead, we find that as \(\Delta t\rightarrow 0\), \(u_{\lambda ,\Delta t}(t,x)\) converges to the solution to the Hamilton–Jacobi equation

In light of the discussion in Sect. 2.3, the inverse Varadhan’s lemma suggests that the HMM scheme is consistent with the wrong LDP. Before proving this claim, we first discuss some implications.

*x* variable in the fast process is frozen to its value at the left endpoint of the interval and hence is treated as a parameter on each interval. We also introduce the operator \(S_{t} \psi (x) = S^{(\alpha )}_{t} \psi (x) |_{\alpha = x}\) and also \(S_{\lambda , t} = \lambda ^{-1} S_{ t} (\lambda \cdot )\). In this notation, it is simple to show that

*k* with \(n \ge k \ge 1\), then

### Remark 4.1

Regarding the operation of taking the log-asymptotic result inside the expectation, one can find such calculations done rigorously in (for instance) [15, Lemma 4.3].

### Remark 4.2

From the discussion above, it appears that the mean transition time can be estimated from HMM upon exponential rescaling; see (4.3). This is true, but only at the level of the (rough) log-asymptotic estimate of this time. How to rescale the prefactor is by no means obvious. As we will see below, PHMM avoids this issue altogether since it does not necessitate any rescaling.
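The exponential rescaling mentioned in this remark can be written out at the log-asymptotic level. The following is a sketch under two assumptions: that the mean escape time of the true system obeys the usual exponential estimate governed by a quasi-potential barrier, and that the HMM quasi-potential is the true one divided by \(\lambda \), as stated in the discussion in Sect. 8. The symbols \(\tau ^\varepsilon \), \(\tau ^\varepsilon _\lambda \) (escape times of the true system and of HMM) and \(\mathscr {V}\) (barrier height) are notation introduced here for illustration only.

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0} \varepsilon \log \mathbb {E}\, \tau ^\varepsilon = \mathscr {V}, \qquad \lim _{\varepsilon \rightarrow 0} \varepsilon \log \mathbb {E}\, \tau ^\varepsilon _\lambda = \frac{\mathscr {V}}{\lambda } \quad \Longrightarrow \quad \log \mathbb {E}\, \tau ^\varepsilon \approx \lambda \, \log \mathbb {E}\, \tau ^\varepsilon _\lambda . \end{aligned}$$

Only the exponents are related in this way; any sub-exponential prefactor is lost when taking logarithms, which is why the rescaling of the prefactor remains unclear.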

## 5 Parallelized HMM

*j*, the time series \(Y^\varepsilon _{n}\) on the interval \([(n+j/\lambda )\Delta t ,(n+(j+1)/\lambda )\Delta t]\) is replaced with an identical copy of the time series from the interval \([n \Delta t ,(n+1/\lambda )\Delta t]\). This introduces strong correlations between random variables that should be essentially independent. Parallelized HMM avoids this issue by employing the approximation

*j* independent copies of the time series computed in (5.1). Due to their independence, each copy of the fast variables can be computed in parallel; hence, we refer to the method as parallel HMM (PHMM). The method is summarized below.

- 1. (Micro-step) On the interval \(I_{n,\Delta t}\), simulate \(\lambda \) independent copies of the fast variables, each copy simulated precisely as in the usual HMM. That is, let

  $$\begin{aligned} Y^{\varepsilon ,j}_n(t) = Y^{\varepsilon ,j}_n(n\Delta t) + \frac{1}{\varepsilon } \int _{n\Delta t}^t g(x^\varepsilon _n,Y^{\varepsilon ,j}_n(s))\text {d}s + {\frac{1}{\sqrt{\varepsilon }}}\int _{n\Delta t}^t \sigma (x^\varepsilon _n, Y^{\varepsilon ,j}_n(s)) \text {d}W_j(s) \end{aligned}\tag{5.2}$$

  for \(j=1,\dots ,\lambda \) with \(W_j\) independent Brownian motions. As with ordinary HMM, we will not require the time series of the whole interval \(I_{n,\Delta t}\) but only over the subset \([n\Delta t, (n + 1/\lambda )\Delta t )\).
- 2. (Macro-step) Use the time series from the micro-step to update \(x^\varepsilon _n\) to \(x^\varepsilon _{n+1}\) by

  $$\begin{aligned} x^\varepsilon _{n+1} = x^\varepsilon _n + \sum _{j=1}^\lambda \int _{n\Delta t}^{(n+1/\lambda )\Delta t} f(x^\varepsilon _n,Y^{\varepsilon ,j}_n(s)) \text {d}s. \end{aligned}\tag{5.3}$$
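The two PHMM steps above can be sketched as follows. As before, this is an illustration under stated assumptions rather than the authors' code: the \(\lambda \) replicas are stored in a NumPy array and advanced together with Euler–Maruyama (in practice each replica could run on its own CPU), the integral in (5.3) is approximated by a Riemann sum, and the toy coefficients \(f(x,y) = -y\), \(g(x,y) = -(y-x)\), \(\sigma = 1\) are hypothetical choices. The function name `phmm_step` is likewise hypothetical.

```python
import numpy as np

def phmm_step(x, ys, f, g, sigma, eps, dt_macro, dt_micro, rng):
    """One PHMM macro-step; ys holds lam independent replicas of the fast variable.

    Every replica is integrated over a window of length dt_macro / lam with x
    frozen and its own Brownian increments; x is updated with the SUM of the
    lam window integrals, mimicking (5.3), instead of lam times one of them.
    """
    lam = ys.shape[0]
    n_micro = int(round(dt_macro / (lam * dt_micro)))
    integral = 0.0
    for _ in range(n_micro):
        integral += np.sum(f(x, ys)) * dt_micro        # sum over replicas j
        dW = np.sqrt(dt_micro) * rng.standard_normal(lam)
        ys = ys + g(x, ys) * dt_micro / eps + sigma(x, ys) * dW / np.sqrt(eps)
    return x + integral, ys

# Toy model (illustrative assumption, not from the article), vectorized over replicas.
def f(x, y):
    return -y

def g(x, y):
    return -(y - x)

def sigma(x, y):
    return np.ones_like(y)

rng = np.random.default_rng(2)
lam = 5
x, ys = 1.0, np.full(lam, 1.0)
for n in range(20):                                    # integrate up to time 1
    x, ys = phmm_step(x, ys, f, g, sigma, eps=1e-3,
                      dt_macro=0.05, dt_micro=1e-4, rng=rng)
x_phmm = float(x)
print(x_phmm, np.exp(-1.0))                            # PHMM value next to the averaged solution
```

Each replica keeps its own final state as the initial condition for the next macro-step, so the replicas remain independent of one another throughout the simulation.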

## 6 Average and fluctuations in parallelized HMM

In this section, we check that the averaged behavior and the fluctuations in the PHMM method are consistent with those in the original fast–slow system. Just as noted at the beginning of Sect. 4, the LLN, CLT and LDP results derived for (5.3) can be extended to the non-discretized system (3.5). In particular, the CLT and LDP of the PHMM approximation (5.4) are the same as those of the original fast–slow system (2.1).

### 6.1 Averaging

### 6.2 Small fluctuations

*j*, the third term becomes

### 6.3 Large fluctuations

*u* solves the correct Hamilton–Jacobi equation (2.10).

*j*, the Hamiltonian reduces to

It follows that

and hence \(\lambda ^{-1} \widehat{S}^{(\alpha )}_{\Delta t} (\lambda \varphi ) = S_{\Delta t}^{(\alpha )}\varphi \). Combining this with (6.8) completes the claim for \(n=1\). The proof of the inductive step for arbitrary \(n\ge 1\) follows identically to Sect. 4.3.

## 7 Numerical evidence

In this section, we investigate the performance of the standard HMM and PHMM methods for systems with well-understood fluctuations and metastability properties. These simple experiments confirm that HMM amplifies fluctuations, which can drastically change the system’s metastable behavior, and that the PHMM succeeds in avoiding these problems. In Sect. 7.1, we investigate simple CLT fluctuations for a simple quadratic potential system; in Sect. 7.2, we look at large deviation fluctuations for a quartic double-well potential. Finally, in Sect. 7.3, we look at fluctuations for a non-diffusive double-well potential, which has large deviation properties that cannot be captured by a so-called ‘small noise’ diffusion.

In all of the experiments below, we use the numerical approximation (3.3), (3.4) with macro-step \(\Delta t\) and micro-step \(\delta t\) as specified for each experiment. The number of micro-steps is accordingly \(M = \lfloor \Delta t / (\lambda \delta t) \rfloor \). At the start of each micro-integration, the fast variables are initialized using their final value at the previous micro-step. As stated in the introduction, with the specific choice of \(\Delta t\) and \(\delta t\) for which \(M=1\), this initialization corresponds to performing an Euler–Maruyama approximation of the inflated system of the type (1.7). This choice is used for the experiment in Sect. 7.3.
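As a small worked example of the formula \(M = \lfloor \Delta t / (\lambda \delta t) \rfloor \), the helper below computes the number of micro-steps. The function name and the parameter values are illustrative assumptions, not values used in the article; powers of two are chosen so that the floating-point division is exact.

```python
import math

def num_micro_steps(dt_macro, dt_micro, lam):
    """Number of micro-steps M = floor(dt_macro / (lam * dt_micro))."""
    return math.floor(dt_macro / (lam * dt_micro))

# Illustrative values only (powers of two avoid floating-point round-off).
M = num_micro_steps(dt_macro=2**-4, dt_micro=2**-10, lam=4)
M_single = num_micro_steps(dt_macro=2**-4, dt_micro=2**-6, lam=4)
print(M, M_single)  # prints "16 1"
```

The second call illustrates the special case \(M=1\), where \(\delta t = \Delta t / \lambda \). For decimal step sizes such as 0.01, round-off can push the quotient just below an integer, so rounding before taking the floor may be preferable in practice.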

We also note that the Euler–Maruyama scheme was chosen due to the relative simplicity of the underlying fast–slow system. In general, to ensure that numerical CLT and LDP results are faithful to the original fast–slow system, it may be advisable to use more sophisticated integrators for the fast variables, such as Störmer–Verlet-type methods [5].
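To give a feel for the kind of comparison carried out in this section, the following sketch estimates the variance of the slow variable under HMM and PHMM for a hypothetical linear toy model (\(f(x,y) = -y\), \(g(x,y) = -(y-x)\), \(\sigma = 1\)). The model, all parameter values and the function name `slow_variance` are assumptions made for illustration; they are not the systems or settings studied in the article. An ensemble of independent trajectories is advanced in a vectorized way with Euler–Maruyama micro-steps.

```python
import numpy as np

def slow_variance(lam, parallel, eps=1e-3, dt_macro=0.02, dt_micro=2e-4,
                  n_macro=200, n_traj=4000, seed=0):
    """Ensemble variance of the slow variable at the final time.

    parallel=False: standard HMM (one burst per step, multiplied by lam).
    parallel=True:  PHMM (lam independent bursts per step, summed).
    Toy model (assumption): f(x, y) = -y, g(x, y) = -(y - x), sigma = 1.
    """
    rng = np.random.default_rng(seed)
    n_rep = lam if parallel else 1
    weight = 1.0 if parallel else float(lam)
    n_micro = int(round(dt_macro / (lam * dt_micro)))
    x = np.zeros(n_traj)
    y = np.zeros((n_traj, n_rep))
    for _ in range(n_macro):
        integral = np.zeros(n_traj)
        for _ in range(n_micro):
            integral += (-y).sum(axis=1) * dt_micro    # Riemann sum of f over replicas
            dW = np.sqrt(dt_micro) * rng.standard_normal(y.shape)
            y += -(y - x[:, None]) * dt_micro / eps + dW / np.sqrt(eps)
        x = x + weight * integral                      # macro-step (3.2) or (5.3)
    return x.var()

lam = 5
var_hmm = slow_variance(lam, parallel=False, seed=10)
var_phmm = slow_variance(lam, parallel=True, seed=11)
ratio = var_hmm / var_phmm
print(var_hmm, var_phmm, ratio)                        # ratio is expected to be near lam
```

For this linear toy model the HMM variance exceeds the PHMM variance by a factor close to \(\lambda \), in line with the linear growth of the HMM variance with \(\lambda \) reported in Sect. 7.1.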

### 7.1 Small fluctuations

*X* for different speedup factors \(\lambda \). It is clear that the spread of the invariant distribution increases with \(\lambda \). The profile remains Gaussian, but the variance is greatly inflated. In Fig. 2, we plot the variance of the stationary time series for *X* as a function of \(\lambda \). The blue line is computed using HMM, and the red line is computed using PHMM. As predicted by the theory in Sect. 4.2, in the case of HMM the variance increases linearly with \(\lambda \), and in the case of PHMM the variance is approximately constant. Note that in this example, the CLT captures the large deviations as well. This is because, to leading order in \(\varepsilon \), the fluctuations about the limiting behavior can be captured at all times \(t>0\) by the SDE

*T* is

### 7.2 Large fluctuations

*O*(1) deviations not captured by the CLT, we will look at a fast–slow system which exhibits metastability. Hence, it is natural to take

In Fig. 3, we compare the mean first passage time for HMM and PHMM as a function of \(\lambda \). Even for \(\lambda = 2\), the distinction between the two methods is vast, with the mean first passage time for HMM rapidly dropping off and for PHMM staying approximately constant.

In Fig. 5, we plot the cumulative distribution function (CDF) for the first passage time, comparing that of the true fast–slow system with HMM (\(\lambda =5\)) and PHMM (\(\lambda =5\)). We see that the HMM first passage times are supported on a much faster timescale than that of the true fast–slow system. In contrast, the CDF of PHMM is practically indistinguishable from that of the true fast–slow system. Hence, PHMM is not just replicating the mean first passage time, but also the entire distribution of first passage times.

### 7.3 Asymmetric, non-diffusive fluctuations

## 8 Discussion

We have investigated HMM methods for fast–slow systems, in particular their ability (or lack thereof) to capture fluctuations, both small (CLT) and large (LDP). We found, both theoretically (Sect. 4) and numerically (Sect. 7), that the amplitude of fluctuations is enhanced by an HMM-type method. In particular, with an HMM speedup factor \(\lambda \), the variance of the Gaussian fluctuations about the average in the CLT is increased by a factor of \(\lambda \) as well. In the LDP, the quasi-potential is decreased by a factor of \(\lambda \), so that first passage times are supported on an exponentially shorter timescale (the logarithm of the timescale is reduced by a factor of \(\lambda \)) than in the true fast–slow system. This inability to correctly capture fluctuations about the average suggests that HMM can be a poor approximation of fast–slow systems, particularly when metastable behavior is important. As noted in Sect. 4.3, although the fluctuations of HMM are enhanced, the large deviation transition *pathways* remain faithful to the true model. Thus, we stress that, typically, HMM is a reliable method of finding transition pathways in metastable systems, but not for simulating their dynamics.

We have introduced a simple modification of HMM, called parallel HMM (PHMM), which avoids these fluctuation issues. In particular, the PHMM method yields fluctuations that are consistent with the true fast–slow system for any speedup factor \(\lambda \) (provided that we still have \(\varepsilon \lambda \ll 1\)), as was shown both theoretically (Sect. 6) and numerically (Sect. 7). The HMM method relies on computing one short burst of the fast variables, and inferring the statistical behavior of the fast variables by extrapolating this short burst over a large time window. PHMM, on the other hand, computes an ensemble of \(\lambda \) short bursts and infers the statistics of the fast variables using the ensemble. Since the ensemble members are independent, they can be computed in parallel. Hence, if one has \(\lambda \) CPUs available, then the wall-clock time required in PHMM is identical to that in HMM.

*O*(1) timescale, but they either cannot be extended to longer timescales (in the case of the CLT) or lead to trivial predictions on these timescales (in the case of the LDP). To clarify this point, take, for example, the fast–slow Langevin system

### 8.1 Appendix 1: Non-constant speedup factor

### Dedication

Dedicated with admiration and friendship to Bjorn Engquist on the occasion of his 70th birthday.

## Declarations

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

## References

1. Abdulle, A., Weinan, E., Engquist, B., Vanden-Eijnden, E.: The heterogeneous multiscale method. Acta Numer. **21**, 1–87 (2012)
2. Ariel, G., Engquist, B., Kim, S., Lee, Y., Tsai, R.: A multiscale method for highly oscillatory dynamical systems using a Poincaré map type technique. J. Sci. Comput. **54**(2–3), 247–268 (2013)
3. Anderson, D.F., Ganguly, A., Kurtz, T.G.: Error analysis of tau-leap simulation methods. Ann. Appl. Probab. **21**(6), 2226–2262 (2011)
4. Ariel, G., Sanz-Serna, J., Tsai, R.: A multiscale technique for finding slow manifolds of stiff mechanical systems. Multiscale Model. Simul. **10**(4), 1180–1203 (2012)
5. Brünger, A., Brooks, C.L., Karplus, M.: Stochastic boundary conditions for molecular dynamics simulations of ST2 water. Chem. Phys. Lett. **105**(5), 495–500 (1984)
6. Bouchet, F., Grafke, T., Tangarife, T., Vanden-Eijnden, E.: Large deviations in fast–slow systems. J. Stat. Phys. **162**(4), 793–812 (2016)
7. Bal, G., Jing, W.: Corrector theory for MsFEM and HMM in random media. Multiscale Model. Simul. **9**, 1549–1587 (2011)
8. Bal, G., Jing, W.: Corrector analysis of a heterogeneous multi-scale scheme for elliptic equations with random potential. M2AN **48**(2), 387–409 (2014)
9. Chorin, A.: A numerical method for solving incompressible viscous flow problems. J. Comput. Phys. **2**, 12–26 (1967)
10. Car, R., Parrinello, M.: Unified approach for molecular dynamics and density functional theory. Phys. Rev. Lett. **55**(22), 2471–2475 (1985)
11. Del Moral, P., Garnier, J., et al.: Genealogical particle analysis of rare events. Ann. Appl. Probab. **15**(4), 2496–2534 (2005)
12. Dolgopyat, D.: Limit theorems for partially hyperbolic systems. Trans. Am. Math. Soc. **356**(4), 1637–1689 (2004)
13. Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications, vol. 38. Springer, Berlin (2009)
14. Fatkullin, I., Vanden-Eijnden, E.: A computational strategy for multiscale systems with applications to Lorenz 96 model. J. Comput. Phys. **200**(2), 605–638 (2004)
15. Freidlin, M.I., Wentzell, A.D.: Random Perturbations of Dynamical Systems, vol. 260. Springer, Berlin (2012)
16. Gillespie, D.T.: Approximate accelerated stochastic simulation of chemically reacting systems. J. Chem. Phys. **115**(4), 1716–1733 (2000)
17. Kifer, Y.: Averaging in dynamical systems and large deviations. Invent. Math. **110**(1), 337–370 (1992)
18. Kelly, D., Melbourne, I.: Deterministic homogenization for fast-slow systems with chaotic noise. J. Funct. Anal. **272**(10), 4063–4102 (2017)
19. Kelly, D., Melbourne, I.: Smooth approximations of stochastic differential equations. Ann. Probab. **44**, 479–520 (2016)
20. Mattingly, J.C., Stuart, A.M., Higham, D.J.: Ergodicity for SDEs and approximations: locally Lipschitz vector fields and degenerate noise. Stoch. Process. Appl. **101**(2), 185–232 (2002)
21. Vanden-Eijnden, E.: Numerical techniques for multi-scale dynamical systems with stochastic effects. Commun. Math. Sci. **1**(2), 385–391 (2003)
22. Vanden-Eijnden, E.: On HMM-like integrators and projective integration methods for systems with multiple time scales. Commun. Math. Sci. **5**(2), 495–505 (2007)
23. E, W., Engquist, B.: The heterogeneous multiscale methods. Commun. Math. Sci. **1**(1), 87–132 (2003)
24. E, W., Engquist, B., Li, X., Ren, W., Vanden-Eijnden, E.: Heterogeneous multiscale methods: a review. Commun. Comput. Phys. **2**(3), 367–450 (2007)
25. E, W., Liu, D., Vanden-Eijnden, E.: Analysis of multiscale methods for stochastic differential equations. Commun. Pure Appl. Math. **58**(11), 1544–1585 (2005)
26. Weinan, E., Ren, W., Vanden-Eijnden, E.: A general strategy for designing seamless multiscale methods. J. Comput. Phys. **228**(15), 5437–5453 (2009)