Modeling Shocks in COVID 19 with Stochastic Differential Equations


Coauthored with Chris Rackauckas

This lecture continues the analyzing of the COVID-19 pandemic established in this lecture.

As before, the model is inspired by

Here we extend the model to include policy-relevant aggregate shocks.

Continuous-Time Stochastic Processes

In continuous-time, there is an important distinction between randomness that leads to continuous paths vs. those which have (almost surely right-continuous) jumps in their paths. The most tractable of these includes the theory of Levy Processes.

Among the appealing features of Levy Processes is that they fit well into the sorts of Markov modeling techniques that economists tend to use in discrete time, and usually fulfill the measurability required for calculating expected present discounted values.

Unlike in discrete-time, where a modeller has license to be creative, the rules of continuous-time stochastic processes are much stricter. In practice, there are only two types of Levy Processes that can be used without careful measure theory.

  1. Weiner Processes (as known as Brownian Motion) which leads to a diffusion equations, and is the only continuous-time Levy process with continuous paths
  2. Poisson Processes with an arrival rate of jumps in the variable.

Every other Levy Process can be represented by these building blocks (e.g. a Diffusion Process such as Geometric Brownian Motion is a transformation of a Weiner process, a jump diffusion is a diffusion process with a Poisson arrival of jumps, and a continuous-time markov chain (CMTC) is a Poisson process jumping between a finite number of states).

In this lecture, we will examine shocks driven by transformations of Brownian motion, as the prototypical Stochastic Differential Equation (SDE).


In [1]:
using InstantiateFromURL
# optionally add arguments to force installation: instantiate = true, precompile = true
github_project("QuantEcon/quantecon-notebooks-julia", version = "0.8.0")
In [2]:
using LinearAlgebra, Statistics, Random, SparseArrays

In addition, we will be exploring packages within the SciML ecosystem and others covered in previous lectures

In [3]:
using OrdinaryDiffEq, StochasticDiffEq
using Parameters, Plots

The Basic SIR/SIRD Model

To demonstrate another common compartmentalized model we will change the previous SEIR model to remove the exposed state, and more carefully manage the death state, D.

The states are are now: susceptible (S), infected (I), resistant (R), or dead (D).


  • Unlike the previous SEIR model, the R state is only for those recovered, alive, and currently resistant.
  • As before, we start by assuming those have recovered have acquired immunity.
  • Later, we could consider transitions from R to S if resistance is not permanent due to virus mutation, etc.

Transition Rates

See the previous lecture, for a more detailed development of the model.

  • $ \beta(t) $ is called the transmission rate or effective contact rate (the rate at which individuals bump into others and expose them to the virus)
  • $ \gamma $ is called the resolution rate (the rate at which infected people recover or die)
  • $ \delta(t) \in [0, 1] $ is the death probability
  • As before, we re-parameterize as $ R_0(t) := \beta(t) / \gamma $, where $ R_0 $ has previous interpretation

Jumping directly to the equations in $ s, i, r, d $ already normalized by $ N $,

$$ \begin{aligned} d s & = - \gamma \, R_0 \, s \, i \, dt \\ d i & = \left(\gamma \, R_0 \, s \, i - \gamma \, i \right) dt \\ d r & = (1-\delta) \gamma \, i \, dt \\ d d & = \delta \, \gamma \, i \, dt \\ \end{aligned} \tag{1} $$

Note that the notation has changed to heuristically put the $ dt $ on the right hand side, which will be used when adding the stochastic shocks.

Introduction to SDEs

We start by extending our model to include randomness in $ R_0(t) $ and then the mortality rate $ \delta(t) $.

The result is a system of Stochastic Differential Equations (SDEs).

Shocks to Transmission Rates

As before, we assume that the basic reproduction number, $ R_0(t) $, follows a process with a reversion to a value $ \bar{R}_0(t) $ which could conceivably be influenced by policy. The intuition is that even if the targeted $ \bar{R}_0(t) $ was changed through social distancing/etc., lags in behavior and implementation would smooth out the transition, where $ \eta $ governs the speed of $ R_0(t) $ moves towards $ \bar{R}_0(t) $.

Beyond changes in policy, randomness in $ R_0(t) $ may come from shocks to the $ \beta(t) $ process. For example,

  • Misinformation on Facebook spreading non-uniformly.
  • Large political rallies, elections, or protests.
  • Deviations in the implementation and timing of lockdown policy between demographics, locations, or businesses within the system.
  • Aggregate shocks in opening/closing industries.

To implement these sorts of randomness, we will add on a diffusion term with an instantaneous volatility of $ \sigma \sqrt{R_0} $.

  • This equation is used in the Cox-Ingersoll-Ross and Heston models of interest rates and stochastic volatility.
  • The scaling by the $ \sqrt{R_0} $ ensure that the process stays weakly positive. The heuristic explanation is that the variance of the shocks converges to zero as R₀ goes to zero, enabling the upwards drift to dominate.
  • See here for a heuristic description of when the process is weakly and strictly positive.

The notation for this SDE is then

$$ \begin{aligned} d R_{0t} &= \eta (\bar{R}_{0t} - R_{0t}) dt + \sigma \sqrt{R_{0t}} dW_t\\ \end{aligned} \tag{2} $$

where $ W $ is standard Brownian motion (i.e a Weiner Process.

Heuristically, if $ \sigma = 0 $, divide this equation by $ dt $ and it nests the original ODE used in the previous lecture.

While we do not consider any calibration for the $ \sigma $ parameter, empirical studies such as Estimating and Simulating a SIRD Model of COVID-19 for Many Countries, States, and Cities (Figure 6) show highly volatile $ R_0(t) $ estimates over time.

Even after lockdowns are first implemented, we see variation between 0.5 and 1.5. Since countries are made of interconnecting cities with such variable contact rates, a high $ \sigma $ seems reasonable both intuitively and empirically.

Mortality Rates

Unlike the previous lecture, we will build up towards mortality rates which change over time.

Imperfect mixing of different demographic groups could lead to aggregate shocks in mortality (e.g. if a retirement home is afflicted vs. an elementary school). These sorts of relatively small changes might be best modeled as a continuous path process.

Let $ \delta(t) $ be the mortality rate and in addition,

  • Assume that the base mortality rate is $ \bar{\delta} $, which acts as the mean of the process, reverting at rate $ \theta $. In more elaborate models, this could be time-varying.
  • The diffusion term has a volatility $ \xi\sqrt{\delta (1 - \delta)} $.
  • As the process gets closer to either $ \delta = 1 $ or $ \delta = 0 $, the volatility goes to 0, which acts as a force to allow the mean reversion to keep the process within the bounds
  • Unlike the well-studied Cox-Ingersoll-Ross model, we make no claims on the long-run behavior of this process, but will be examining the behavior on a small timescale so this is not an issue.

Given this, the stochastic process for the mortality rate is,

$$ \begin{aligned} d \delta_t & = \theta (\bar{\delta} - \delta_t) dt + \xi \sqrt{(\delta_t (1-\delta_t)} d W_t\\ \end{aligned} \tag{3} $$

Where the $ W_t $ Brownian motion is independent from the previous process.

System of SDEs

The system (1) can be written in vector form $ x := [s, i, r, d, R₀, \delta] $ with parameter tuple parameter tuple $ p := (\gamma, \eta, \sigma, \theta, \xi, \bar{R}_0(\cdot), \bar{ \delta}) $

The general form of the SDE is.

$$ \begin{aligned} d x_t &= F(x_t,t;p)dt + G(x_t,t;p) dW \end{aligned} $$

With the drift,

$$ F(x,t;p) := \begin{bmatrix} -\gamma \, R_0 \, s \, i \\ \gamma \, R_0 \, s \, i - \gamma i \\ (1-\delta) \gamma i \\ \delta \gamma i \\ \eta (\bar{R}_0(t) - R_0) \\ \theta (\bar{\delta} - \delta) \\ \end{bmatrix} \tag{4} $$

Here, it is convenient but not necessary for $ d W $ to have the same dimension as $ x $. If so, then we can use a square matrix $ G(x,t;p) $ to associate the shocks with the appropriate $ x $ (e.g. diagonal noise, or using a covariance matrix).

As the two independent sources of Brownian motion only affect the $ d R_0 $ and $ d \delta $ terms (i.e. the 5th and 6th equations), define the covariance matrix as

$$ \begin{aligned} G(x, t) &:= diag\left(\begin{bmatrix} 0 & 0 & 0 & 0 & \sigma \sqrt{R_0} & \xi \sqrt{\delta (1-\delta)} \end{bmatrix}\right) \end{aligned} \tag{5} $$


First, construct our $ F $ from (4) and $ G $ from (5)

In [4]:
function F(x, p, t)
    s, i, r, d, R₀, δ = x
    @unpack γ, R̄₀, η, σ, ξ, θ, δ_bar = p

    return [-γ*R₀*s*i;        # ds/dt
            γ*R₀*s*i - γ*i;   # di/dt
            (1-δ)*γ*i;        # dr/dt
            δ*γ*i;            # dd/dt
            η*(R̄₀(t, p) - R₀);# dR₀/dt
            θ*(δ_bar - δ);    # dδ/dt

function G(x, p, t)
    s, i, r, d, R₀, δ = x
    @unpack γ, R̄₀, η, σ, ξ, θ, δ_bar = p

    return [0; 0; 0; 0; σ*sqrt(R₀); ξ*sqrt(δ * (1-δ))]
G (generic function with 1 method)

Next create a settings generator, and then define a SDEProblem with Diagonal Noise.

In [5]:
p_gen = @with_kw (T = 550.0, γ = 1.0 / 18, η = 1.0 / 20,
                  R₀_n = 1.6, R̄₀ = (t, p) -> p.R₀_n, δ_bar = 0.01,
                  σ = 0.03, ξ = 0.004, θ = 0.2, N = 3.3E8)
p =  p_gen()  # use all defaults
i_0 = 25000 / p.N
r_0 = 0.0
d_0 = 0.0
s_0 = 1.0 - i_0 - r_0 - d_0
R̄₀_0 = 0.5  # starting in lockdown
δ_0 = p.δ_bar
x_0 = [s_0, i_0, r_0, d_0, R̄₀_0, δ_0]

prob = SDEProblem(F, G, x_0, (0, p.T), p)
SDEProblem with uType Array{Float64,1} and tType Float64. In-place: false
timespan: (0.0, 550.0)
u0: [0.9999242424242424, 7.575757575757576e-5, 0.0, 0.0, 0.5, 0.01]

We solve the problem with the SOSRI algorithm (Adaptive strong order 1.5 methods for diagonal noise Ito and Stratonovich SDEs).

In [6]:
sol_1 = solve(prob, SOSRI());
@show length(sol_1.t);
length(sol_1.t) = 593

As in the deterministic case of the previous lecture, we are using an adaptive time-stepping method. However, since this is an SDE, (1) you will tend to see more timesteps required due to the greater curvature; and (2) the number of timesteps will change with different shock realizations.

With stochastic differential equations, a “solution” is akin to a simulation for a particular realization of the noise process.

If we take two solutions and plot the number of infections, we will see differences over time:

In [7]:
sol_2 = solve(prob, SOSRI())
plot(sol_1, vars=[2], title = "Number of Infections", label = "Trajectory 1",
     lm = 2, xlabel = "t", ylabel = "i(t)")
plot!(sol_2, vars=[2], label = "Trajectory 2", lm = 2, ylabel = "i(t)")

The same holds for other variables such as the cumulative deaths, mortality, and $ R_0 $:

In [8]:
plot_1 = plot(sol_1, vars=[4], title = "Cumulative Death Proportion", label = "Trajectory 1",
              lw = 2, xlabel = "t", ylabel = "d(t)", legend = :topleft)
plot!(plot_1, sol_2, vars=[4], label = "Trajectory 2", lw = 2)
plot_2 = plot(sol_1, vars=[3], title = "Cumulative Recovered Proportion", label = "Trajectory 1",
              lw = 2, xlabel = "t", ylabel = "d(t)", legend = :topleft)
plot!(plot_2, sol_2, vars=[3], label = "Trajectory 2", lw = 2)
plot_3 = plot(sol_1, vars=[5], title = "R_0 transition from lockdown", label = "Trajectory 1",
              lw = 2, xlabel = "t", ylabel = "R_0(t)")
plot!(plot_3, sol_2, vars=[5], label = "Trajectory 2", lw = 2)
plot_4 = plot(sol_1, vars=[6], title = "Mortality Rate", label = "Trajectory 1",
              lw = 2, xlabel = "t", ylabel = "delta(t)", ylim = (0.006, 0.014))
plot!(plot_4, sol_2, vars=[6], label = "Trajectory 2", lw = 2)
plot(plot_1, plot_2, plot_3, plot_4, size = (900, 600))

See here for comments on finding the appropriate SDE algorithm given the structure of $ F(x, t) $ and $ G(x, t) $

  • If $ G $ has diagonal noise (i.e. $ G(x, t) $ is a diagonal, and possibly a function of the state), then SOSRI is the typical choice.
  • If $ G $ has additive and diagonal noise (i.e. $ G(t) $ is a diagonal and independent from the state), then SOSRA is usually the best algorithm for even mildly stiff $ F $.
  • If adaptivity is not required, then EM (i.e. Euler-Maruyama method typically used by economists) is flexible in its ability to handle different noise processes.


While individual simulations are useful, you often want to look at an ensemble of trajectories of the SDE in order to get an accurate picture of how the system evolves.

To do this, use the EnsembleProblem in order to have the solution compute multiple trajectories at once. The returned EnsembleSolution acts like an array of solutions but is imbued to plot recipes to showcase aggregate quantities.

For example:

In [9]:
ensembleprob = EnsembleProblem(prob)
sol = solve(ensembleprob, SOSRI(), EnsembleSerial(), trajectories = 10)
plot(sol, vars = [2], title = "Infection Simulations", ylabel = "i(t)", xlabel = "t", lm = 2)

Or, more frequently, you may want to run many trajectories and plot quantiles, which can be automatically run in parallel using multiple threads, processes, or GPUs. Here we showcase EnsembleSummary which calculates summary information from an ensemble and plots the mean of the solution along with calculated quantiles of the simulation:

In [10]:
trajectories = 100  # choose larger for smoother quantiles
sol = solve(ensembleprob, SOSRI(), EnsembleThreads(), trajectories = trajectories)
summ = EnsembleSummary(sol) # defaults to saving 0.05, 0.95 quantiles
plot(summ, idxs = (2,), title = "Quantiles of Infections Ensemble", ylabel = "i(t)",
     xlabel = "t", labels = "Middle 95% Quantile", legend = :topright)

In addition, you can calculate more quantiles and stack graphs

In [11]:
sol = solve(ensembleprob, SOSRI(), EnsembleThreads(), trajectories = trajectories)
summ = EnsembleSummary(sol) # defaults to saving 0.05, 0.95 quantiles
summ2 = EnsembleSummary(sol, quantiles = (0.25, 0.75))

plot(summ, idxs = (2,4,5,6),
    title = ["Proportion Infected" "Proportion Dead" "R_0" "delta"],
    ylabel = ["i(t)" "d(t)" "R_0(t)" "delta(t)"], xlabel = "t",
    legend = [:topleft :topleft :bottomright :bottomright],
    labels = "Middle 95% Quantile", layout = (2, 2), size = (900, 600))
plot!(summ2, idxs = (2,4,5,6),
    labels = "Middle 50% Quantile", legend =  [:topleft :topleft :bottomright :bottomright])