45. Differentiating Models of Economic Dynamics#
45.1. Overview#
Many problems in macroeconomics involve simulating dynamic models — whether to explore quantitative counterfactuals and perform numerical experiments (e.g., impulse response and conditional moments) or to estimate structural parameters.
Typically these models are either analyzed as linearized systems, where derivatives and equations for estimation are available in closed form, or simulated as a black box with derivative-free optimizers used to calibrate and estimate parameters.
This lecture shows a third path: using automatic differentiation (AD) to differentiate the entire simulation, enabling counterfactual experiments and estimation with gradient-based methods (e.g., optimization or gradient-based samplers such as Hamiltonian Monte Carlo (HMC)).
Building on the linear state space models from linear models and the kalman filter, we use Enzyme.jl to differentiate simulations end-to-end. See auto-differentiation for background on in-place patterns and Enzyme wrappers.
Along the way, you will learn practical patterns for high-performance differentiable code:
- Writing allocation-free, in-place simulations that AD can differentiate — a pattern that also provides maximum performance for many non-differentiable use cases
- Choosing between forward mode (efficient for impulse responses and few-parameter sensitivities) and reverse mode (efficient for gradients of all structural model parameters with scalar objectives like loss functions)
- Using batch tangents to compute multiple derivatives in a single pass
- Verifying correctness with type-stability checks (@inferred), allocation profiling (@btime), and finite-difference testing (EnzymeTestUtils) — essential in practice for any high-performance code
- Calibrating economic models by plugging AD gradients into optimization routines (e.g., L-BFGS via Optimization.jl)
These techniques extend naturally to richer settings — nonlinear dynamics, heterogeneous agent models, and full structural estimation — wherever you need fast, exact gradients of simulated outcomes with respect to parameters.
Caution: The code in this lecture is significantly more advanced than in most other lectures, and requires some experience with auto-differentiation concepts as well as a more detailed understanding of type stability and memory management in Julia.
Enzyme.jl is under active development and, while state-of-the-art, it is often bleeding-edge. Some of the patterns shown here may change in future releases. See auto-differentiation for the latest best practices.
In practice, you may find an LLM valuable for navigating Enzyme.jl's often perplexing error messages. Compilation times can be long, and performance intuition is not always straightforward.
using Pkg; pkgs = ["BenchmarkTools", "Enzyme", "EnzymeTestUtils", "Optimization", "OptimizationOptimJL", "Plots"]; all(haskey.(Ref(Pkg.project().dependencies), pkgs)) || Pkg.add(pkgs)
using LinearAlgebra, Random, Test, Enzyme, Statistics
using Optimization, OptimizationOptimJL, EnzymeTestUtils
using BenchmarkTools
using Plots
45.2. Simulating a Linear State Space Model#
Take the following parameterization of a linear state space model:

\[
\begin{aligned}
x_{t+1} &= A x_t + C w_{t+1} \\
y_t &= G x_t + H v_t
\end{aligned}
\]

where \(w_{t+1}\) and \(v_t\) are i.i.d. standard normal shocks. States \(x_t \in \mathbb R^N\) and observations \(y_t \in \mathbb R^M\) are stored column-wise.
Note
The implementation below uses mul! in two forms: the 3-argument version mul!(Y, A, X) computes \(Y = A X\) in place, and the 5-argument version mul!(Y, A, X, α, β) computes \(Y = \alpha A X + \beta Y\) in place. With \(\alpha = \beta = 1\) this gives \(Y \mathrel{+}= A X\), which lets us accumulate the shock term without a temporary. These operations dispatch to highly optimized, allocation-free, BLAS routines that Julia calls automatically for dense arrays.
See the auto-differentiation lecture for more on efficient in-place operations and mul!.
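As a standalone illustration of the two forms (the _demo names here are just for this example):
A_demo = [1.0 2.0; 3.0 4.0]
x_demo = [1.0, 1.0]
y_demo = zeros(2)
mul!(y_demo, A_demo, x_demo) # y_demo = A_demo * x_demo -> [3.0, 7.0]
mul!(y_demo, A_demo, x_demo, 1.0, 1.0) # y_demo += A_demo * x_demo -> [6.0, 14.0]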
function simulate_lss!(x, y, model, x_0, w, v)
(; A, C, G, H) = model
N, T1 = size(x)
M, T1y = size(y)
T = size(w, 2)
@assert T1 == T + 1 == T1y
@assert size(v, 2) == T + 1
@assert size(A) == (N, N)
@assert size(G) == (M, N)
@assert size(C, 1) == N
@assert size(H, 1) == M
@assert length(x_0) == N
# Enzyme has challenges with activity analysis on copyto!/broadcasting assignments
@inbounds for i in 1:N
x[i, 1] = x_0[i]
end
# Apply evolution and observation equations
@inbounds for t in 1:T
@views mul!(x[:, t + 1], A, x[:, t]) # x_{t+1} = A x_t
@views mul!(x[:, t + 1], C, w[:, t], 1.0, 1.0) # + C w_{t+1}
@views mul!(y[:, t], G, x[:, t]) # y_t = G x_t
@views mul!(y[:, t], H, v[:, t], 1.0, 1.0) # + H v_t
end
# Apply observation equation at T+1
@views mul!(y[:, T + 1], G, x[:, T + 1])
@views mul!(y[:, T + 1], H, v[:, T + 1], 1.0, 1.0)
return nothing
end
simulate_lss! (generic function with 1 method)
Note
We use a manual loop instead of copyto! or broadcasting for the initial condition assignment because Enzyme has challenges with activity analysis on these operations. When x_0 is Const but x is Duplicated, copyto! triggers a “Detected potential need for runtime activity” error. The explicit loop avoids this issue. See the auto-differentiation lecture for details.
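To make the pattern concrete in isolation, here is a hypothetical helper showing the variants to avoid alongside the one used above (whether the commented-out lines actually error depends on the Enzyme version):
# init_state! is a hypothetical helper, for illustration only
function init_state!(x, x_0)
    # x[:, 1] .= x_0              # broadcasting assignment: may trigger runtime-activity errors
    # copyto!(view(x, :, 1), x_0) # copyto!: same issue when x_0 is Const
    @inbounds for i in eachindex(x_0)
        x[i, 1] = x_0[i] # explicit loop: safe for Enzyme's activity analysis
    end
    return nothing
end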
Crucially, this function modifies the preallocated x and y arrays in place without any allocations.
We can use this function to simulate from example matrices and a sequence of shocks.
Random.seed!(1234)
N, M, K, L = 3, 2, 2, 2
T = 10
A = [0.8 0.1 0.0
0.0 0.7 0.1
0.0 0.0 0.6]
C = 0.1 .* randn(N, K)
G = [1.0 0.0 0.0
0.0 1.0 0.3]
H = 0.05 .* randn(M, L)
model = (; A, C, G, H)
x_0 = randn(N)
w = randn(K, T)
v = randn(L, T + 1)
x = zeros(N, T + 1)
y = zeros(M, T + 1)
simulate_lss!(x, y, model, x_0, w, v)
time = 0:T
plot(time, x', lw = 2, xlabel = "t", ylabel = "state", label = ["x1" "x2" "x3"],
title = "State Paths")
plot(time, y', lw = 2, xlabel = "t", ylabel = "observation",
label = ["y1" "y2"], title = "Observation Paths")
45.2.1. Differentiating the Simulation#
Forward-mode in Enzyme is convenient for impulse-style effects: for example, here we perturb only \(w_1\), leaving everything else fixed, and can see the change in the \(x\) and \(y\) paths.
# forward-mode on w[1]
x = zeros(N, T + 1)
y = zeros(M, T + 1)
dx = Enzyme.make_zero(x)
dx_0 = Enzyme.make_zero(x_0)
dy = Enzyme.make_zero(y)
dw = Enzyme.make_zero(w)
dw[1] = 1.0 # unit perturbation to first shock
autodiff(Forward,
simulate_lss!,
Duplicated(x, dx),
Duplicated(y, dy),
Const(model), # leaving model fixed
Duplicated(x_0, dx_0), # won't perturb
Duplicated(w, dw), # perturbing w
Const(v))
dx[:, 1:3] # early-state sensitivities (impulse response flavor)
3×3 Matrix{Float64}:
0.0 -0.0359729 -0.0179062
0.0 0.108721 0.0719087
0.0 -0.041959 -0.0251754
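Because the tangent arrays hold the full perturbed paths, we can plot the impulse responses directly, reusing the dx computed above:
plot(0:T, dx', lw = 2, xlabel = "t", ylabel = "sensitivity",
     label = ["dx1" "dx2" "dx3"], title = "State Sensitivities to w[1]")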
Batch tangents let us reuse one primal evaluation while seeding multiple partials. Below we differentiate with respect to two entries of \(A\) in one call; Enzyme accumulates the tangents into separate shadow arrays.
dx_batch = (Enzyme.make_zero(x), Enzyme.make_zero(x))
dy_batch = (Enzyme.make_zero(y), Enzyme.make_zero(y))
dmodels = (Enzyme.make_zero(model), Enzyme.make_zero(model))
dmodels[1].A[1] = 1.0
dmodels[2].A[2] = 1.0
autodiff(Forward,
simulate_lss!,
BatchDuplicated(x, dx_batch), # batch duplicated to match dmodels
BatchDuplicated(y, dy_batch),
BatchDuplicated(model, dmodels),
Const(x_0),
Const(w),
Const(v))
@show dy_batch[1], dy_batch[2];
(dy_batch[1], dy_batch[2]) = ([0.0 -0.8449577163759787 -1.2511430013904163 -1.3687125396869666 -1.2862527573807614 -1.127182293237571 -0.9065081617463366 -0.7031373354359588 -0.41070713205552856 -0.22893611820925336 0.0181107103754109; 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0], [0.0 0.0 -0.08449577163759787 -0.18426134028536015 -0.26585419216844874 -0.3147232102559902 -0.3330244765029502 -0.32376794972669876 -0.296951298352285 -0.24893662205215233 -0.19714924725743194; 0.0 -0.8449577163759787 -1.1666472297528183 -1.1844511994016063 -1.0203985652123122 -0.8124590829815805 -0.5734836852433861 -0.37936938570925977 -0.1137558337032433 0.02000050384289924 0.21525995763284306])
Note that in these cases we are able to see the effect of the perturbation on all calculated quantities through the simulation, and not simply a single scalar output.
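As a quick check on the batch semantics, each shadow should match the corresponding single-seed forward call; a sketch comparing the first batch shadow against its non-batched equivalent:
dy_single = Enzyme.make_zero(y)
dmodel_single = Enzyme.make_zero(model)
dmodel_single.A[1] = 1.0 # same seed as dmodels[1]
autodiff(Forward,
         simulate_lss!,
         Duplicated(x, Enzyme.make_zero(x)),
         Duplicated(y, dy_single),
         Duplicated(model, dmodel_single),
         Const(x_0),
         Const(w),
         Const(v))
@test dy_single ≈ dy_batch[1]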
45.2.2. Differentiating Functions of Simulation#
Often we care about scalar functions of the simulated paths. For example, the time average of the first observable is

\[
\bar y_1 = \frac{1}{T+1} \sum_{t=0}^{T} y_{1,t}
\]

Reverse mode gives the gradient with respect to all shocks in one sweep while holding parameters fixed.
function mean_first_observation(y)
return mean(@view y[1, :]) # view to avoid allocation
end
function g(x, y, model, x_0, w, v)
simulate_lss!(x, y, model, x_0, w, v)
return mean_first_observation(y)
end
x_rev = zeros(N, T + 1)
y_rev = zeros(M, T + 1)
dx_rev = Enzyme.make_zero(x_rev)
dy_rev = Enzyme.make_zero(y_rev)
dw_rev = Enzyme.make_zero(w)
dv_rev = Enzyme.make_zero(v)
autodiff(Reverse,
g,
Duplicated(x_rev, dx_rev), # output/buffer
Duplicated(y_rev, dy_rev), # output/buffer
Const(model),
Const(x_0),
Duplicated(w, dw_rev), # active shocks
Duplicated(v, dv_rev))
@show g(x, y, model, x_0, w, v)
dw_rev # sensitivity wrt evolution shock
g(x, y, model, x_0, w, v) = -0.14416898118888172
2×10 Matrix{Float64}:
-0.00339415 -0.00376167 -0.00416933 … -0.0048981 -0.00327026
0.0323812 0.0313436 0.0300423 0.012146 0.00653554
These examples mirror the larger workflow: write allocation-free, in-place simulations, seed tangents with Duplicated, and use forward or reverse mode depending on whether you want many outputs per input (forward) or many inputs to one scalar (reverse). In all cases, mutated arguments must be passed as Duplicated.
Functions which internally allocate buffers will need to ensure that those buffers have the appropriate element type (i.e., dual numbers when active). This can be achieved by propagating eltype from an active argument. The gradient convenience wrapper handles the Duplicated bookkeeping automatically, so the calling code is simpler than the manual autodiff pattern above.
function g2(model, x_0, w, v)
x = zeros(eltype(x_0), N, T + 1)
y = zeros(eltype(x_0), M, T + 1)
simulate_lss!(x, y, model, x_0, w, v)
return mean_first_observation(y)
end
g2(model, x_0, w, v)
gradient(Reverse, g2, model, x_0, w, v)
((A = [-0.7372205768312146 1.5196565304240475 -0.32150792461896827; -0.20446946438535596 0.3321736601370444 -0.05152341036087862; -0.037254605869844144 0.052907253636102144 -0.006038474038815886], C = [1.8213798041590914 1.9974233310121956; 0.3400772654166607 0.35938265277705433; 0.04757952126324298 0.05076566679947853], G = [-0.12103807771237807 0.4662756524329939 -0.1434392076888752; 0.0 0.0 0.0], H = [-0.22308685072343135 0.013864981851811503; 0.0 0.0]), [0.415500297309091, 0.1184618935, 0.023935839100000002], [-0.003394145246121948 -0.0037616670331938515 … -0.0048981016639946225 -0.003270262789304379; 0.032381219205478144 0.03134361649886693 … 0.012146023683800623 0.006535544886053993], [0.009339832073018348 0.009339832073018348 … 0.009339832073018348 0.009339832073018348; -0.0013859147963474192 -0.0013859147963474192 … -0.0013859147963474192 -0.0013859147963474192])
45.2.3. Calibration#
Reverse-mode in particular can be useful for calibration with simulated dynamics.
For example, suppose the \(A\) matrix is parameterized by a scalar \(a\) in its upper-left corner, and we want to calibrate \(a\) so that the time average of the first observable matches a target value \(y^* = 0\).
For some technical reasons discussed below, we add a second method of simulate_lss! that takes the model parameters as individual matrices rather than a named tuple. A version that works with the original named-tuple method is given in the next subsection.
function simulate_lss!(x, y, A, C, G, H, x_0, w, v)
N, T1 = size(x)
M, T1y = size(y)
T = size(w, 2)
@assert T1 == T + 1 == T1y
@assert size(v, 2) == T + 1
@assert size(A) == (N, N)
@assert size(G) == (M, N)
@assert size(C, 1) == N
@assert size(H, 1) == M
@assert length(x_0) == N
# Enzyme has challenges with activity analysis on broadcasting assignments
@inbounds for i in 1:N
x[i, 1] = x_0[i]
end
# Apply evolution and observation equations
@inbounds for t in 1:T
@views mul!(x[:, t + 1], A, x[:, t]) # x_{t+1} = A x_t
@views mul!(x[:, t + 1], C, w[:, t], 1.0, 1.0) # + C w_{t+1}
@views mul!(y[:, t], G, x[:, t]) # y_t = G x_t
@views mul!(y[:, t], H, v[:, t], 1.0, 1.0) # + H v_t
end
# Apply observation equation at T+1
@views mul!(y[:, T + 1], G, x[:, T + 1])
@views mul!(y[:, T + 1], H, v[:, T + 1], 1.0, 1.0)
return nothing
end
function parameterized_A(a)
return [a 0.1 0.0; 0.0 0.7 0.1; 0.0 0.0 0.6]
end
function loss(u, p)
(; x_0, w, v, y_target, C, G, H) = p
T = size(w, 2)
a = u[1]
A = parameterized_A(a)
# Allocate buffers and simulate
x = zeros(eltype(A), length(x_0), T + 1)
y = zeros(eltype(A), size(G, 1), T + 1)
simulate_lss!(x, y, A, C, G, H, x_0, w, v)
y_mean = mean_first_observation(y)
return (y_mean - y_target)^2
end
loss (generic function with 1 method)
There are a few tricks to note here, which work around challenges with using Enzyme in its current state.
- With the SciML packages, such as Optimization.jl, AutoEnzyme() automatically determines which variables to mark as Active following certain patterns. In particular, p holds parameters assumed to be constant during optimization, while u holds the optimization variables.
- Enzyme generally avoids allocations, but allocation is necessary here since we need to create a new model with the appropriate a value.
- For the buffers, note the use of eltype(A) to ensure the correct type, since the A matrix itself will be differentiable.
Using this setup, we create the p parameter named tuple, all of whose entries are treated as constants, and then use the LBFGS optimizer from OptimizationOptimJL to find the optimal a value.
y_target = 0.0
# Bundle parameters into a named tuple
p = (; x_0, w, v, y_target, C, G, H)
# Initial Guess for 'a'
u_0 = [0.8]
println("Initial a=$(u_0[1]), Loss: $(loss(u_0, p))")
# Define the problem type, associate initial condition and parameters
# AutoEnzyme() will handle the AD setup
optf = OptimizationFunction(loss, AutoEnzyme())
prob = OptimizationProblem(optf, u_0, p)
sol = solve(prob, OptimizationOptimJL.LBFGS())
println("Final a=$(sol.u[1]), Loss: $(loss(sol.u, p))")
Initial a=0.8, Loss: 0.02078469513704013
Final a=-0.37935758108295203, Loss: 0.0006086293962801088
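As a sanity check on what AutoEnzyme() set up internally, we can compute the same gradient with a manual autodiff call, marking u as active and p as constant; at a local minimum the derivative should be close to zero:
u_check = copy(sol.u)
du = Enzyme.make_zero(u_check)
autodiff(Reverse, loss, Active, Duplicated(u_check, du), Const(p))
@show du;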
45.2.4. Calibration with More Complicated Types#
The previous calibration used the individual-matrix method of simulate_lss! to sidestep a subtlety: when some fields of a named tuple are active (depend on the optimization variable) and others are constant, Enzyme needs help distinguishing them. The following version shows how to use the original named-tuple method by “laundering” the constant matrices through copy, so that Enzyme sees all fields as local variables with consistent activity.
function loss_2(u, p)
# Unpack constants
(; x_0, w, v, y_target, C, G, H) = p
T = size(w, 2)
a = u[1]
A = parameterized_A(a) # A is Active (depends on u)
# Trick: "Launder" the constants by copying them.
# Creates new locals that Enzyme sees as "Local Active Variables"
# Now build the struct with homogeneous (all local) variables
model = (; A, C = copy(C), G = copy(G), H = copy(H))
# Allocate buffers and simulate
x = zeros(eltype(A), length(x_0), T + 1)
y = zeros(eltype(A), size(G, 1), T + 1)
simulate_lss!(x, y, model, x_0, w, v)
y_mean = mean_first_observation(y)
return (y_mean - y_target)^2
end
loss_2 (generic function with 1 method)
This version uses our original simulate_lss! function, which takes a named tuple for the model parameters. Note that it contains a trick where the constant parameters C, G, and H are “laundered” by copying them into new local variables before building the named tuple. This ensures that Enzyme can properly analyze which variables are active versus constant when creating the named tuple.
The simulate_lss! approach that splits out the A matrix might be more efficient for large \(C, G, H\) since it avoids the copies, but it may not matter in practice.
The other code is identical with this new loss.
y_target = 0.0
# Bundle parameters into a named tuple
p = (; x_0, w, v, y_target, C, G, H)
# Initial Guess for 'a'
u_0 = [0.8]
println("Initial a=$(u_0[1]), Loss: $(loss_2(u_0, p))")
# Define the problem type, associate initial condition and parameters
# AutoEnzyme() will handle the AD setup
optf = OptimizationFunction(loss_2, AutoEnzyme())
prob = OptimizationProblem(optf, u_0, p)
sol = solve(prob, OptimizationOptimJL.LBFGS())
println("Final a=$(sol.u[1]), Loss: $(loss_2(sol.u, p))")
Initial a=0.8, Loss: 0.02078469513704013
Final a=-0.37935758108295203, Loss: 0.0006086293962801088
45.3. Verifying High-Performance Differentiable Code#
For any high-performance differentiable simulation, it is important to verify that the code is type-stable, allocation-free, and that the AD produces correct gradients. This section collects the key checks in one place.
45.3.1. Type Stability#
Returning to the simulate_lss! function from Simulating a Linear State Space Model, we first verify that the code is type-stable. The following call is silent, which indicates there are no type-stability issues.
@inferred simulate_lss!(x, y, model, x_0, w, v)
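Since @inferred throws if inference fails, the same check can be embedded in a test suite:
@test isnothing(@inferred simulate_lss!(x, y, model, x_0, w, v))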
45.3.2. Allocation Checking#
Next we check that the simulation does not allocate any memory during execution.
using BenchmarkTools
function count_allocs()
return simulate_lss!(x, y, model, x_0, w, v)
end
@btime count_allocs()
1.520 μs (0 allocations: 0 bytes)
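For automated regression testing, BenchmarkTools provides @ballocated, which returns the number of bytes allocated and can be asserted directly (remember to interpolate globals with $):
@test @ballocated(simulate_lss!($x, $y, $model, $x_0, $w, $v)) == 0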
45.3.3. AD Correctness with Finite Differences#
For complicated functions such as simulations, we cannot assume that Enzyme (or any AD system) will necessarily produce correct derivatives. The EnzymeTestUtils package provides utilities that automatically check AD results against finite-difference approximations using the appropriate seeding.
45.3.3.1. Forward Mode#
Returning to the forward-mode differentiation from Differentiating the Simulation, we verify the tangents are correct. When using in-place functions that mutate their arguments, test_forward requires that the mutated arguments are returned as output, which can be done with a small wrapper.
function test_forward_simulate_lss!(x, y, model, x_0, w, v)
simulate_lss!(x, y, model, x_0, w, v)
return x, y
end
test_forward(test_forward_simulate_lss!,
Duplicated,
(x, Duplicated),
(y, Duplicated),
(model, Const),
(x_0, Duplicated),
(w, Const),
(v, Const))
Test Summary: | Pass Total Time
test_forward: test_forward_simulate_lss! with return activity Duplicated on (::Matrix{Float64}, Duplicated), (::Matrix{Float64}, Duplicated), (::@NamedTuple{A::Matrix{Float64}, C::Matrix{Float64}, G::Matrix{Float64}, H::Matrix{Float64}}, Const), (::Vector{Float64}, Duplicated), (::Matrix{Float64}, Const), (::Matrix{Float64}, Const) | 18 18 3.8s
Test.DefaultTestSet("test_forward: test_forward_simulate_lss! with return activity Duplicated on (::Matrix{Float64}, Duplicated), (::Matrix{Float64}, Duplicated), (::@NamedTuple{A::Matrix{Float64}, C::Matrix{Float64}, G::Matrix{Float64}, H::Matrix{Float64}}, Const), (::Vector{Float64}, Duplicated), (::Matrix{Float64}, Const), (::Matrix{Float64}, Const)", Any[], 18, false, false, true, 1.771893872300886e9, 1.77189387610052e9, false, "/home/runner/.julia/packages/EnzymeTestUtils/yGBt1/src/test_forward.jl", Xoshiro(0x9951797c85a704f1, 0xb9d66be14dfba82b, 0xb170153285fd9556, 0xe90a07f7bdd1fd77, 0x9d4b5ee33e4bd661))
45.3.3.2. Reverse Mode#
Unlike test_forward, the test_reverse utility requires a scalar-valued function. We can apply it to the g function defined in the reverse-mode section above.
x_rev = zeros(N, T + 1)
y_rev = zeros(M, T + 1)
test_reverse(g,
Active,
(x_rev, Duplicated),
(y_rev, Duplicated),
(model, Const),
(x_0, Const),
(w, Duplicated),
(v, Duplicated))
# Or with a different set of active arguments
test_reverse(g,
Active,
(x_rev, Duplicated),
(y_rev, Duplicated),
(model, Duplicated),
(x_0, Duplicated),
(w, Const),
(v, Const))
Test Summary: | Pass Total Time
test_reverse: g with return activity Active on (::Matrix{Float64}, Duplicated), (::Matrix{Float64}, Duplicated), (::@NamedTuple{A::Matrix{Float64}, C::Matrix{Float64}, G::Matrix{Float64}, H::Matrix{Float64}}, Const), (::Vector{Float64}, Const), (::Matrix{Float64}, Duplicated), (::Matrix{Float64}, Duplicated) | 25 25 4.2s
Test Summary: | Pass Total Time
test_reverse: g with return activity Active on (::Matrix{Float64}, Duplicated), (::Matrix{Float64}, Duplicated), (::@NamedTuple{A::Matrix{Float64}, C::Matrix{Float64}, G::Matrix{Float64}, H::Matrix{Float64}}, Duplicated), (::Vector{Float64}, Duplicated), (::Matrix{Float64}, Const), (::Matrix{Float64}, Const) | 29 29 3.7s
Test.DefaultTestSet("test_reverse: g with return activity Active on (::Matrix{Float64}, Duplicated), (::Matrix{Float64}, Duplicated), (::@NamedTuple{A::Matrix{Float64}, C::Matrix{Float64}, G::Matrix{Float64}, H::Matrix{Float64}}, Duplicated), (::Vector{Float64}, Duplicated), (::Matrix{Float64}, Const), (::Matrix{Float64}, Const)", Any[], 29, false, false, true, 1.771893880742064e9, 1.771893884481176e9, false, "/home/runner/.julia/packages/EnzymeTestUtils/yGBt1/src/test_reverse.jl", Xoshiro(0x9951797c85a704f1, 0xb9d66be14dfba82b, 0xb170153285fd9556, 0xe90a07f7bdd1fd77, 0x9d4b5ee33e4bd661))