45. Differentiating Models of Economic Dynamics#
45.1. Overview#
Many problems in macroeconomics involve simulating dynamic models — whether to explore quantitative counterfactuals and perform numerical experiments (e.g., impulse response and conditional moments) or to estimate structural parameters.
Typically these models are either analyzed as linearized systems, where derivatives and equations for estimation are available in closed form, or simulated as a black box with derivative-free optimizers used to calibrate and estimate parameters.
This lecture shows a third path: using automatic differentiation (AD) to differentiate the entire simulation, enabling counterfactual experiments and estimation with gradient-based methods (e.g., optimization or gradient-based samplers such as Hamiltonian Monte Carlo (HMC)).
Building on the linear state space models from linear models and the kalman filter, we use Enzyme.jl to differentiate simulations end-to-end. See auto-differentiation for background on in-place patterns and Enzyme wrappers.
Along the way, you will learn practical patterns for high-performance differentiable code:
- Writing allocation-free, in-place simulations that AD can differentiate — a pattern that also provides maximum performance for many non-differentiable use cases
- Choosing between forward mode (efficient for impulse responses and few-parameter sensitivities) and reverse mode (efficient for gradients of all structural model parameters with scalar objectives like loss functions)
- Using batch tangents to compute multiple derivatives in a single pass
- Verifying correctness with type-stability checks (@inferred), allocation profiling (@btime), and finite-difference testing (EnzymeTestUtils) — essential in practice for any high-performance code
- Calibrating economic models by plugging AD gradients into optimization routines (e.g., L-BFGS via Optimization.jl)
These techniques extend naturally to richer settings — nonlinear dynamics, heterogeneous agent models, and full structural estimation — wherever you need fast, exact gradients of simulated outcomes with respect to parameters.
Caution: The code in this lecture is significantly more advanced than in most other lectures, and requires some experience with auto-differentiation concepts as well as a more detailed understanding of type stability and memory management in Julia.
Enzyme.jl is under active development and, while state-of-the-art, it is often bleeding-edge. Some of the patterns shown here may change in future releases. See auto-differentiation for the latest best practices.
In practice, you may find an LLM valuable for navigating Enzyme.jl's often perplexing error messages. Compilation times can be long, and performance intuition is not always straightforward.
using Pkg; pkgs = ["BenchmarkTools", "Enzyme", "EnzymeTestUtils", "Optimization", "OptimizationOptimJL", "Plots"]; all(haskey.(Ref(Pkg.project().dependencies), pkgs)) || Pkg.add(pkgs)
using LinearAlgebra, Random, Test, Enzyme, Statistics
using Optimization, OptimizationOptimJL, EnzymeTestUtils
using BenchmarkTools
using Plots
45.2. Simulating a Linear State Space Model#
Take the following parameterization of a linear state space model:

\[
\begin{aligned}
x_{t+1} &= A x_t + C w_{t+1} \\
y_t &= G x_t + H v_t
\end{aligned}
\]

where \(w_{t+1}\) and \(v_t\) are i.i.d. standard normal shocks. States \(x_t \in \mathbb R^N\) and observations \(y_t \in \mathbb R^M\) are stored column-wise.
Note
The implementation below uses mul! in two forms: the 3-argument version mul!(Y, A, X) computes \(Y = A X\) in place, and the 5-argument version mul!(Y, A, X, α, β) computes \(Y = \alpha A X + \beta Y\) in place. With \(\alpha = \beta = 1\) this gives \(Y \mathrel{+}= A X\), which lets us accumulate the shock term without a temporary. These operations dispatch to highly optimized, allocation-free, BLAS routines that Julia calls automatically for dense arrays.
See the auto-differentiation lecture for more on efficient in-place operations and mul!.
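As a standalone illustration of the two forms (the _demo names here are just for this example):
A_demo = [1.0 2.0; 3.0 4.0]
x_demo = [1.0, 1.0]
y_demo = zeros(2)
mul!(y_demo, A_demo, x_demo) # y_demo = A_demo * x_demo -> [3.0, 7.0]
mul!(y_demo, A_demo, x_demo, 1.0, 1.0) # y_demo += A_demo * x_demo -> [6.0, 14.0]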
function simulate_lss!(x, y, model, x_0, w, v)
(; A, C, G, H) = model
N, T1 = size(x)
M, T1y = size(y)
T = size(w, 2)
@assert T1 == T + 1 == T1y
@assert size(v, 2) == T + 1
@assert size(A) == (N, N)
@assert size(G) == (M, N)
@assert size(C, 1) == N
@assert size(H, 1) == M
@assert length(x_0) == N
# Enzyme has challenges with activity analysis on copyto!/broadcasting assignments
@inbounds for i in 1:N
x[i, 1] = x_0[i]
end
# Apply evolution and observation equations
@inbounds for t in 1:T
@views mul!(x[:, t + 1], A, x[:, t]) # x_{t+1} = A x_t
@views mul!(x[:, t + 1], C, w[:, t], 1.0, 1.0) # + C w_{t+1}
@views mul!(y[:, t], G, x[:, t]) # y_t = G x_t
@views mul!(y[:, t], H, v[:, t], 1.0, 1.0) # + H v_t
end
# Apply observation equation at T+1
@views mul!(y[:, T + 1], G, x[:, T + 1])
@views mul!(y[:, T + 1], H, v[:, T + 1], 1.0, 1.0)
return nothing
end
simulate_lss! (generic function with 1 method)
Note
We use a manual loop instead of copyto! or broadcasting for the initial condition assignment because Enzyme has challenges with activity analysis on these operations. When x_0 is Const but x is Duplicated, copyto! triggers a “Detected potential need for runtime activity” error. The explicit loop avoids this issue. See the auto-differentiation lecture for details.
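To make the pattern concrete in isolation, here is a hypothetical helper showing the variants to avoid alongside the one used above (whether the commented-out lines actually error depends on the Enzyme version):
# init_state! is a hypothetical helper, for illustration only
function init_state!(x, x_0)
    # x[:, 1] .= x_0              # broadcasting assignment: may trigger runtime-activity errors
    # copyto!(view(x, :, 1), x_0) # copyto!: same issue when x_0 is Const
    @inbounds for i in eachindex(x_0)
        x[i, 1] = x_0[i] # explicit loop: safe for Enzyme's activity analysis
    end
    return nothing
end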
Crucially, this function modifies the preallocated x and y arrays in place without any allocations.
We can use this function to simulate from example matrices and a sequence of shocks.
Random.seed!(1234)
N, M, K, L = 3, 2, 2, 2
T = 10
A = [0.8 0.1 0.0
0.0 0.7 0.1
0.0 0.0 0.6]
C = 0.1 .* randn(N, K)
G = [1.0 0.0 0.0
0.0 1.0 0.3]
H = 0.05 .* randn(M, L)
model = (; A, C, G, H)
x_0 = randn(N)
w = randn(K, T)
v = randn(L, T + 1)
x = zeros(N, T + 1)
y = zeros(M, T + 1)
simulate_lss!(x, y, model, x_0, w, v)
time = 0:T
plot(time, x', lw = 2, xlabel = "t", ylabel = "state", label = ["x1" "x2" "x3"],
title = "State Paths")
plot(time, y', lw = 2, xlabel = "t", ylabel = "observation",
label = ["y1" "y2"], title = "Observation Paths")
45.2.1. Differentiating the Simulation#
Forward-mode in Enzyme is convenient for impulse-style effects: for example, here we perturb only \(w_1\), leaving everything else fixed, and can see the change in the \(x\) and \(y\) paths.
# forward-mode on w[1]
x = zeros(N, T + 1)
y = zeros(M, T + 1)
dx = Enzyme.make_zero(x)
dx_0 = Enzyme.make_zero(x_0)
dy = Enzyme.make_zero(y)
dw = Enzyme.make_zero(w)
dw[1] = 1.0 # unit perturbation to first shock
autodiff(Forward,
simulate_lss!,
Duplicated(x, dx),
Duplicated(y, dy),
Const(model), # leaving model fixed
Duplicated(x_0, dx_0), # won't perturb
Duplicated(w, dw), # perturbing w
Const(v))
dx[:, 1:3] # early-state sensitivities (impulse response flavor)
3×3 Matrix{Float64}:
0.0 -0.0359729 -0.0179062
0.0 0.108721 0.0719087
0.0 -0.041959 -0.0251754
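Because the tangent arrays hold the full perturbed paths, we can plot the impulse responses directly, reusing the dx computed above:
plot(0:T, dx', lw = 2, xlabel = "t", ylabel = "sensitivity",
     label = ["dx1" "dx2" "dx3"], title = "State Sensitivities to w[1]")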
Batch tangents let us reuse one primal evaluation while seeding multiple partials. Below we differentiate with respect to two entries of \(A\) in one call; Enzyme accumulates the tangents into separate shadow arrays.
dx_batch = (Enzyme.make_zero(x), Enzyme.make_zero(x))
dy_batch = (Enzyme.make_zero(y), Enzyme.make_zero(y))
dmodels = (Enzyme.make_zero(model), Enzyme.make_zero(model))
dmodels[1].A[1] = 1.0
dmodels[2].A[2] = 1.0
autodiff(Forward,
simulate_lss!,
BatchDuplicated(x, dx_batch), # batch duplicated to match dmodels
BatchDuplicated(y, dy_batch),
BatchDuplicated(model, dmodels),
Const(x_0),
Const(w),
Const(v))
@show dy_batch[1], dy_batch[2];
(dy_batch[1], dy_batch[2]) = ([0.0 -0.8449577163759787 -1.2511430013904163 -1.3687125396869666 -1.2862527573807614 -1.127182293237571 -0.9065081617463366 -0.7031373354359588 -0.41070713205552856 -0.22893611820925336 0.0181107103754109; 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0], [0.0 0.0 -0.08449577163759787 -0.18426134028536015 -0.26585419216844874 -0.3147232102559902 -0.3330244765029502 -0.32376794972669876 -0.296951298352285 -0.24893662205215233 -0.19714924725743194; 0.0 -0.8449577163759787 -1.1666472297528183 -1.1844511994016063 -1.0203985652123122 -0.8124590829815805 -0.5734836852433861 -0.37936938570925977 -0.1137558337032433 0.02000050384289924 0.21525995763284306])
Note that in these cases we are able to see the effect of the perturbation on all calculated quantities through the simulation, and not simply a single scalar output.
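As a quick check on the batch semantics, each shadow should match the corresponding single-seed forward call; a sketch comparing the first batch shadow against its non-batched equivalent:
dy_single = Enzyme.make_zero(y)
dmodel_single = Enzyme.make_zero(model)
dmodel_single.A[1] = 1.0 # same seed as dmodels[1]
autodiff(Forward,
         simulate_lss!,
         Duplicated(x, Enzyme.make_zero(x)),
         Duplicated(y, dy_single),
         Duplicated(model, dmodel_single),
         Const(x_0),
         Const(w),
         Const(v))
@test dy_single ≈ dy_batch[1]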
45.2.2. Differentiating Functions of Simulation#
Often we care about scalar functions of the simulated paths. For example, the time average of the first observable is

\[
\bar y_1 = \frac{1}{T+1} \sum_{t=0}^{T} y_{1,t}
\]

Reverse mode gives the gradient with respect to all shocks in one sweep while holding parameters fixed.
function mean_first_observation(y)
return mean(@view y[1, :]) # view to avoid allocation
end
function g(x, y, model, x_0, w, v)
simulate_lss!(x, y, model, x_0, w, v)
return mean_first_observation(y)
end
x_rev = zeros(N, T + 1)
y_rev = zeros(M, T + 1)
dx_rev = Enzyme.make_zero(x_rev)
dy_rev = Enzyme.make_zero(y_rev)
dw_rev = Enzyme.make_zero(w)
dv_rev = Enzyme.make_zero(v)
autodiff(Reverse,
g,
Duplicated(x_rev, dx_rev), # output/buffer
Duplicated(y_rev, dy_rev), # output/buffer
Const(model),
Const(x_0),
Duplicated(w, dw_rev), # active shocks
Duplicated(v, dv_rev))
@show g(x, y, model, x_0, w, v)
dw_rev # sensitivity wrt evolution shock
g(x, y, model, x_0, w, v) = -0.14416898118888172
2×10 Matrix{Float64}:
-0.00339415 -0.00376167 -0.00416933 … -0.0048981 -0.00327026
0.0323812 0.0313436 0.0300423 0.012146 0.00653554
These examples mirror the larger workflow: write allocation-free, in-place simulations, seed tangents with Duplicated, and use forward or reverse mode depending on whether you want many outputs per input (forward) or many inputs to one scalar (reverse). In all cases, mutated arguments must be passed as Duplicated.
Functions which internally allocate buffers will need to ensure that those buffers have the appropriate element type (i.e., dual numbers when active). This can be achieved by propagating eltype from an active argument. The gradient convenience wrapper handles the Duplicated bookkeeping automatically, so the calling code is simpler than the manual autodiff pattern above.
function g2(model, x_0, w, v)
x = zeros(eltype(x_0), N, T + 1)
y = zeros(eltype(x_0), M, T + 1)
simulate_lss!(x, y, model, x_0, w, v)
return mean_first_observation(y)
end
g2(model, x_0, w, v)
gradient(Reverse, g2, model, x_0, w, v)
((A = [-0.7372205768312146 1.5196565304240475 -0.32150792461896827; -0.20446946438535596 0.3321736601370444 -0.05152341036087862; -0.037254605869844144 0.052907253636102144 -0.006038474038815886], C = [1.8213798041590914 1.9974233310121956; 0.3400772654166607 0.35938265277705433; 0.04757952126324298 0.05076566679947853], G = [-0.12103807771237807 0.4662756524329939 -0.1434392076888752; 0.0 0.0 0.0], H = [-0.22308685072343135 0.013864981851811503; 0.0 0.0]), [0.415500297309091, 0.1184618935, 0.023935839100000002], [-0.003394145246121948 -0.0037616670331938515 … -0.0048981016639946225 -0.003270262789304379; 0.032381219205478144 0.03134361649886693 … 0.012146023683800623 0.006535544886053993], [0.009339832073018348 0.009339832073018348 … 0.009339832073018348 0.009339832073018348; -0.0013859147963474192 -0.0013859147963474192 … -0.0013859147963474192 -0.0013859147963474192])
45.2.3. Calibration#
Reverse-mode in particular can be useful for calibration with simulated dynamics.
For example, suppose the \(A\) matrix is parameterized by a scalar \(a\) in its upper-left corner, and we want to calibrate \(a\) so that the time average of the first observable matches a target value \(y^* = 0\).
For some technical reasons discussed below, we add a second method of simulate_lss! that takes the model parameters as individual matrices rather than a named tuple. A version that works with the original named-tuple method is given in the next subsection.
function simulate_lss!(x, y, A, C, G, H, x_0, w, v)
N, T1 = size(x)
M, T1y = size(y)
T = size(w, 2)
@assert T1 == T + 1 == T1y
@assert size(v, 2) == T + 1
@assert size(A) == (N, N)
@assert size(G) == (M, N)
@assert size(C, 1) == N
@assert size(H, 1) == M
@assert length(x_0) == N
# Enzyme has challenges with activity analysis on broadcasting assignments
@inbounds for i in 1:N
x[i, 1] = x_0[i]
end
# Apply evolution and observation equations
@inbounds for t in 1:T
@views mul!(x[:, t + 1], A, x[:, t]) # x_{t+1} = A x_t
@views mul!(x[:, t + 1], C, w[:, t], 1.0, 1.0) # + C w_{t+1}
@views mul!(y[:, t], G, x[:, t]) # y_t = G x_t
@views mul!(y[:, t], H, v[:, t], 1.0, 1.0) # + H v_t
end
# Apply observation equation at T+1
@views mul!(y[:, T + 1], G, x[:, T + 1])
@views mul!(y[:, T + 1], H, v[:, T + 1], 1.0, 1.0)
return nothing
end
function parameterized_A(a)
return [a 0.1 0.0; 0.0 0.7 0.1; 0.0 0.0 0.6]
end
function loss(u, p)
(; x_0, w, v, y_target, C, G, H) = p
T = size(w, 2)
a = u[1]
A = parameterized_A(a)
# Allocate buffers and simulate
x = zeros(eltype(A), length(x_0), T + 1)
y = zeros(eltype(A), size(G, 1), T + 1)
simulate_lss!(x, y, A, C, G, H, x_0, w, v)
y_mean = mean_first_observation(y)
return (y_mean - y_target)^2
end
loss (generic function with 1 method)
There are a few tricks to note here, which work around challenges with using Enzyme in its current state.
- With the SciML packages, such as Optimization.jl, AutoEnzyme() automatically determines which variables to mark as Active following certain patterns. In particular, p holds parameters assumed to be constant during optimization, while u holds the optimization variables.
- Enzyme generally avoids allocations, but allocation is necessary here since we need to create a new model with the appropriate a value.
- For the buffers, note the use of eltype(A) to ensure the correct type, since the A matrix itself will be differentiable.
Using this setup, we create the p parameter named tuple, all of whose entries are treated as constants, and then use the LBFGS optimizer from OptimizationOptimJL to find the optimal a value.
y_target = 0.0
# Bundle parameters into a named tuple
p = (; x_0, w, v, y_target, C, G, H)
# Initial Guess for 'a'
u_0 = [0.8]
println("Initial a=$(u_0[1]), Loss: $(loss(u_0, p))")
# Define the problem type, associate initial condition and parameters
# AutoEnzyme() will handle the AD setup
optf = OptimizationFunction(loss, AutoEnzyme())
prob = OptimizationProblem(optf, u_0, p)
sol = solve(prob, OptimizationOptimJL.LBFGS())
println("Final a=$(sol.u[1]), Loss: $(loss(sol.u, p))")
Initial a=0.8, Loss: 0.02078469513704013
Final a=-0.37935758108295203, Loss: 0.0006086293962801088
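As a sanity check on what AutoEnzyme() set up internally, we can compute the same gradient with a manual autodiff call, marking u as active and p as constant; at a local minimum the derivative should be close to zero:
u_check = copy(sol.u)
du = Enzyme.make_zero(u_check)
autodiff(Reverse, loss, Active, Duplicated(u_check, du), Const(p))
@show du;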
45.2.4. Calibration with More Complicated Types#
The previous calibration used the individual-matrix method of simulate_lss! to sidestep a subtlety: when some fields of a named tuple are active (depend on the optimization variable) and others are constant, Enzyme needs help distinguishing them. The following version shows how to use the original named-tuple method by “laundering” the constant matrices through copy, so that Enzyme sees all fields as local variables with consistent activity.
function loss_2(u, p)
# Unpack constants
(; x_0, w, v, y_target, C, G, H) = p
T = size(w, 2)
a = u[1]
A = parameterized_A(a) # A is Active (depends on u)
# Trick: "Launder" the constants by copying them.
# Creates new locals that Enzyme sees as "Local Active Variables"
# Now build the struct with homogeneous (all local) variables
model = (; A, C = copy(C), G = copy(G), H = copy(H))
# Allocate buffers and simulate
x = zeros(eltype(A), length(x_0), T + 1)
y = zeros(eltype(A), size(G, 1), T + 1)
simulate_lss!(x, y, model, x_0, w, v)
y_mean = mean_first_observation(y)
return (y_mean - y_target)^2
end
loss_2 (generic function with 1 method)
This version uses our original simulate_lss! function, which takes a named tuple for the model parameters. Note that it contains a trick where the constant parameters C, G, and H are “laundered” by copying them into new local variables before building the named tuple. This ensures that Enzyme can properly analyze which variables are active versus constant when creating the named tuple.
The simulate_lss! approach that splits out the A matrix might be more efficient for large \(C, G, H\) since it avoids the copies, but it may not matter in practice.
The other code is identical with this new loss.
y_target = 0.0
# Bundle parameters into a named tuple
p = (; x_0, w, v, y_target, C, G, H)
# Initial Guess for 'a'
u_0 = [0.8]
println("Initial a=$(u_0[1]), Loss: $(loss_2(u_0, p))")
# Define the problem type, associate initial condition and parameters
# AutoEnzyme() will handle the AD setup
optf = OptimizationFunction(loss_2, AutoEnzyme())
prob = OptimizationProblem(optf, u_0, p)
sol = solve(prob, OptimizationOptimJL.LBFGS())
println("Final a=$(sol.u[1]), Loss: $(loss_2(sol.u, p))")
Initial a=0.8, Loss: 0.02078469513704013
Final a=-0.37935758108295203, Loss: 0.0006086293962801088
45.3. Verifying High-Performance Differentiable Code#
For any high-performance differentiable simulation, it is important to verify that the code is type-stable, allocation-free, and that the AD produces correct gradients. This section collects the key checks in one place.
45.3.1. Type Stability#
Returning to the simulate_lss! function from Simulating a Linear State Space Model, we first verify that the code is type-stable. The following call is silent, which indicates there are no type-stability issues.
@inferred simulate_lss!(x, y, model, x_0, w, v)
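Since @inferred throws if inference fails, the same check can be embedded in a test suite:
@test isnothing(@inferred simulate_lss!(x, y, model, x_0, w, v))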
45.3.2. Allocation Checking#
Next we check that the simulation does not allocate any memory during execution.
using BenchmarkTools
function count_allocs()
return simulate_lss!(x, y, model, x_0, w, v)
end
@btime count_allocs()
1.520 μs (0 allocations: 0 bytes)
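For automated regression testing, BenchmarkTools provides @ballocated, which returns the number of bytes allocated and can be asserted directly (remember to interpolate globals with $):
@test @ballocated(simulate_lss!($x, $y, $model, $x_0, $w, $v)) == 0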
45.3.3. AD Correctness with Finite Differences#
For complicated functions such as simulations, we cannot assume that Enzyme (or any AD system) will necessarily produce correct derivatives. The EnzymeTestUtils package provides utilities that automatically check AD results against finite-difference approximations using the appropriate seeding.
45.3.3.1. Forward Mode#
Returning to the forward-mode differentiation from Differentiating the Simulation, we verify the tangents are correct. When using in-place functions that mutate their arguments, test_forward requires that the mutated arguments are returned as output, which can be done with a small wrapper.
function test_forward_simulate_lss!(x, y, model, x_0, w, v)
simulate_lss!(x, y, model, x_0, w, v)
return x, y
end
test_forward(test_forward_simulate_lss!,
Duplicated,
(x, Duplicated),
(y, Duplicated),
(model, Const),
(x_0, Duplicated),
(w, Const),
(v, Const))
Test Summary: | Pass Total Time
test_forward: test_forward_simulate_lss! with return activity Duplicated on (::Matrix{Float64}, Duplicated), (::Matrix{Float64}, Duplicated), (::@NamedTuple{A::Matrix{Float64}, C::Matrix{Float64}, G::Matrix{Float64}, H::Matrix{Float64}}, Const), (::Vector{Float64}, Duplicated), (::Matrix{Float64}, Const), (::Matrix{Float64}, Const) | 18 18 3.8s
Test.DefaultTestSet("test_forward: test_forward_simulate_lss! with return activity Duplicated on (::Matrix{Float64}, Duplicated), (::Matrix{Float64}, Duplicated), (::@NamedTuple{A::Matrix{Float64}, C::Matrix{Float64}, G::Matrix{Float64}, H::Matrix{Float64}}, Const), (::Vector{Float64}, Duplicated), (::Matrix{Float64}, Const), (::Matrix{Float64}, Const)", Any[], 18, false, false, true, 1.771893872300886e9, 1.77189387610052e9, false, "/home/runner/.julia/packages/EnzymeTestUtils/yGBt1/src/test_forward.jl", Xoshiro(0x9951797c85a704f1, 0xb9d66be14dfba82b, 0xb170153285fd9556, 0xe90a07f7bdd1fd77, 0x9d4b5ee33e4bd661))
45.3.3.2. Reverse Mode#
Unlike test_forward, the test_reverse utility requires a scalar-valued function. We can apply it to the g function defined in the reverse-mode section above.
x_rev = zeros(N, T + 1)
y_rev = zeros(M, T + 1)
test_reverse(g,
Active,
(x_rev, Duplicated),
(y_rev, Duplicated),
(model, Const),
(x_0, Const),
(w, Duplicated),
(v, Duplicated))
# Or with a different set of active arguments
test_reverse(g,
Active,
(x_rev, Duplicated),
(y_rev, Duplicated),
(model, Duplicated),
(x_0, Duplicated),
(w, Const),
(v, Const))
Test Summary: | Pass Total Time
test_reverse: g with return activity Active on (::Matrix{Float64}, Duplicated), (::Matrix{Float64}, Duplicated), (::@NamedTuple{A::Matrix{Float64}, C::Matrix{Float64}, G::Matrix{Float64}, H::Matrix{Float64}}, Const), (::Vector{Float64}, Const), (::Matrix{Float64}, Duplicated), (::Matrix{Float64}, Duplicated) | 25 25 4.2s
Test Summary: | Pass Total Time
test_reverse: g with return activity Active on (::Matrix{Float64}, Duplicated), (::Matrix{Float64}, Duplicated), (::@NamedTuple{A::Matrix{Float64}, C::Matrix{Float64}, G::Matrix{Float64}, H::Matrix{Float64}}, Duplicated), (::Vector{Float64}, Duplicated), (::Matrix{Float64}, Const), (::Matrix{Float64}, Const) | 29 29 3.7s
Test.DefaultTestSet("test_reverse: g with return activity Active on (::Matrix{Float64}, Duplicated), (::Matrix{Float64}, Duplicated), (::@NamedTuple{A::Matrix{Float64}, C::Matrix{Float64}, G::Matrix{Float64}, H::Matrix{Float64}}, Duplicated), (::Vector{Float64}, Duplicated), (::Matrix{Float64}, Const), (::Matrix{Float64}, Const)", Any[], 29, false, false, true, 1.771893880742064e9, 1.771893884481176e9, false, "/home/runner/.julia/packages/EnzymeTestUtils/yGBt1/src/test_reverse.jl", Xoshiro(0x9951797c85a704f1, 0xb9d66be14dfba82b, 0xb170153285fd9556, 0xe90a07f7bdd1fd77, 0x9d4b5ee33e4bd661))