36. Optimal Growth III: The Endogenous Grid Method#

36.1. Overview#

We solved the stochastic optimal growth model using

  1. value function iteration

  2. Euler equation based time iteration

We found time iteration to be significantly more accurate at each step.

In this lecture we’ll look at an ingenious twist on the time iteration technique called the endogenous grid method (EGM).

EGM is a numerical method for implementing policy iteration invented by Chris Carroll.

It is a good example of how a clever algorithm can save a massive amount of computer time.

(Massive when we multiply saved CPU cycles on each implementation times the number of implementations worldwide)

The original reference is [Car06].

36.2. Key Idea#

Let’s start by reminding ourselves of the theory and then see how the numerics fit in.

36.2.1. Theory#

Take the model set out in the time iteration lecture, following the same terminology and notation.

The Euler equation is

(36.1)#\[(u'\circ c^*)(y) = \beta \int (u'\circ c^*)(f(y - c^*(y)) z) f'(y - c^*(y)) z \phi(dz)\]

As we saw, the Coleman operator is a nonlinear operator \(K\) engineered so that \(c^*\) is a fixed point of \(K\).

It takes as its argument a continuous strictly increasing consumption policy \(g \in \Sigma\).

It returns a new function \(Kg\), where \((Kg)(y)\) is the \(c \in (0, \infty)\) that solves

(36.2)#\[u'(c) = \beta \int (u' \circ g) (f(y - c) z ) f'(y - c) z \phi(dz)\]

36.2.2. Exogenous Grid#

As discussed in the lecture on time iteration, to implement the method on a computer we need numerical approximation.

In particular, we represent a policy function by a set of values on a finite grid.

The function itself is reconstructed from this representation when necessary, using interpolation or some other method.

Previously, to obtain a finite representation of an updated consumption policy we

  • fixed a grid of income points \(\{y_i\}\)

  • calculated the consumption value \(c_i\) corresponding to each \(y_i\) using (36.2) and a root finding routine

Each \(c_i\) is then interpreted as the value of the function \(K g\) at \(y_i\).

Thus, with the points \(\{y_i, c_i\}\) in hand, we can reconstruct \(Kg\) via approximation.

Iteration then continues…

36.2.3. Endogenous Grid#

The method discussed above requires a root finding routine to find the \(c_i\) corresponding to a given income value \(y_i\).

Root finding is costly because it typically involves a significant number of function evaluations.

As pointed out by Carroll [Car06], we can avoid this if \(y_i\) is chosen endogenously.

The only assumption required is that \(u'\) is invertible on \((0, \infty)\).

The idea is this:

First we fix an exogenous grid \(\{k_i\}\) for capital (\(k = y - c\)).

Then we obtain \(c_i\) via

(36.3)#\[c_i = (u')^{-1} \left\{ \beta \int (u' \circ g) (f(k_i) z ) \, f'(k_i) \, z \, \phi(dz) \right\}\]

where \((u')^{-1}\) is the inverse function of \(u'\).

Finally, for each \(c_i\) we set \(y_i = c_i + k_i\).

It is clear that each \((y_i, c_i)\) pair constructed in this manner satisfies (36.2).

With the points \(\{y_i, c_i\}\) in hand, we can reconstruct \(Kg\) via approximation as before.

The name EGM comes from the fact that the grid \(\{y_i\}\) is determined endogenously.

36.3. Implementation#

Let’s implement this version of the Coleman operator and see how it performs.

36.3.1. The Operator#

Here’s an implementation of \(K\) using EGM as described above.

using LinearAlgebra, Statistics
using BenchmarkTools, Interpolations, LaTeXStrings, Plots, Random, Roots
function coleman_egm(g, k_grid, beta, u_prime, u_prime_inv, f, f_prime, shocks)

    # Allocate memory for value of consumption on endogenous grid points
    c = similar(k_grid)

    # Solve for updated consumption value
    for (i, k) in enumerate(k_grid)
        vals = u_prime.(g.(f(k) * shocks)) .* f_prime(k) .* shocks
        c[i] = u_prime_inv(beta * mean(vals))
    end

    # Determine endogenous grid
    y = k_grid + c  # y_i = k_i + c_i

    # Update policy function and return
    Kg = LinearInterpolation(y, c, extrapolation_bc = Line())
    Kg_f(x) = Kg(x)
    return Kg_f
end
coleman_egm (generic function with 1 method)

Note the lack of any root finding algorithm.

We’ll also run our original implementation, which uses an exogenous grid and requires root finding, so we can perform some comparisons

function K!(Kg, g, grid, beta, u_prime, f, f_prime, shocks)

    # This function requires the container of the output value as argument Kg

    # Construct linear interpolation object #
    g_func = LinearInterpolation(grid, g, extrapolation_bc = Line())

    # solve for updated consumption value #
    for (i, y) in enumerate(grid)
        function h(c)
            vals = u_prime.(g_func.(f(y - c) * shocks)) .* f_prime(y - c) .* shocks
            return u_prime(c) - beta * mean(vals)
        end
        Kg[i] = find_zero(h, (1e-10, y - 1e-10))
    end
    return Kg
end

# The following function does NOT require the container of the output value as argument
function K(g, grid, beta, u_prime, f, f_prime, shocks)
    K!(similar(g), g, grid, beta, u_prime, f, f_prime, shocks)
end
K (generic function with 1 method)

Let’s test out the code above on some example parameterizations, after the following imports.

36.3.2. Testing on the Log / Cobb–Douglas case#

As we did for value function iteration and time iteration, let’s start by testing our method with the log-linear benchmark.

The first step is to bring in the model that we used in the Coleman policy function iteration

# model

function Model(; alpha = 0.65, # productivity parameter
               beta = 0.95, # discount factor
               gamma = 1.0,  # risk aversion
               mu = 0.0,  # lognorm(mu, sigma)
               s = 0.1,  # lognorm(mu, sigma)
               grid_min = 1e-6, # smallest grid point
               grid_max = 4.0,  # largest grid point
               grid_size = 200, # grid size
               u = gamma == 1 ? log : c -> (c^(1 - gamma) - 1) / (1 - gamma), # utility function
               u_prime = c -> c^(-gamma), # u'
               f = k -> k^alpha, # production function
               f_prime = k -> alpha * k^(alpha - 1), # f'
               grid = range(grid_min, grid_max, length = grid_size)) # grid
    return (; alpha, beta, gamma, mu, s, grid_min, grid_max, grid_size, u, u_prime,
            f, f_prime, grid)
end
Model (generic function with 1 method)

Next we generate an instance

mlog = Model(); # Log Linear model

We also need some shock draws for Monte Carlo integration

Random.seed!(42); # For reproducible behavior.

shock_size = 250     # Number of shock draws in Monte Carlo integral
shocks = exp.(mlog.mu .+ mlog.s * randn(shock_size));

As a preliminary test, let’s see if \(K c^* = c^*\), as implied by the theory

c_star(y) = (1 - mlog.alpha * mlog.beta) * y

# some useful constants
ab = mlog.alpha * mlog.beta
c1 = log(1 - ab) / (1 - mlog.beta)
c2 = (mlog.mu + mlog.alpha * log(ab)) / (1 - mlog.alpha)
c3 = 1 / (1 - mlog.beta)
c4 = 1 / (1 - ab)

v_star(y) = c1 + c2 * (c3 - c4) + c4 * log(y)
v_star (generic function with 1 method)
function verify_true_policy(m, shocks, c_star)
    k_grid = m.grid
    c_star_new = coleman_egm(c_star, k_grid, m.beta, m.u_prime, m.u_prime, m.f,
                             m.f_prime, shocks)

    plt = plot()
    plot!(plt, k_grid, c_star.(k_grid), lw = 2, label = L"optimal policy $c^*$")
    plot!(plt, k_grid, c_star_new.(k_grid), lw = 2, label = L"Kc^*")
    plot!(plt, legend = :topleft)
end
verify_true_policy (generic function with 1 method)
verify_true_policy(mlog, shocks, c_star)

Notice that we’re passing u_prime to coleman_egm twice.

The reason is that, in the case of log utility, \(u'(c) = (u')^{-1}(c) = 1/c\).

Hence u_prime and u_prime_inv are the same.

We can’t really distinguish the two plots.

In fact it’s easy to see that the difference is essentially zero:

c_star_new = coleman_egm(c_star, mlog.grid, mlog.beta, mlog.u_prime,
                         mlog.u_prime, mlog.f, mlog.f_prime, shocks)
maximum(abs(c_star_new(g) - c_star(g)) for g in mlog.grid)
4.440892098500626e-16

Next let’s try iterating from an arbitrary initial condition and see if we converge towards \(c^*\).

Let’s start from the consumption policy that eats the whole pie: \(c(y) = y\)

n = 15
function check_convergence(m, shocks, c_star, g_init, n_iter)
    k_grid = m.grid
    g = g_init
    plt = plot()
    plot!(plt, m.grid, g.(m.grid),
          color = RGBA(0, 0, 0, 1), lw = 2, alpha = 0.6,
          label = L"initial condition $c(y) = y$")
    for i in 1:n_iter
        new_g = coleman_egm(g, k_grid, m.beta, m.u_prime, m.u_prime, m.f, m.f_prime,
                            shocks)
        g = new_g
        plot!(plt, k_grid, new_g.(k_grid), alpha = 0.6,
              color = RGBA(0, 0, (i / n_iter), 1),
              lw = 2, label = "")
    end

    plot!(plt, k_grid, c_star.(k_grid),
          color = :black, lw = 2, alpha = 0.8,
          label = L"true policy function $c^*$")
    plot!(plt, legend = :topleft)
end
check_convergence (generic function with 1 method)
check_convergence(mlog, shocks, c_star, identity, n)

We see that the policy has converged nicely, in only a few steps.

36.4. Speed#

Now let’s compare the clock times per iteration for the standard Coleman operator (with exogenous grid) and the EGM version.

We’ll do so using the CRRA model adopted in the exercises of the Euler equation time iteration lecture.

Here’s the model and some convenient functions

mcrra = Model(alpha = 0.65, beta = 0.95, gamma = 1.5)
u_prime_inv(c) = c^(-1 / mcrra.gamma)
u_prime_inv (generic function with 1 method)

Here’s the result

crra_coleman(g, m, shocks) = K(g, m.grid, m.beta, m.u_prime, m.f, m.f_prime, shocks)
function crra_coleman_egm(g, m, shocks)
    coleman_egm(g, m.grid, m.beta, m.u_prime,
                u_prime_inv, m.f, m.f_prime, shocks)
end
function coleman(m = m, shocks = shocks; sim_length = 20)
    g = m.grid
    for i in 1:sim_length
        g = crra_coleman(g, m, shocks)
    end
    return g
end
function egm(m, g = identity, shocks = shocks; sim_length = 20)
    for i in 1:sim_length
        g = crra_coleman_egm(g, m, shocks)
    end
    return g.(m.grid)
end
egm (generic function with 3 methods)
@benchmark coleman($mcrra)
BenchmarkTools.Trial: 3 samples with 1 evaluation.
 Range (minmax):  1.988 s 2.000 s  ┊ GC (min … max): 0.79% … 0.76%
 Time  (median):     1.997 s             ┊ GC (median):    0.76%
 Time  (mean ± σ):   1.995 s ± 6.068 ms  ┊ GC (mean ± σ):  0.77% ± 0.02%

                                             █          █  
  ▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁█ ▁
  1.99 s        Histogram: frequency by time           2 s <

 Memory estimate: 944.17 MiB, allocs estimate: 468912.
@benchmark egm($mcrra)
BenchmarkTools.Trial: 80 samples with 1 evaluation.
 Range (minmax):  59.829 ms70.893 ms  ┊ GC (min … max): 0.00% … 1.82%
 Time  (median):     63.431 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   63.357 ms ±  1.954 ms  ┊ GC (mean ± σ):  0.87% ± 0.91%

       ▁  ▃▁▁     ▁    █▁     ▃                                
  ▄▁▁▄▄█▁▄███▆▄▇▆▆█▄▆▆██▄▆▇▇▆█▁▇▁▆▄▁▁▁▁▁▁▁▁▁▁▄▁▁▁▁▁▁▁▁▁▁▁▁▁▄ ▁
  59.8 ms         Histogram: frequency by time        70.5 ms <

 Memory estimate: 20.84 MiB, allocs estimate: 112444.

We see that the EGM version is about 30 times faster.

At the same time, the absence of numerical root finding means that it is typically more accurate at each step as well.