Estimate a simple model - Schorfheide (2000)
This tutorial is intended to show the workflow to estimate a model using the No-U-Turn sampler (NUTS). The tutorial works with a benchmark model for estimation and can therefore be compared to results from other software packages (e.g. dynare).
Define the model
The first step is always to name the model and write down the equations. For the Schorfheide (2000) model this would go as follows:
julia> using MacroModelling
julia> @model FS2000 begin dA[0] = exp(gam + z_e_a * e_a[x]) log(m[0]) = (1 - rho) * log(mst) + rho * log(m[-1]) + z_e_m * e_m[x] - P[0] / (c[1] * P[1] * m[0]) + bet * P[1] * (alp * exp( - alp * (gam + log(e[1]))) * k[0] ^ (alp - 1) * n[1] ^ (1 - alp) + (1 - del) * exp( - (gam + log(e[1])))) / (c[2] * P[2] * m[1])=0 W[0] = l[0] / n[0] - (psi / (1 - psi)) * (c[0] * P[0] / (1 - n[0])) + l[0] / n[0] = 0 R[0] = P[0] * (1 - alp) * exp( - alp * (gam + z_e_a * e_a[x])) * k[-1] ^ alp * n[0] ^ ( - alp) / W[0] 1 / (c[0] * P[0]) - bet * P[0] * (1 - alp) * exp( - alp * (gam + z_e_a * e_a[x])) * k[-1] ^ alp * n[0] ^ (1 - alp) / (m[0] * l[0] * c[1] * P[1]) = 0 c[0] + k[0] = exp( - alp * (gam + z_e_a * e_a[x])) * k[-1] ^ alp * n[0] ^ (1 - alp) + (1 - del) * exp( - (gam + z_e_a * e_a[x])) * k[-1] P[0] * c[0] = m[0] m[0] - 1 + d[0] = l[0] e[0] = exp(z_e_a * e_a[x]) y[0] = k[-1] ^ alp * n[0] ^ (1 - alp) * exp( - alp * (gam + z_e_a * e_a[x])) gy_obs[0] = dA[0] * y[0] / y[-1] gp_obs[0] = (P[0] / P[-1]) * m[-1] / dA[0] log_gy_obs[0] = log(gy_obs[0]) log_gp_obs[0] = log(gp_obs[0]) end
Model: FS2000 Variables Total: 18 Auxiliary: 2 States: 4 Auxiliary: 0 Jumpers: 7 Auxiliary: 2 Shocks: 2 Parameters: 9
First, we load the package and then use the @model
macro to define our model. The first argument after @model
is the model name and will be the name of the object in the global environment containing all information regarding the model. The second argument to the macro are the equations, which we write down between begin
and end
. Equations can contain an equality sign or the expression is assumed to equal 0. Equations cannot span multiple lines (unless you wrap the expression in brackets) and the timing of endogenous variables are expressed in the square brackets following the variable name (e.g. [-1]
for the past period). Exogenous variables (shocks) are followed by a keyword in square brackets indicating them being exogenous (in this case [x]
). Note that names can leverage julia's unicode capabilities (e.g. alpha can be written as α).
Define the parameters
Next we need to add the parameters of the model. The macro @parameters
takes care of this:
julia> @parameters FS2000 begin alp = 0.356 bet = 0.993 gam = 0.0085 mst = 1.0002 rho = 0.129 psi = 0.65 del = 0.01 z_e_a = 0.035449 z_e_m = 0.008862 end
Remove redundant variables in non stochastic steady state problem: 2.05 seconds Set up non stochastic steady state problem: 2.469 seconds Take symbolic derivatives up to first order: 2.785 seconds Find non stochastic steady state: 3.991 seconds Model: FS2000 Variables Total: 18 Auxiliary: 2 States: 4 Auxiliary: 0 Jumpers: 7 Auxiliary: 2 Shocks: 2 Parameters: 9
The block defining the parameters above only describes the simple parameter definitions the same way you assign values (e.g. alp = .356
).
Note that we have to write one parameter definition per line.
Load data
Given the equations and parameters, we only need the entries in the data which correspond to the observables in the model (need to have the exact same name) to estimate the model. First, we load in the data from a CSV file (using the CSV and DataFrames packages) and convert it to a KeyedArray
(using the AxisKeys package). Furthermore, we log transform the data provided in levels, and last but not least we select only those variables in the data which are observables in the model.
julia> using CSV, DataFrames, AxisKeys # load data
julia> dat = CSV.read("../assets/FS2000_data.csv", DataFrame)
192×2 DataFrame Row │ gy_obs gp_obs │ Float64 Float64 ─────┼──────────────────── 1 │ 1.03836 0.99556 2 │ 1.0267 1.0039 3 │ 1.03352 1.02277 4 │ 1.0163 1.01683 5 │ 1.00469 1.03897 6 │ 1.01428 1.00462 7 │ 1.01456 1.00051 8 │ 0.996854 1.01176 ⋮ │ ⋮ ⋮ 186 │ 1.0126 1.00303 187 │ 1.00279 1.00458 188 │ 1.00813 1.00447 189 │ 1.00849 1.00699 190 │ 1.00764 1.00387 191 │ 1.00797 1.00296 192 │ 1.00489 1.00286 177 rows omitted
julia> data = KeyedArray(Array(dat)',Variable = Symbol.("log_".*names(dat)),Time = 1:size(dat)[1])
2-dimensional KeyedArray(NamedDimsArray(...)) with keys: ↓ Variable ∈ 2-element Vector{Symbol} → Time ∈ 192-element UnitRange{Int64} And data, 2×192 adjoint(::Matrix{Float64}) with eltype Float64: (1) (2) (3) … (191) (192) (:log_gy_obs) 1.03836 1.0267 1.03352 1.00797 1.00489 (:log_gp_obs) 0.99556 1.0039 1.02277 1.00296 1.00286
julia> data = log.(data) # declare observables
2-dimensional KeyedArray(NamedDimsArray(...)) with keys: ↓ Variable ∈ 2-element Vector{Symbol} → Time ∈ 192-element UnitRange{Int64} And data, 2×192 Matrix{Float64}: (1) (2) … (191) (192) (:log_gy_obs) 0.0376443 0.026348 0.00793817 0.00487735 (:log_gp_obs) -0.0044494 0.0038943 0.00295712 0.00285919
julia> observables = sort(Symbol.("log_".*names(dat))) # subset observables in data
2-element Vector{Symbol}: :log_gp_obs :log_gy_obs
julia> data = data(observables,:)
2-dimensional KeyedArray(NamedDimsArray(...)) with keys: ↓ Variable ∈ 2-element view(::Vector{Symbol},...) → Time ∈ 192-element UnitRange{Int64} And data, 2×192 view(::Matrix{Float64}, [2, 1], :) with eltype Float64: (1) (2) … (191) (192) (:log_gp_obs) -0.0044494 0.0038943 0.00295712 0.00285919 (:log_gy_obs) 0.0376443 0.026348 0.00793817 0.00487735
Define bayesian model
Next we define the parameter priors using the Turing package. The @model
macro of the Turing package allows us to define the prior distributions over the parameters and combine it with the (Kalman filter) loglikelihood of the model and parameters given the data with the help of the get_loglikelihood
function. We define the prior distributions in an array and pass it on to the arraydist
function inside the @model
macro from the Turing package. It is also possible to define the prior distributions inside the macro but especially for reverse mode auto differentiation the arraydist
function is substantially faster. When defining the prior distributions we can rely n the distribution implemented in the Distributions package. Note that the μσ
parameter allows us to hand over the moments (μ
and σ
) of the distribution as parameters in case of the non-normal distributions (Gamma, Beta, InverseGamma), and we can also define upper and lower bounds truncating the distribution as third and fourth arguments to the distribution functions. Last but not least, we define the loglikelihood and add it to the posterior loglikelihood with the help of the @addlogprob!
macro.
julia> import Zygote
julia> import DynamicPPL
julia> import Turing
julia> import Turing: NUTS, sample, logpdf, AutoZygote
julia> prior_distributions = [ Beta(0.356, 0.02, μσ = true), # alp Beta(0.993, 0.002, μσ = true), # bet Normal(0.0085, 0.003), # gam Normal(1.0002, 0.007), # mst Beta(0.129, 0.223, μσ = true), # rho Beta(0.65, 0.05, μσ = true), # psi Beta(0.01, 0.005, μσ = true), # del InverseGamma(0.035449, Inf, μσ = true), # z_e_a InverseGamma(0.008862, Inf, μσ = true) # z_e_m ]
9-element Vector{Distributions.Distribution{Distributions.Univariate, Distributions.Continuous}}: Distributions.Beta{Float64}(α=203.68895999999998, β=368.47104) Distributions.Beta{Float64}(α=1724.5927500000014, β=12.157249999999973) Distributions.Normal{Float64}(μ=0.0085, σ=0.003) Distributions.Normal{Float64}(μ=1.0002, σ=0.007) Distributions.Beta{Float64}(α=0.16246596553318987, β=1.0969601238713826) Distributions.Beta{Float64}(α=58.49999999999999, β=31.49999999999999) Distributions.Beta{Float64}(α=3.95, β=391.05) Distributions.InverseGamma{Float64}( invd: Distributions.Gamma{Float64}(α=2.0, θ=28.2095404665858) θ: 0.035449 ) Distributions.InverseGamma{Float64}( invd: Distributions.Gamma{Float64}(α=2.0, θ=112.84134506883322) θ: 0.008862 )
julia> Turing.@model function FS2000_loglikelihood_function(data, model) parameters ~ Turing.arraydist(prior_distributions) if DynamicPPL.leafcontext(__context__) !== DynamicPPL.PriorContext() Turing.@addlogprob! get_loglikelihood(model, data, parameters) end end
FS2000_loglikelihood_function (generic function with 2 methods)
Sample from posterior: No-U-Turn Sampler (NUTS)
We use the NUTS sampler to retrieve the posterior distribution of the parameters. This sampler uses the gradient of the posterior loglikelihood with respect to the model parameters to navigate the parameter space. The NUTS sampler is considered robust, fast, and user-friendly (auto-tuning of hyper-parameters).
First we define the loglikelihood model with the specific data, and model. Next, we draw 1000 samples from the model:
julia> FS2000_loglikelihood = FS2000_loglikelihood_function(data, FS2000);
julia> n_samples = 1000
1000
julia> chain_NUTS = sample(FS2000_loglikelihood, NUTS(adtype = AutoZygote()), n_samples, progress = false);
┌ Info: Found initial step size └ ϵ = 0.003125
Inspect posterior
In order to understand the posterior distribution and the sequence of sample we are plot them:
julia> using StatsPlots
julia> plot(chain_NUTS);
Next, we are plotting the posterior loglikelihood along two parameters dimensions, with the other parameters ket at the posterior mean, and add the samples to the visualisation. This visualisation allows us to understand the curvature of the posterior and puts the samples in context.
julia> using ComponentArrays, MCMCChains
julia> import DynamicPPL: logjoint
julia> parameter_mean = mean(chain_NUTS)
Mean parameters mean Symbol Float64 parameters[1] 0.4018 parameters[2] 0.9905 parameters[3] 0.0046 parameters[4] 1.0142 parameters[5] 0.8439 parameters[6] 0.6828 parameters[7] 0.0025 parameters[8] 0.0137 parameters[9] 0.0033
julia> pars = ComponentArray([parameter_mean.nt[2]], Axis(:parameters));
julia> logjoint(FS2000_loglikelihood, pars)
1343.3901665327924
julia> function calculate_log_probability(par1, par2, pars_syms, orig_pars, model) orig_pars[1][pars_syms] = [par1, par2] logjoint(model, orig_pars) end
calculate_log_probability (generic function with 1 method)
julia> granularity = 32;
julia> par1 = :del;
julia> par2 = :gam;
julia> paridx1 = indexin([par1], FS2000.parameters)[1];
julia> paridx2 = indexin([par2], FS2000.parameters)[1];
julia> par_range1 = collect(range(minimum(chain_NUTS[Symbol("parameters[$paridx1]")]), stop = maximum(chain_NUTS[Symbol("parameters[$paridx1]")]), length = granularity));
julia> par_range2 = collect(range(minimum(chain_NUTS[Symbol("parameters[$paridx2]")]), stop = maximum(chain_NUTS[Symbol("parameters[$paridx2]")]), length = granularity));
julia> p = surface(par_range1, par_range2, (x,y) -> calculate_log_probability(x, y, [paridx1, paridx2], pars, FS2000_loglikelihood), camera=(30, 65), colorbar=false, color=:inferno);
julia> joint_loglikelihood = [logjoint(FS2000_loglikelihood, ComponentArray([reduce(hcat, get(chain_NUTS, :parameters)[1])[s,:]], Axis(:parameters))) for s in 1:length(chain_NUTS)];
julia> scatter3d!(vec(collect(chain_NUTS[Symbol("parameters[$paridx1]")])), vec(collect(chain_NUTS[Symbol("parameters[$paridx2]")])), joint_loglikelihood, mc = :viridis, marker_z = collect(1:length(chain_NUTS)), msw = 0, legend = false, colorbar = false, xlabel = string(par1), ylabel = string(par2), zlabel = "Log probability", alpha = 0.5);
julia> p
Plot{Plots.GRBackend() n=2}
Find posterior mode
Other than the mean and median of the posterior distribution we can also calculate the mode as follows:
julia> modeFS2000 = Turing.maximum_a_posteriori(FS2000_loglikelihood, adtype = AutoZygote(), initial_params = FS2000.parameter_values)
ModeResult with maximized lp of 1343.75 [0.4027213061408449, 0.9909439049258149, 0.004550077284875144, 1.014322732114236, 0.8457080502071208, 0.6910342545085805, 0.0016353132402261404, 0.013479922495285962, 0.0032575451181081952]
Model estimates given the data and the model solution
Having found the parameters at the posterior mode we can retrieve model estimates of the shocks which explain the data used to estimate it. This can be done with the get_estimated_shocks
function:
julia> get_estimated_shocks(FS2000, data, parameters = collect(modeFS2000.values))
2-dimensional KeyedArray(NamedDimsArray(...)) with keys: ↓ Shocks ∈ 2-element Vector{Symbol} → Periods ∈ 192-element UnitRange{Int64} And data, 2×192 Matrix{Float64}: (1) (2) … (191) (192) (:e_a₍ₓ₎) 3.07802 2.02956 0.31192 0.0219214 (:e_m₍ₓ₎) -0.338786 0.529109 -0.455173 -0.596633
As the first argument we pass the model, followed by the data (in levels), and then we pass the parameters at the posterior mode. The model is solved with this parameterisation and the shocks are calculated using the Kalman smoother.
We estimated the model on two variables but our model allows us to look at all variables given the data. Looking at the estimated variables can be done using the get_estimated_variables
function:
julia> get_estimated_variables(FS2000, data)
2-dimensional KeyedArray(NamedDimsArray(...)) with keys: ↓ Variables ∈ 18-element Vector{Symbol} → Periods ∈ 192-element UnitRange{Int64} And data, 18×192 Matrix{Float64}: (1) (2) … (191) (192) (:P) 0.585869 0.596628 0.560845 0.559733 (:Pᴸ⁽¹⁾) 0.585108 0.595478 0.561547 0.560631 (:R) 1.02576 1.02692 1.01966 1.01858 (:W) 2.99871 3.01049 2.96335 2.9599 (:c) 1.73321 1.7013 … 1.80033 1.80164 (:cᴸ⁽¹⁾) 1.73494 1.7039 1.79936 1.80039 ⋮ ⋱ ⋮ (:k) 53.4886 52.032 56.2707 56.2515 (:l) 0.724385 0.733136 0.705017 0.704388 (:log_gp_obs) -0.0044494 0.0038943 0.00295712 0.00285919 (:log_gy_obs) 0.0376443 0.026348 … 0.00793817 0.00487735 (:m) 1.01686 1.01822 1.00971 1.00845 (:n) 0.241598 0.243599 0.237911 0.237975 (:y) 2.1224 2.11046 2.14601 2.14608
Since we already solved the model with the parameters at the posterior mode we do not need to do so again. The function returns a KeyedArray with the values of the variables in levels at each point in time.
Another useful tool is a historical shock decomposition. It allows us to understand the contribution of the shocks for each variable. This can be done using the get_shock_decomposition
function:
julia> get_shock_decomposition(FS2000, data)
3-dimensional KeyedArray(NamedDimsArray(...)) with keys: ↓ Variables ∈ 18-element Vector{Symbol} → Shocks ∈ 3-element Vector{Symbol} ◪ Periods ∈ 192-element UnitRange{Int64} And data, 18×3×192 Array{Float64, 3}: [showing 3 of 192 slices] [:, :, 1] ~ (:, :, 1): (:e_a₍ₓ₎) (:e_m₍ₓ₎) (:Initial_values) (:P) 0.0148888 -0.00108998 0.00810035 (:Pᴸ⁽¹⁾) 0.0146127 -0.000913525 0.00743931 (:R) 6.87463e-17 -0.000955353 0.0031234 ⋮ (:log_gy_obs) 0.0328867 8.71844e-5 0.000120402 (:m) -0.0 -0.00111942 0.00365979 (:n) 0.00322234 3.46638e-5 0.000868554 (:y) -0.018474 0.000187179 -0.0062412 [:, :, 97] ~ (:, :, 97): (:e_a₍ₓ₎) (:e_m₍ₓ₎) (:Initial_values) (:P) 0.0288478 0.00602704 0.000719126 (:Pᴸ⁽¹⁾) 0.0283128 0.00503281 0.000705791 (:R) 3.18903e-15 0.00539571 3.21953e-10 ⋮ (:log_gy_obs) -0.0168618 0.000336676 7.85288e-6 (:m) 2.74833e-15 0.00632233 3.77243e-10 (:n) 0.00624343 -0.000223709 0.000155638 (:y) -0.0357943 -0.000897025 -0.000892291 [:, :, 192] ~ (:, :, 192): (:e_a₍ₓ₎) (:e_m₍ₓ₎) (:Initial_values) (:P) 0.00129293 -0.00565097 0.000121485 (:Pᴸ⁽¹⁾) 0.00126895 -0.00472692 0.000119232 (:R) -2.67776e-16 -0.00500928 5.29091e-17 ⋮ (:log_gy_obs) 0.000247177 7.87663e-5 1.32661e-6 (:m) -3.477e-16 -0.00586953 5.81132e-17 (:n) 0.000279824 0.000195656 2.62925e-5 (:y) -0.00160426 0.000901756 -0.000150738
We get a 3-dimensional array with variables, shocks, and time periods as dimensions. The shocks dimension also includes the initial value as a residual between the actual value and what was explained by the shocks. This computation also relies on the Kalman smoother.
Last but not least, we can also plot the model estimates and the shock decomposition. The model estimates plot, using plot_model_estimates
:
julia> plot_model_estimates(FS2000, data)
3-element Vector{Any}: Plot{Plots.GRBackend() n=38} Plot{Plots.GRBackend() n=36} Plot{Plots.GRBackend() n=6}
shows the variables of the model (blue), the estimated shocks (in the last panel), and the data (red) used to estimate the model.
The shock decomposition can be plotted using plot_shock_decomposition
:
julia> plot_shock_decomposition(FS2000, data)
3-element Vector{Any}: Plot{Plots.GRBackend() n=50} Plot{Plots.GRBackend() n=52} Plot{Plots.GRBackend() n=9}
and it shows the contribution of the shocks and the contribution of the initial value to the deviations of the variables.