The purpose of this post is to implement “Nowcasting” in Stan using a state-space model. This post was originally inspired by Jim Savage’s gist located here.

Motivation

Data Generating Process

\[y_t \sim N(x_t, \sigma_{\epsilon})\]

\[x_t \sim N(x_{t-1}+Z_t\gamma, \sigma_{\eta})\]

Equivalently, \(y_t = x_t + \epsilon_t\) with \(\epsilon_t \sim N(0, \sigma_{\epsilon})\), and \(x_t = x_{t-1} + Z_t\gamma + \eta_t\) with \(\eta_t \sim N(0, \sigma_{\eta})\). The state \(x_t\) is latent at every time step; we only observe the measurement \(y_t\), and only at every freq-th value of \(t\).

Build Fake Data

First we’ll bring in our packages.

library(rstan)
suppressPackageStartupMessages(library(tidyverse))
options(mc.cores = parallel::detectCores())
rstan_options(auto_write = TRUE)
set.seed(336)

Then we will generate the required data. In this case we will generate the entire DGP, then remove those data points that do not coincide with our measurement frequency.

# Set up DGP
n <- 300 # number of data points
freq <- 28
# High frequency helpers
z1 <- rnorm(n, 0, 3)
z2 <- rnorm(n, 0, 3)

# Set up "real" state
x <- rep(NA, n)
x[1] <- 1
for(i in 2:n){
  x[i] <- x[i-1] + 0.4*z1[i] - 0.3*z2[i] + rnorm(1, 0, 0.2)
}

# Set up y that is only recorded once for every "freq" values of x (set freq above)
y <- x + rnorm(n, 0, 1)

y[!(1:n %% freq == 0)] <- NA

# Have a look at the data to make sure you know what's happening
dat <- data.frame(y, z1, z2)
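
For instance, we can peek at the rows where \(y\) was actually observed (a quick sanity check added here, not part of the original gist):

# Rows where the low-frequency series was recorded (every freq-th step)
head(dat[!is.na(dat$y), ])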

Format the data for Stan.

# y is now just the observed values of y
y <- dat$y[!is.na(dat$y)]

model_list <- list(N1 = length(y), 
                   N2 = n, 
                   freq = freq, 
                   y = y, 
                   z1 = z1, 
                   z2 = z2)
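
The Stan model below pairs the count-th value of y with every freq-th value of the state, so the lengths must line up. A defensive check (my addition, not in the original post) makes that assumption explicit:

# The model pairs y[count] with every freq-th x[i], so these must agree
stopifnot(length(y) == floor(n / freq))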

Build the Model

writeLines(readLines("stan_nowcast.stan"))
// From https://github.com/khakieconomics/nowcasting_in_stan/blob/master/nowcasting.stan
// Via James Savage

data {
  int N1; // length of low frequency series
  int N2; // length of high frequency series
  int freq; // every freq-th observation of the high frequency series we get an observation of the low frequency one
  vector[N1] y;
  vector[N2] z1;
  vector[N2] z2;
}
parameters {
  real<lower = 0> sigma_epsilon;
  real<lower = 0> sigma_eta;
  vector[2] gamma;
  vector[N2] x;
}
model {
  int count;
  // priors
  sigma_epsilon ~ cauchy(0,1);
  sigma_eta ~ cauchy(0,1);
  gamma ~ normal(0,1);
  // target += normal_lpdf(x[1] | 0, 1);
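  // note: with the prior on x[1] left commented out, the initial state gets
  // an implicit flat prior, which may contribute to the sampling warnings below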
  
  // likelihood
  count = 0;
  for(i in 2:N2){
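    // state transition for the latent high-frequency series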
    target += normal_lpdf(x[i]| x[i-1] + z1[i]*gamma[1] + z2[i]*gamma[2], sigma_eta);
    if(i%freq==0){
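      // measurement: the count-th low-frequency observation aligns with x[i]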
      count = count + 1;
      target += normal_lpdf(y[count]| x[i], sigma_epsilon);
    }
  }
}

Now we have to compile our model.

model <- stan_model("stan_nowcast.stan")

Fit the Model

fit <- sampling(model, model_list, iter = 2000, 
                chains = 2, refresh = 0,
                control = list(adapt_delta = .95,
                               max_treedepth = 15))
## Warning: There were 1 divergent transitions after warmup. Increasing adapt_delta above 0.95 may help. See
## http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
## Warning: There were 2 chains where the estimated Bayesian Fraction of Missing Information was low. See
## http://mc-stan.org/misc/warnings.html#bfmi-low
## Warning: Examine the pairs() plot to diagnose sampling problems
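
As the warning itself suggests, an obvious next step (a sketch, not a fit actually run for this post) is to push adapt_delta higher and sample again:

# Refit with a higher adapt_delta; the smaller step sizes are slower
# but can eliminate the remaining divergent transitions
fit2 <- sampling(model, model_list, iter = 2000,
                 chains = 2, refresh = 0,
                 control = list(adapt_delta = 0.99,
                                max_treedepth = 15))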

Model Checking

As always, special thanks to Michael Betancourt for these amazing diagnostic tools.

util <- new.env()
source('stan_utilities.R', local=util)
util$check_all_diagnostics(fit)
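
If you don’t have that utility script to hand, newer versions of rstan bundle similar checks (similar in spirit, though the output format differs):

# Built-in checks for divergences, treedepth saturation, and E-BFMI
check_hmc_diagnostics(fit)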

So it appears that this model needs a little more work!

Now Some Inferences

While our model parameterisation isn’t perfect, the coefficients in the data generating process (0.4 and -0.3 in our simulation) have been sussed out of the data by our model. So this isn’t a complete loss!
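
For a quick side-by-side (my addition, not in the original post), we can set the posterior means against the true coefficients:

# Posterior means of gamma next to the true values used in the DGP
cbind(true = c(0.4, -0.3),
      estimated = summary(fit, pars = "gamma")$summary[, "mean"])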

print(fit, pars = c("gamma"))
## Inference for Stan model: stan_nowcast.
## 2 chains, each with iter=2000; warmup=1000; thin=1; 
## post-warmup draws per chain=1000, total post-warmup draws=2000.
## 
##           mean se_mean   sd  2.5%   25%   50%   75% 97.5% n_eff Rhat
## gamma[1]  0.45       0 0.08  0.29  0.40  0.45  0.49  0.60   919    1
## gamma[2] -0.31       0 0.07 -0.46 -0.35 -0.31 -0.28 -0.18   896    1
## 
## Samples were drawn using NUTS(diag_e) at Tue Feb 26 22:28:20 2019.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).
stan_plot(fit, pars = "gamma")
## ci_level: 0.8 (80% intervals)
## outer_level: 0.95 (95% intervals)

Make the Graph

# Extract the estimates of the state
x_mod <- extract(fit, pars = "x", permuted = FALSE)
x_mod <- plyr::adply(x_mod, 2)

# Summarise the parameters
yy <- dat$y
x_summarise <- x_mod %>% 
  dplyr::select(-chains) %>% 
  gather(variable, value) %>%
  mutate(obs = str_extract(variable, "[0-9]{1,4}") %>% as.numeric) %>%
  group_by(obs) %>%
  summarise(Median = median(value),
            Lower = quantile(value, 0.025),
            Upper = quantile(value, 0.975)) %>%
  mutate(Actual = x,
         Signal = yy)

Now to the graph, but first we can check how often the true state occurred outside of our interval (technically an equal-tailed 95% posterior credible interval, not a confidence interval); with perfect calibration we would expect roughly 5% of points to fall outside.

mean(x_summarise$Actual<x_summarise$Lower | x_summarise$Actual>x_summarise$Upper) %>% 
  scales::percent()
## [1] "0%"
x_summarise %>% 
  ggplot(aes(x = obs)) +
  geom_ribbon(aes(ymin = Lower, ymax = Upper), fill = "orange", alpha = 0.5) +
  geom_line(aes(y = Median)) +
  geom_line(aes(y = Actual), colour = "red") +
  geom_point(aes(y = Signal), size = 2, color = "blue") +
  labs(title = "Points are low-frequency observations\nRed is the actual underlying (high-frequency) series\nBlack and orange are our estimate bounds")+
  theme_minimal()


