Michael DeWitt

Remembering Apollo

Some ruminations about the legacy of Apollo and doing things when failure isn't an option.

On the use of command line tools

Using `AWK` to parse court calendars

Defining a Project Workflow

Having a defined project workflow is important for many reasons. Consistency of design allows for easier sharing (you or other collaborators don't have to look for things) and reduces some cognitive load by allowing you to focus on content and less on form. This is my lightly opinionated project structure. Of course these views are ever evolving.

Finding the Needle in the Haystack

Sometimes instead of accuracy we need to look at different metrics. One such metric is sensitivity, which measures how many of the actual targets the model correctly identifies. This can be the metric of choice over accuracy when you are dealing with a rare event such as a terrorist attack or even student retention. It is always important to understand what metrics you are optimising your models on.
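As a rough illustration of why the two metrics can disagree, here is a minimal Python sketch. The data and the function names are hypothetical and are not taken from the post; the example assumes binary labels where 1 marks the rare event.

```python
def accuracy(actual, predicted):
    """Share of all cases where the prediction matches the actual label."""
    matches = sum(1 for a, p in zip(actual, predicted) if a == p)
    return matches / len(actual)


def sensitivity(actual, predicted):
    """Share of actual positives (label 1) that the model also labels 1.

    Assumes there is at least one actual positive in the data.
    """
    true_pos = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    false_neg = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    return true_pos / (true_pos + false_neg)


# Hypothetical data: 2 rare events among 20 cases,
# scored by a model that always predicts "no event".
actual = [1, 1] + [0] * 18
predicted = [0] * 20

print(accuracy(actual, predicted))     # 0.9
print(sensitivity(actual, predicted))  # 0.0
```

A model that never flags the event looks good on accuracy here but finds none of the cases we care about, which is the situation sensitivity is meant to expose.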

State Space Models for Poll Prediction

In this section I replicate some state space poll modeling that James Savage and Peter Ellis used in a few different scenarios. State space modeling provides a great way to model time series effects when the data are collected at irregular intervals (e.g. opinion polling).

Re-districting in Winston-Salem

In this post I explore potential outcomes for the composition of the Winston-Salem city council.

Omitted Variable Bias

A short description of the post.

MRP Redux

Using fake data simulations to understand our MRP model.

Speeding Things Up with Rcpp

Metropolis-Hastings samplers are typically slow in R because each draw depends on the previous one, so the operations cannot easily be parallelised or vectorised. The Rcpp package provides a way to use C++ to conduct these MCMC operations at a much greater speed. This post explores how one would do this, achieving a >20x speed up.
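To show where the sequential bottleneck comes from, here is a minimal random-walk Metropolis-Hastings sketch in Python. It is not the post's R or Rcpp code; the standard normal target, the step size, and the function names are assumptions chosen only for illustration.

```python
import math
import random


def log_target(x):
    """Log density of a standard normal, up to a constant (assumed target)."""
    return -0.5 * x * x


def metropolis_hastings(n_samples, step=1.0, start=0.0, seed=42):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    samples = []
    current = start
    for _ in range(n_samples):
        # Each proposal is centred on the current state, so iteration i
        # cannot begin until iteration i - 1 has finished.
        proposal = current + rng.gauss(0.0, step)
        log_ratio = log_target(proposal) - log_target(current)
        # Accept with probability min(1, target(proposal) / target(current)).
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            current = proposal
        samples.append(current)
    return samples


draws = metropolis_hastings(5000)
print(sum(draws) / len(draws))  # sample mean, should be near 0
```

The loop body is cheap, but it has to run one iteration at a time, which is why moving it into compiled code can pay off.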

Latex in ggplot2

This is a quick overview of a trick to add LaTeX in ggplot2.

MRP using brms

This post explores MRP using brms and tidyverse modeling.

Replicating gsynth

The purpose of this post is to replicate the examples in the gsynth package for synthetic controls. This is a methodology for causal inference especially at the state level.

Hierarchical Time Series with hts

This is just a quick reproduction of the items discussed in the hts package. The package supports hierarchical time series, which is an important feature when looking at data that take a hierarchical format, like counties within a state or precincts within counties within states.

the power of fake data simulations

Looking at a blog post that Andrew Gelman posted on fake data simulations and HLM. The power of fake data simulations is that they really make you think twice about what kind of effect you are looking for, as well as the power of your research design to detect it. This illustrates a really good practice for anyone looking to do this kind of analysis.

a foray into network analysis

Network analysis provides a way to analyse the interconnectedness of different networks. This can provide insight into social networks, interconnected groups of text, tweets, etc. Visualisations help to show these relationships, and numeric measures help to quantify them.

models of microeconomics

Exploring the examples in Kleiber and Zeileis' Applied Econometrics with R.

Analysis of Short Time Series

Using Fourier transform coefficients as predictors in short time series data helps with prediction.

make your own api

Exploring the concept of developing internal APIs. An API could also be an R package that can be used by people in your organisation to more easily connect to common data sources. This is a good example of some internal tooling that can make data access easier.

IRT and the Rasch Model

Item Response Theory (IRT) is a method by which item difficulty is assessed and used to measure latent factors. Classical test theory has a shortcoming in that the test-taker's ability and the difficulty of the item cannot be separated. Thus there is a question of generalisability outside of the instrument. Additionally, the models make some assumptions that may not be mathematically justified. In comes IRT, which handles some of these issues.
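As a small illustration of how the Rasch model separates ability from difficulty, the probability of a correct response is the logistic function of their difference. The Python sketch below uses hypothetical function and parameter names and made-up values; it is not code from the post.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a correct response under the Rasch model.

    Both arguments are on the same logit scale; only their difference matters.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# Hypothetical values: one test-taker facing an easier and a harder item.
ability = 0.5
for difficulty in (-1.0, 0.5, 2.0):
    print(difficulty, round(rasch_probability(ability, difficulty), 3))
```

When ability equals difficulty the probability is exactly 0.5, and it rises as the test-taker's ability exceeds the item's difficulty.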


So I'm moving to radix

Welcome to Michael DeWitt's Blog

Welcome to the rebooted blog!

Exploring forecast

Let's examine some of the functions inside the forecast package.

Speed it up!

This post explores how to see opportunities to make your code run faster.

Bayesian Time Series Analysis with bsts

Exploring the bsts package and what it provides for Bayesian structural time series modeling


ggrough is a great package that can be used to make graphs that look hand-drawn. This can be a great aesthetic choice when giving presentations and making handouts.

gghighlight for the win

Exploring the power of the gghighlight package to automatically highlight charts

Let's Try Some Visualisation

An example of the value-suppressing uncertainty scale. Great uses include forecast uncertainty.



Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".