Using Stan for optimization.

In this post I justed wanted to play around using the ability to optimize functions in Stan. I know that there are other solvers available, but I just wanted to try this one to start.

First, i’ll load the cmstanr package which has become my favourite way to interface with Stan.

```
library(cmdstanr)
```

Now we can compile a simple optimization scenario which could be a pretty classic production function where you get 2 dollars for widget x and four dollars for wiget y:

\[ \text{optimize } 2*x + 4*y \\ \text{subject to:} \\ 0\le x \le 10 \\ 0\le y \le 20 \\ \]

There are also contraints on how many units you can make of each.

```
m <- cmdstan_model("optim.stan")
```

Now print the code:

```
m$print()
```

```
//Test program for optimization
data{
}
parameters{
real<lower=0,upper=10> x;
real<lower=0,upper=20> y;
}
model{
target += 2*x +4*y;
}
```

Run the model:

```
fit <- m$sample(refresh = 0, adapt_delta = .99)
```

```
Running MCMC with 4 sequential chains...
Chain 1 finished in 0.2 seconds.
Chain 2 finished in 0.1 seconds.
Chain 3 finished in 0.1 seconds.
Chain 4 finished in 0.1 seconds.
All 4 chains finished successfully.
Mean chain execution time: 0.1 seconds.
Total execution time: 0.7 seconds.
```

Summarise the outputs:

```
fit$print()
```

```
variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
lp__ 94.75 95.10 1.11 0.79 92.52 95.80 1.00 1226 1265
x 9.51 9.65 0.48 0.35 8.58 9.97 1.00 1649 1148
y 19.75 19.82 0.25 0.18 19.24 19.99 1.00 1385 1217
```

```
out <- fit$draws(inc_warmup = FALSE, variables = c("x", "y"))
out_x <- rowMeans(as.data.frame(out[,,1]))
out_y <- rowMeans(as.data.frame(out[,,2]))
```

Graph the outputs:

```
op <- par(mfrow = c(1,2))
op
```

```
$mfrow
[1] 1 1
```

```
hist(out_x, breaks = 30, main = "Posterior Draws of X", col = "blue")
abline(v = mean(out_x), lwd = 2, col = "red")
hist(out_y, breaks = 30, main = "Posterior Draws of Y", col = "blue")
abline(v = mean(out_y), lwd = 2, col = "red")
```

Pretty neat that it works.

Just to play a bit, I want to add some data. Say we have the same function to optimize to maximize revenue, but instead of making any value of x and y, we have some production data with variability. Perhaps these can be real output from our operating process (machines breakdown and life happens, so a higher value might not be reasonable.)

```
m2 <- cmdstan_model("optim2.stan")
```

```
dat <- list(
y_in = rnorm(100, 7, 1.2),
x_in = rnorm(100, 23, 5),
sd_x = 5L,
sd_y = 2L,
N = 100L
)
fit2 <- m2$sample(data = dat, refresh = 0)
```

```
Running MCMC with 4 sequential chains...
Chain 1 finished in 0.1 seconds.
Chain 2 finished in 0.1 seconds.
Chain 3 finished in 0.1 seconds.
Chain 4 finished in 0.1 seconds.
All 4 chains finished successfully.
Mean chain execution time: 0.1 seconds.
Total execution time: 0.5 seconds.
```

Now we can see the results:

```
fit2$print()
```

```
variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
lp__ 19.66 19.97 0.97 0.68 17.74 20.58 1.00 1851 2776
x 23.07 23.06 0.49 0.50 22.29 23.90 1.00 3103 2339
y 7.11 7.11 0.20 0.20 6.79 7.43 1.00 2896 2181
```

```
out <- fit2$draws(inc_warmup = FALSE, variables = c("x", "y"))
out_x <- rowMeans(as.data.frame(out[,,1]))
out_y <- rowMeans(as.data.frame(out[,,2]))
op <- par(mfrow = c(1,2))
op
```

```
$mfrow
[1] 1 1
```

```
hist(out_x, breaks = 30, main = "Posterior Draws of X", col = "blue")
abline(v = mean(out_x), lwd = 2, col = "red")
hist(out_y, breaks = 30, main = "Posterior Draws of Y", col = "blue")
abline(v = mean(out_y), lwd = 2, col = "red")
```

The results from the second model represent the influence of the data from our actual manufacturing process. This can be valuable when trying to optimize these kinds of problems when the true data generating process is available to us.

If you see mistakes or want to suggest changes, please create an issue on the source repository.

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Source code is available at https://github.com/medewitt/medewitt.github.io, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

For attribution, please cite this work as

DeWitt (2020, Aug. 27). Michael DeWitt: Optimisation with Stan. Retrieved from https://michaeldewittjr.com/programming/2020-08-27-optimisation-with-stan/

BibTeX citation

@misc{dewitt2020optimisation, author = {DeWitt, Michael}, title = {Michael DeWitt: Optimisation with Stan}, url = {https://michaeldewittjr.com/programming/2020-08-27-optimisation-with-stan/}, year = {2020} }