Exploring the power of the gghighlight package to automatically highlight charts
gghighlight is a package on CRAN that lets you highlight the features of a chart you find valuable. Right now I typically do this with some custom colour coding that I then pass into the ggplot2
arguments. This package might be a good way to automate that task more easily. It could also be super handy during exploratory analysis, where finding patterns is a much more iterative process.
Our libraries, of course…
library(ggplot2)
library(gghighlight)
This code is copied directly from here.
Build some data: one random walk per letter, each the cumulative sum of 400 uniform white-noise steps.
set.seed(2)
d <- purrr::map_dfr(
  letters,
  ~ data.frame(idx = 1:400,
               value = cumsum(runif(400, -1, 1)),
               type = .,
               stringsAsFactors = FALSE))
Definitely some messiness and colour overload!
ggplot(d) +
  geom_line(aes(idx, value, colour = type))
The way I would do it…
library(dplyr, warn.conflicts = FALSE)
d_filtered <- d %>%
  group_by(type) %>%
  filter(max(value) > 20) %>%
  ungroup()
ggplot() +
  # draw the original data series with grey
  geom_line(aes(idx, value, group = type), data = d, colour = alpha("grey", 0.7)) +
  # colourise only the filtered data
  geom_line(aes(idx, value, colour = type), data = d_filtered)
Now with this handy package we can do the following:
gghighlight_line(d, aes(idx, value, colour = type), max(value) > 20) +
  theme_minimal()
And because it is a ggplot object, we can add things to it.
gghighlight_line(d, aes(idx, value, colour = type), max(value) > 20) +
  facet_wrap(~ type)
And some additional cool stuff:
gghighlight_line(d, aes(idx, value, colour = type), predicate = max(value),
                 max_highlight = 6)
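The gghighlight_line() calls above reflect the package interface at the time of writing. As far as I understand it, later gghighlight releases moved to a gghighlight() layer that is added to an ordinary ggplot, and retired the gghighlight_line() helper. The following is a minimal sketch of the equivalent calls under that assumption; the object names p_threshold and p_top6 are just illustrative, and the data is rebuilt so the block runs on its own.

```r
library(ggplot2)
library(gghighlight)

# Rebuild the same toy data: one random walk per letter
set.seed(2)
d <- purrr::map_dfr(
  letters,
  ~ data.frame(idx = 1:400,
               value = cumsum(runif(400, -1, 1)),
               type = .x,
               stringsAsFactors = FALSE))

# Assumed newer interface: build a normal ggplot, then add gghighlight()
# with a logical predicate that is evaluated per group (here, per type)
p_threshold <- ggplot(d) +
  geom_line(aes(idx, value, colour = type)) +
  gghighlight(max(value) > 20) +
  theme_minimal()

# A non-logical predicate ranks the groups; max_highlight caps how many
# of the top-ranked groups stay coloured
p_top6 <- ggplot(d) +
  geom_line(aes(idx, value, colour = type)) +
  gghighlight(max(value), max_highlight = 6)
```

Both results are regular ggplot objects, so adding facet_wrap(~ type) or a theme should work the same way as before.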
The package author does offer a caveat: the package can run slowly with lots of data and filtering, in which case it is better to go back to filtering with dplyr
in a discrete step. I imagine it is because of all the grouped operations? Dunno, but this is a neat package to use for exploratory work.
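For the "discrete step" route, one option is to evaluate the grouped predicate once, keep only the matching keys, and reuse them for every plot. This is a rough sketch of that idea rather than anything the package author prescribes; the names keep and p are hypothetical, and the threshold of 20 is simply the one used earlier.

```r
library(dplyr, warn.conflicts = FALSE)
library(ggplot2)

# Same toy data as earlier: one random walk per letter
set.seed(2)
d <- purrr::map_dfr(
  letters,
  ~ data.frame(idx = 1:400,
               value = cumsum(runif(400, -1, 1)),
               type = .x,
               stringsAsFactors = FALSE))

# Discrete step: run the grouped summary once and keep only the keys
keep <- d %>%
  group_by(type) %>%
  summarise(peak = max(value)) %>%
  filter(peak > 20) %>%
  pull(type)

# Reuse the keys for plotting without repeating the grouped filter
p <- ggplot(d, aes(idx, value, group = type)) +
  geom_line(colour = alpha("grey", 0.7)) +
  geom_line(aes(colour = type), data = filter(d, type %in% keep))
```

Because keep is just a character vector, it can be cached or tweaked by hand while iterating on the plot.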
If you see mistakes or want to suggest changes, please create an issue on the source repository.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Source code is available at https://github.com/medewitt/medewitt.github.io, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
DeWitt (2018, July 4). Michael DeWitt: gghighlight for the win. Retrieved from https://michaeldewittjr.com/programming/2018-07-04-gghighlight-for-the-win/
BibTeX citation
@misc{dewitt2018gghighlight,
  author = {DeWitt, Michael},
  title = {Michael DeWitt: gghighlight for the win},
  url = {https://michaeldewittjr.com/programming/2018-07-04-gghighlight-for-the-win/},
  year = {2018}
}