Abbreviated syntax for layers that employ new computations (via Stat), is probably useful. Mostly, I’ve looked at Elio C.’s work in a 2018 blog post. Some pretty plots were made - step-by-step of detrending!
ggtemp/ggexpress project exists because syntax to define a new layer with new compute seems a little more verbose and not proportionate to what you are trying to accomplish. It would be nice to have some friendly within-script routine that doesn’t consume 15 lines of code. New layer functions, e.g. geom_xmean, can make plotting code more succinct and readable; but right now this might be at the price of new Stat definition code.
I’ve learned other people think would be nice to have short-cut syntax to layer creation, and have worked on this problem, including Elio Campitelli who writes about his similar motivation (June Choe referred me):
But still, creating a geom from scratch is an involved process that doesn’t lend itself to simple transformations. – Elio Campitelli
The Spanish is slightly different, and maybe a little more to the point and from the gut:
Pero aún así, hacer un stat específico cada vez que uno quiere hacer un plot en particular es demasiado complicado.*
I worked with Elio Campitelli’s proposed stat_rasa() below - it is very nice.
It seems very similar to the ggtemp goals, and several decisions are the same like default geom is point, default position is identity. I think ggtemp might be coming from something that looks more like the full, standard extension syntax, and then maybe you’d be more positioned to migrate to that. Maybe. stat_rasa’s approach feels safer, because you aren’t creating new proto object on the fly within a function. Switching out which function to call within a ggproto object is something that is done in a lot of other contexts.
Elio’s StatRasa serves the case where computation happens group-wise. With define_layer_temp(), I’d started with compute_panel definition (just because of the motivating computation) and expanded back to compute_group. A general solution (w/ compute_layer too) or split would be good; like there could be a stat_rasa_group(), stat_rasa_panel(), stat_rasa_layer() for example.
I also heard about https://github.com/mjskay/ggbuilder from June Choe and think that is about even stat definitions within the plot build? I have given this less attention. So I’m not sure.
I also include some code Teun Van Der Brand offered on piping, but to be honest, practical implications and connection to this problem are not yet clear to me. 😭
Progessing with ggtemp::define_layer_temp(), I tried to make some changes to make the internal code more compact… but I think (hope) my ggplot2::Stat knowledge is failing me, and hope there is a solution. I want to make all the relevant ggproto slots accesible, setting NULL as default, and then filling them in with Stat$compute_group and the like if the user doesn’t define the slot. The StatRasa implementation blog example has the function working for compute_group only, whereas I think there would be a lot of use cases for the function that works in compute_panel arena.
I worked on a project with ggtemp and it was fun! I think more fun than it would have been w/ traditional syntax. https://github.com/EvaMaeRey/ggcircleof5ths We feed in a data frame contain a variable ‘chord’, which contains the progression F, A, Bb and Bbm, and see the harmonic moves being made! Which are discussed in https://switchedonpop.com/episodes/olivia-rodrigo-guts-vampire-bad-idea-right. Here, we’re looking at the out-of-key use of A chord instead of Am (in the key of F), and chromaticism and creepyness is introduced. Its followed by the Bb to Bbm change - more chromaticism - and to sad minor key!
tibble::tribble(~ lyric, ~chord,
"Hate to give ... ", "F",
"...now, How's ...", "A",
"-bout? Just ...", "Bb",
"...you, cool guy,", "Bbm") |>
ggplot() +
aes(x0 = 0,
y0 = 0,
chord = chord) +
geom_chord_cof(r = .45, key = "F") +
coord_equal() +
facet_wrap(~fct_inorder(lyric %>% str_wrap(25)), nrow = 1) +
labs(title = "Vampire (& Creep) chord progression")
knitr::include_graphics("https://github.com/EvaMaeRey/ggcircleof5ths/blob/main/README_files/figure-gfm/unnamed-chunk-6-1.png?raw=true")
https://eliocamp.github.io/codigo-r/en/2018/05/how-to-make-a-generic-stat-in-ggplot2/
library(tidyverse)
StatRasa <- ggplot2::ggproto("StatRasa", ggplot2::Stat,
compute_group = function(data, scales, fun, fun.args) {
# Change default arguments of the function to the
# values in fun.args
args <- formals(fun)
for (i in seq_along(fun.args)) {
if (names(fun.args[i]) %in% names(fun.args)) {
args[[names(fun.args[i])]] <- fun.args[[i]]
}
}
formals(fun) <- args
# Apply function to data
fun(data)
})
# stat function used in ggplot
stat_rasa <- function(mapping = NULL, data = NULL,
geom = "point",
position = "identity",
fun = NULL,
...,
show.legend = NA,
inherit.aes = TRUE) {
# Check arguments
if (!is.function(fun)) stop("fun must be a function")
# Pass dotted arguments to a list
fun.args <- match.call(expand.dots = FALSE)$`...`
ggplot2::layer(
data = data,
mapping = mapping,
stat = StatRasa,
geom = geom,
position = position,
show.legend = show.legend,
inherit.aes = inherit.aes,
check.aes = FALSE,
check.param = FALSE,
params = list(
fun = fun,
fun.args = fun.args,
na.rm = FALSE,
...
)
)
}
# using .75 span to match ggplot2 geom_smooth
Detrend <- function(data, method = "lm", span = 0.75) {
if (method == "lm") {
data$y <- resid(lm(y ~ x, data = data))
} else {
data$y <- resid(loess(y ~ x, span = span, data = data))
}
as.data.frame(data)
}
library(ggplot2)
set.seed(42)
x <- seq(-1, 3, length.out = 30)
y <- x^2 + rnorm(30)*0.5
df <- data.frame(x = x, y = y)
ggplot(df, aes(x, y)) +
geom_line(aes(color = "raw data")) +
geom_smooth(aes(color = "loess smoothing"),
alpha = .3) +
stat_smooth(geom = "point",
aes(color = "loess smoothing"),
xseq = df$x) +
stat_rasa(geom = "line",
fun = Detrend,
method = "smooth",
aes(color = "detrended")) +
geom_hline(yintercept = 0,
linetype = "dashed") +
scale_color_discrete(breaks =
fct_inorder(c("raw data",
"loess smoothing",
"detrended"))) +
labs(title = "detrending with loess smoothing")
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
stat_detrend <- function(...) {
stat_rasa(fun = Detrend, geom = "line", ...)
}
ggplot(df, aes(x, y)) +
geom_line(aes(color = "raw data")) +
geom_smooth(method = "lm",
aes(color = "linear model")) +
stat_smooth(method = "lm", geom = "segment",
xend = df$x, yend = df$y,
xseq = df$x, alpha = .2
) +
stat_detrend(method = "lm",
aes(color = "detrended")) +
geom_hline(yintercept = 0,
linetype = "dashed") +
scale_color_discrete(breaks =
fct_inorder(c("raw data",
"linear model",
"detrended"))) +
labs(title = "Linear detrending",
color = NULL)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
summarize_xy <- function(data, sum_fun = mean){
data |>
summarize(
x = sum_fun(x),
y = sum_fun(y)
)
}
mtcars |>
ggplot() +
aes(x = wt,
y = mpg) +
geom_point() +
stat_rasa(fun = summarize_xy,
size = 8,
aes(color = "means")) +
stat_rasa(fun = summarize_xy,
sum_fun = median,
size = 8,
aes(color = "medians")) +
labs(color = "centroid")
ggtemp:::define_layer_temp
## function(
## required_aes,
## compute_panel = NULL,
## compute_group = NULL,
## geom = ggplot2::GeomPoint,
## mapping = NULL,
## data = NULL,
## position = "identity",
## na.rm = FALSE,
## show.legend = NA,
## inherit.aes = TRUE,
## ...) {
##
## if(!is.null(compute_panel)){
## StatTemp <- ggproto(
## `_class` = "StatTemp",
## `_inherit` = ggplot2::Stat,
## compute_panel = compute_panel,
## required_aes = required_aes)
## }
##
## if(!is.null(compute_group)){
## StatTemp <- ggproto(
## `_class` = "StatTemp",
## `_inherit` = ggplot2::Stat,
## compute_group = compute_group,
## required_aes = required_aes)
## }
##
## ggplot2::layer(
## stat = StatTemp, # proto object from Step 2
## geom = geom, # inherit other behavior
## data = data,
## mapping = mapping,
## position = position,
## show.legend = show.legend,
## inherit.aes = inherit.aes,
## params = list(na.rm = na.rm, ...)
## )
## }
## <bytecode: 0x7fa5bf4807e8>
## <environment: namespace:ggtemp>
ggtemp:::create_layer_temp
## function(fun_name ="geom_circle",
## compute_panel = NULL,
## compute_group = NULL,
## required_aes = c("x0", "y0", "r"),
## geom = "point"){
##
## assign(x = fun_name,
## value = function(...){
##
##
## define_layer_temp(
## required_aes = required_aes,
## compute_panel = compute_panel,
## compute_group = compute_group,
## geom = geom,
## ...) },
## pos = 1
## )
##
## }
## <bytecode: 0x7fa5bf607460>
## <environment: namespace:ggtemp>
compute_panel_circle <- function(data, scales, n_vertices = 15){
data |>
mutate(group = row_number()) |>
crossing(tibble(z = 0:n_vertices)) |>
mutate(around = 2*pi*z/max(z)) |>
mutate(x = x0 + cos(around)*r,
y = y0 + sin(around)*r)
}
geom_circle <- function(...){
ggtemp:::define_layer_temp(
compute_panel = compute_panel_circle,
required_aes = c("x0", "y0"),
...)
}
library(ggplot2)
data.frame(x0 = 0:1, y0 = 0:1, r = 1:2/3) |>
ggplot() +
aes(x0 = x0, y0 = y0, r = r) +
geom_circle(n_vertices = 5) +
aes(fill = r)
compute_group_xy_summary <- function(data, scales){
data |>
summarize(x = mean(x),
y = mean(y))
}
ggtemp:::create_layer_temp("geom_xy_summary",
compute_group = compute_group_xy_summary,
required_aes = c("x", "y"))
mtcars |>
ggplot() +
aes(x = wt, y = mpg) +
geom_point() +
geom_xy_summary(size = 8)
last_plot() +
aes(color = factor(gear))
This is giving me more problems; I thought I would be able to start with a lot of null elements, and use generic slot fillers like ggplot2::Stat$compute_panel… But strategy doesn’t seem to be working.
define_layer_temp <- function(
compute_group = NULL,
compute_panel = NULL,
required_aes = NULL,
geom = ggplot2::GeomPoint,
mapping = NULL,
data = NULL,
position = "identity",
na.rm = FALSE,
show.legend = NA,
inherit.aes = TRUE,
...) {
StatEmpty <- ggproto(
`_class` = "StatEmpty",
`_inherit` = ggplot2::Stat)
if(is.null(compute_group)){compute_group <- ggplot2::StatEmpty$compute_group}
if(is.null(compute_panel)){compute_panel <- ggplot2::StatEmpty$compute_panel}
if(is.null(required_aes)) {required_aes <- ggplot2::StatEmpty$required_aes}
StatTemp <- ggproto(
`_class` = "StatTemp",
`_inherit` = ggplot2::Stat,
compute_group = compute_group,
compute_panel = compute_panel,
required_aes = required_aes
)
ggplot2::layer(
stat = StatTemp, # proto object from Step 2
geom = geom, # inherit other behavior
data = data,
mapping = mapping,
position = position,
show.legend = show.legend,
inherit.aes = inherit.aes,
params = list(na.rm = na.rm, ...)
)
}
compute_group_xy_summary <- function(data, scales){
data |>
summarize(x = mean(x),
y = mean(y))
}
geom_xy_means <- function(...){define_layer_temp(compute_group = compute_group_xy_summary, ...)}
ggplot(mtcars) +
aes(x = wt, y = mpg) +
geom_point() +
geom_xy_means()
ggbuilder https://github.com/mjskay/ggbuilder and pipe note might have something to do with more dynamic, on-the-fly internal data manipulation and stat creation…
df |>
ggplot(aes(x = condition, y = response, color = condition)) +
geom_boxplot() +
plot_data() |>
dplyr::filter(condition %in% c("B", "C")) |>
dplyr::group_by(condition) |>
dplyr::slice_max(response) |>
geom_("label", aes(label = response))
library(tidyverse)
piped <- . %>% summarise(x = mean(x)) %>%
mutate(x = x + 1)
piped
## Functional sequence with the following components:
##
## 1. summarise(., x = mean(x))
## 2. mutate(., x = x + 1)
##
## Use 'functions' to extract the individual functions.
magrittr::functions(piped)
## [[1]]
## function (.)
## summarise(., x = mean(x))
##
## [[2]]
## function (.)
## mutate(., x = x + 1)