Using ggplot2 has been described as writing āgraphical poemsā. But we may feel at a loss for āwordsā when functions weād like to have donāt exist. The ggplot2 extension system allows us to build new āvocabularyā for fluent expression.
An exciting extension mechanism is that of inheriting from existing, more primitive geoms after performing some calculation.
To get your feet wet in this world and give you a taste of patterns for geom extension, we provide three basic examples of the geoms_ that inherit from existing geoms (point, text, segment, etc) along with a practice exercise. With such geoms, calculation is done under the hood by the ggplot2 system.
With these geom, you can write new graphical poems with exciting new graphical āwordsā!
This tutorial is intended for individuals who already have a working knowledge of the grammar of ggplot2, but may like to build a richer vocabulary for themselves.
Our recipes take the form:
Below, youāll see a completely worked example (example recipes) and then a invitation to build a related target geom_*().
geom_point_xy_medians()
ā
library(tidyverse)
library(palmerpenguins)
penguins <- remove_missing(penguins)
penguins_medians <- penguins %>%
summarize(bill_length_mm_median = median(bill_length_mm),
bill_depth_mm_median = median(bill_depth_mm))
penguins %>%
ggplot() +
aes(x = bill_depth_mm) +
aes(y = bill_length_mm) +
geom_point() +
geom_point(data = penguins_medians,
color = "red", size = 4,
aes(x = bill_depth_mm_median,
y = bill_length_mm_median))
# Step 1.a
compute_group_xy_medians <- function(data, scales){ # scales is used internally in ggplot2
data %>%
summarize(x = median(x),
y = median(y))
}
# Step 1.b
penguins %>%
rename(x = bill_depth_mm, # ggplot2 will work with 'aes' column names
y = bill_length_mm) %>% # therefore rename is required to used the compute function
compute_group_xy_medians()
## # A tibble: 1 Ć 2
## x y
## <dbl> <dbl>
## 1 17.3 44.5
Things to notice
StatXYMedians <- ggplot2::ggproto(`_class` = "StatXYMedians",
`_inherit` = ggplot2::Stat,
required_aes = c("x", "y"),
compute_group = compute_group_xy_medians)
Things to notice
geom_point_xy_medians <- function(mapping = NULL, data = NULL,
position = "identity", na.rm = FALSE,
show.legend = NA,
inherit.aes = TRUE, ...) {
ggplot2::layer(
stat = StatXYMedians, # proto object from step 2
geom = ggplot2::GeomPoint, # inherit other behavior
data = data,
mapping = mapping,
position = position,
show.legend = show.legend,
inherit.aes = inherit.aes,
params = list(na.rm = na.rm, ...)
)
}
penguins %>%
ggplot()+
aes(x = bill_depth_mm, y = bill_length_mm)+
geom_point()+
geom_point_xy_medians(color = "red")
penguins %>%
ggplot()+
aes(x = bill_depth_mm,
y = bill_length_mm,
color = species)+
geom_point()+
geom_point_xy_medians(size = 4)
geom_point_xy_means()
Using recipe #1 as a reference, try to create the function geom_point_xy_means()
# step 0: use base ggplot2
# step 1: write your compute_group function (and test)
# step 2: write ggproto with compute_group as an input
# step 3: write your geom_*() function with ggproto as an input
# step 4: enjoy!
geom_label_id()
cars %>%
mutate(id_number = 1:n()) %>%
ggplot() +
aes(x = speed, y = dist) +
geom_point() +
geom_label(aes(label = id_number),
hjust = 1.2)
# you won't use the scales argument, but ggplot will later
compute_group_row_number <- function(data, scales){
data %>%
# add an additional column called label
# the geom we inherit from requires the label aesthetic
mutate(label = 1:n())
}
# step 1b test the computation function
cars %>%
# input must have required aesthetic inputs as columns
rename(x = speed, y = dist) %>%
compute_group_row_number() %>%
head()
## x y label
## 1 4 2 1
## 2 4 10 2
## 3 7 4 3
## 4 7 22 4
## 5 8 16 5
## 6 9 10 6
StatRownumber <- ggplot2::ggproto(`_class` = "StatRownumber",
`_inherit` = ggplot2::Stat,
required_aes = c("x", "y"),
compute_group = compute_group_row_number)
geom_label_row_number <- function(mapping = NULL, data = NULL,
position = "identity", na.rm = FALSE,
show.legend = NA,
inherit.aes = TRUE, ...) {
ggplot2::layer(
stat = StatRownumber, # proto object from Step 2
geom = ggplot2::GeomLabel, # inherit other behavior
data = data,
mapping = mapping,
position = position,
show.legend = show.legend,
inherit.aes = inherit.aes,
params = list(na.rm = na.rm, ...)
)
}
cars %>%
ggplot() +
aes(x = speed, y = dist) +
geom_point() +
geom_label_row_number(hjust = 1.2) # function in action
last_plot() +
aes(color = dist > 60) # Computation is within group
geom_text_coordinates()
Using recipe #2 as a reference, can you create the function geom_text_coordinates()
.
ā
Hint:
paste0("(", 1, ", ",3., ")")
## [1] "(1, 3)"
# step 0: use base ggplot2
# step 1: write your compute_group function (and test)
# step 2: write ggproto with compute_group as an input
# step 3: write your geom_*() function with ggproto as an input
# step 4: enjoy!
geom_point_lm_fitted()
model <- lm(formula = bill_length_mm ~ bill_depth_mm,
data = penguins)
penguins_w_fitted <- penguins %>%
mutate(fitted = model$fitted.values)
penguins %>%
ggplot() +
aes(x = bill_depth_mm, y = bill_length_mm) +
geom_point() +
geom_smooth(method = "lm", se = F) +
geom_point(data = penguins_w_fitted,
aes(y = fitted),
color = "blue")
compute_group_lm_fitted<- function(data, scales){
model<-lm(formula= y ~ x, data = data)
data %>%
mutate(y=model$fitted.values)
}
# test out the function
penguins %>%
# rename to explicitly state the x and y inputs
rename(x = bill_depth_mm, y = bill_length_mm)%>%
compute_group_lm_fitted()
## # A tibble: 333 Ć 8
## species island y x flipper_length_mm body_mass_g sex year
## <fct> <fct> <dbl> <dbl> <int> <int> <fct> <int>
## 1 Adelie Torgersen 43.0 18.7 181 3750 male 2007
## 2 Adelie Torgersen 43.8 17.4 186 3800 female 2007
## 3 Adelie Torgersen 43.5 18 195 3250 female 2007
## 4 Adelie Torgersen 42.6 19.3 193 3450 female 2007
## 5 Adelie Torgersen 41.8 20.6 190 3650 male 2007
## 6 Adelie Torgersen 43.6 17.8 181 3625 female 2007
## 7 Adelie Torgersen 42.4 19.6 195 4675 male 2007
## 8 Adelie Torgersen 43.7 17.6 182 3200 female 2007
## 9 Adelie Torgersen 41.4 21.2 191 3800 male 2007
## 10 Adelie Torgersen 41.5 21.1 198 4400 male 2007
## # ā¦ with 323 more rows
StatLmFitted<-ggplot2::ggproto(`_class` = "StatLmFitted",
`_inherit` = ggplot2::Stat,
required_aes = c("x", "y"),
compute_group = compute_group_lm_fitted)
geom_point_lm_fitted <- function(mapping = NULL, data = NULL,
position = "identity", na.rm = FALSE,
show.legend = NA,
inherit.aes = TRUE, ...) {
ggplot2::layer(
stat = StatLmFitted, # proto object from step 2
geom = ggplot2::GeomPoint, # inherit other behavior
data = data,
mapping = mapping,
position = position,
show.legend = show.legend,
inherit.aes = inherit.aes,
params = list(na.rm = na.rm, ...)
)
}
penguins %>%
ggplot() +
aes(x = bill_depth_mm, y = bill_length_mm) +
geom_point() +
geom_smooth(method="lm", se= F)+
geom_point_lm_fitted(color="blue")
penguins %>%
ggplot() +
aes(x = bill_depth_mm, y = bill_length_mm) +
geom_point() +
geom_smooth(method="lm", se= F) +
geom_point_lm_fitted() +
facet_wrap(facets = vars(species))
geom_segment_lm_residuals()
Create the function geom_segment_lm_residuals()
.
# step 0: use base ggplot2
model <- lm(formula = bill_length_mm ~ bill_depth_mm,
data = penguins)
penguins_w_fitted <- penguins %>%
mutate(fitted = model$fitted.values)
penguins %>%
ggplot() +
aes(x = bill_depth_mm, y = bill_length_mm) +
geom_point() +
geom_smooth(method = "lm", se = F) +
geom_segment(data = penguins_w_fitted,
aes(yend = fitted, xend = bill_depth_mm),
color = "blue")
# step 1: write your compute_group function (and test)
# step 2: write ggproto with compute_group as an input
# step 3: write your geom_*() function with ggproto as an input
# step 4: enjoy!
Not interested in writing your own geoms?
Check out some ready-to-go geoms that might be of interest in the ggxmean packageā¦ or other extension packages.
Interested in working a bit more with geoms and making them available to more folks, but not interested in writing your own package?
Join in on the development and validation of the ggxmean package for statistical educators and everyday analysis!