ggrepel/ggpp

w/ tufte margin plots

2024-05-13

library(tidyverse)
gapminder::gapminder %>% 
  filter(year == 2002) %>% 
  filter(continent == "Americas") ->
gap2002_americas_df

select_countries <- c("Chile" = "Chile", "United States" = "The US is here")

select_countries[as.character(gap2002_americas_df$country)]
##             <NA>             <NA>             <NA>             <NA> 
##               NA               NA               NA               NA 
##            Chile             <NA>             <NA>             <NA> 
##          "Chile"               NA               NA               NA 
##             <NA>             <NA>             <NA>             <NA> 
##               NA               NA               NA               NA 
##             <NA>             <NA>             <NA>             <NA> 
##               NA               NA               NA               NA 
##             <NA>             <NA>             <NA>             <NA> 
##               NA               NA               NA               NA 
##             <NA>             <NA>    United States             <NA> 
##               NA               NA "The US is here"               NA 
##             <NA> 
##               NA

Setting up an auto-enumerate tagger


i = 0

labs_tag_enumerate <- function(){
  
  i <<- i + 1

  # print(cat("Above code returns Plot", i - 1,"\n"))
  
  return(labs(tag = paste("Plot", i - 1)))
  
  
  
}

Baseplot, plot 0.

Sometimes labels overlap the points, depending on the random numbers seed

library(tidyverse)
library(ggrepel)


gapminder::gapminder |>
  filter(year == 2002) |>
  ggplot() +
  aes(gdpPercap, lifeExp) +
  geom_point(colour = "darkgrey") ->
base_plot  

aes_label_nas <- aes(label = select_countries[as.character(country)])
  
geom_text_repel_favs <- function(...){
    geom_text_repel(na.rm = TRUE, 
    box.padding = unit(1, "cm"),
    point.padding = unit(5, "mm"),
    arrow = arrow(length = unit(2, "mm"), 
                  type = "closed"), ...)
}

base_plot + 
  aes_label_nas +
  geom_text_repel_favs() +
  labs_tag_enumerate()

In Plot 1 labels never overlap the points as long as there is space, because we use an empty string in the vector

aes_label_empty_string <- aes(label = select_countries[as.character(country)] %>% 
        replace_na(""))

base_plot + 
  aes_label_empty_string +   # New label
  geom_text_repel_favs(
        max.overlaps = 30    # overlaps
  ) +
  labs_tag_enumerate()

In Plot 2, we can restrict repulsion to the vertical axis


base_plot + 
  aes_label_empty_string + 
  geom_text_repel_favs(
    direction = "y",       #<< direction y
    max.overlaps = 30
  )  +
  labs_tag_enumerate()

Plot 3 uses nudging to start movement upwards


base_plot + 
  aes_label_empty_string + 
  geom_text_repel_favs(
    nudge_y = 3,           #<< New
    direction = "y",
    max.overlaps = 30
  )  +
  labs_tag_enumerate()

We set the repel force and pull force to zero in plot Plot 4


base_plot + 
  aes_label_empty_string + 
  geom_text_repel_favs(
    nudge_y = 4,
    direction = "y",
    force = 0,         
    force_pull = 0,    
  )  + 
  labs_tag_enumerate()

Conclusions

In this case, the best approach seems to be the 4th: nudging + repulsion along y.

Tomorrow I should have a bit more time, and I will post here some variations using ‘ggpp’.

Hi.

Here are some examples with ‘ggpp’ and with ‘ggrepel’. I do not think any automatic approach can provide plots as nice as those than be obtained by manual placement. One important difference between ‘ggrepel’ and ‘ggpp’ is that geom_text_repel() avoids clipping of labels at the edge of the plotting area, while geom_text_s() behaves like geom_text() and may clip part of labels. Of course, the solution is to manually expand the axis limits.

Padding ‘ggpp’ does not support ‘grid’ units, differemtly to ‘ggrepel’.

In Plot 5 I label the same two countries as in the original example to show the geom_text_s behavior starting


library(tidyverse)
library(ggpp)
library(ggrepel)


geom_text_s_favs <- function(...){
  geom_text_s(
    na.rm = TRUE, 
    box.padding = 0.5,
    point.padding = 0.3,
    arrow = arrow(length = unit(2, "mm"), type = "closed"), ...
  )
}

# Using non-repulsive geom
base_plot +
  aes_label_nas + 
  geom_text_s_favs(
    nudge_y = 4
  ) + 
  labs_tag_enumerate()

In Plot 6 Justification changes the attachment of the arrow


base_plot +
  aes_label_nas +
  geom_text_s_favs(
    aes(label = c("Chile" = "Chile", "United States" = "The US is here")[as.character(country)]),
    nudge_y = 4, nudge_x = 1000,
    hjust = "left"
  )  + 
  labs_tag_enumerate()

In Plot 7 we see that nudge can be a vector (it will be recycled if necessary)


base_plot +
  aes_label_nas + 
  geom_text_s_favs(
    nudge_y = c(-4, 4), nudge_x = 1000,
    hjust = "left",
  )  + 
  labs_tag_enumerate()

In Plot 8 nudging to a target position is performed


base_plot +
  aes_label_nas +
  geom_text_s_favs(
    position = position_nudge_to(y = 82),
    hjust = "left"
  )  + 
  labs_tag_enumerate()

in Plot 9, I label more countries than in the original example


some.countries <- c("Chile" = "Chile",
                    "Argentina" = "Argentina",
                    "Paraguay" = "Paraguay",
                    "Bolivia" = "Bolivia",
                    "Peru" = "Perú",
                    "Somalia" = "Somalia",
                    "United States" = "The US is here")

aes_label_some_countries <- aes(label = some.countries[as.character(country)])

In plot 9, which does not work well without repulsion we nudging away from the cloud of points

base_plot +
  geom_point(aes(alpha = country %in% names(some.countries)), show.legend = FALSE) ->
base_plot2

base_plot2 +
  aes_label_some_countries +
  geom_text_s_favs(
    position = position_nudge_line(y = 4, x = 1000, method = "spline"),
    hjust = "outward"
  )  + 
  labs_tag_enumerate()

In plot 10, nudging away from the cloud of points increasing nudging helps.


base_plot2 +
  aes_label_some_countries +
  geom_text_s_favs(
    position = position_nudge_line(y = 6, x =2500, method = "spline")
  )  + 
  labs_tag_enumerate()

In plot 11, Nudging away from the cloud of points, increasing nudging and setting hjust


base_plot2 +
  aes_label_some_countries +
  geom_text_s(
    position = position_nudge_line(y = 6, x =2500, method = "spline"),
    hjust = "outward"
  )  + 
  labs_tag_enumerate()
#> Warning: Using alpha for a discrete variable is not advised.

In plot 12, nudging away from the cloud of points, setting nudging direction to “split”


base_plot2 +
  aes_label_some_countries +
  geom_text_s(
    position = position_nudge_line(y = 4, x = 1000, method = "spline",
                                   direction = "split")
  ) + 
  labs_tag_enumerate()

In plot 13, Using ‘ggpp’ + ‘ggrepel’… I set max.overlaps = 30 so that ‘ggrepel’ does not give up too early


The examples above tend to work, but there is no guarantee of no overlaps, so combining them with repulsion is safer. The more dispersion there is in the data points the more necessary repulsion becomes.

Nudging away based on a spline fitted to the cloud of points in plot 13


base_plot2 +
  aes_label_some_countries +
  geom_text_repel_favs(
    position = position_nudge_line(y = 5, x = 3000, method = "spline"),
  ) + 
  labs_tag_enumerate()

In plot 14, Nudging away based on a spline fitted to the cloud of points with repulsion and hjust in plot 14

The effect of justification is different in the geoms from ‘ggrepel’ and ‘ggpp’!


aes_label_some_countries_empty_string <- aes(label = ifelse(country %in% names(some.countries),
                       some.countries[as.character(country)],
                       ""))

base_plot2 +
  aes_label_some_countries_empty_string  +
  geom_text_repel(
    position = position_nudge_line(y = 5, x = 3000, method = "spline"),
    hjust = "outward", # <- added
    na.rm = TRUE, 
    arrow = arrow(length = unit(2, "mm"), type = "closed"),
    max.overlaps = 30
  ) + 
  labs_tag_enumerate()

In plot 15, nudging away from the cloud of points with nudging with direction away from fitted spline in plot 15


base_plot2 +
  aes_label_some_countries_empty_string  +
  geom_text_repel(
    position = position_nudge_line(y = 5, x = 3000, method = "spline",
                                  direction = "split"), # <-- added
    hjust = "outward",
    na.rm = TRUE, 
    max.overlaps = 50,
    arrow = arrow(length = unit(2, "mm"), type = "closed")
  ) + 
  labs_tag_enumerate()

In plot 16, Nudging away from the cloud of points with repulsion, a different option for direction in plot 16


base_plot2 +
    aes_label_some_countries_empty_string  +
  geom_text_repel(
    position = position_nudge_line(y = 5, x = 3000, method = "spline",
                                   direction = "automatic"), # <-- added
    na.rm = TRUE, 
    max.overlaps = 50,
    arrow = arrow(length = unit(2, "mm"), type = "closed")
  ) + 
  labs_tag_enumerate()

In plot 17, Nudging away from the cloud of points with repulsion, adding a minimum distance from the fitted line for the nudging that is 1.5 times the nudging of observations in plot 17


base_plot2 +
    aes_label_some_countries_empty_string +
  geom_text_repel(
    position = position_nudge_line(y = 5, x = 3000, method = "spline",
                                   direction = "automatic",
                                   line_nudge = 1.5),
    na.rm = TRUE, 
    max.overlaps = 50,
    arrow = arrow(length = unit(2, "mm"), type = "closed")
  ) + 
  labs_tag_enumerate()

In plot 18, nudging away from the cloud of points with repulsion, adding a minimum adding a minimum distance from the fitted line that is 2 times the nudging for individual observations, while reducing the amount of nudging for observations.


base_plot2 +
  aes_label_some_countries_empty_string +
  geom_text_repel(
    position = position_nudge_line(y = 3, x = 2000, method = "spline",
                                   direction = "split",
                                   line_nudge = 2),
    na.rm = TRUE, 
    max.overlaps = 50,
    min.segment.length = 0,
    arrow = arrow(length = unit(2, "mm"), type = "closed")
  ) + 
  labs_tag_enumerate()