class: center, middle, inverse, title-slide

# A Minimal #TidyTuesday Flipbook
### Gina Reynolds, May 2019

---

---

# Introduction

This is a minimal example that demonstrates how to create a flipbook with data from #TidyTuesday. It walks through data wrangling and plotting pipelines made with the Tidyverse.

The functions that make this possible are the work of Emi Tanaka, Garrick Aden-Buie and myself, and are built for Xaringan, an R Markdown format for creating presentation slides; the functions make use of the function `knitr:::knit_code$get()`.

The code to create the flipbook is an .Rmd that you can download [**here**](https://raw.githubusercontent.com/EvaMaeRey/little_flipbooks_library/master/tidytuesday_minimal_example/tidytuesday_minimal_example.Rmd).

---

Interested in more flipbooks? Check out

- [the ggplot flipbook](https://evamaerey.github.io/ggplot_flipbook/ggplot_flipbook_xaringan.html)
- [The Tidyverse in Action](https://evamaerey.github.io/tidyverse_in_action/tidyverse_in_action.html)

For more about Xaringan:

- [Xaringan presentation slides](https://slides.yihui.name/xaringan/)

The sequential workflow of the Tidyverse makes it ideal for incremental display of pipelines and ggplot statements:

- [www.tidyverse.org](https://www.tidyverse.org/)

---

# What's the slow ggplot style?

"Slow ggplot" just means working more incrementally than is typical. Elements of the approach are as follows:

- pulling `aes()` out of the `ggplot()` function
- using fewer functions; for example, using `labs()` to add a title instead of `ggtitle()`
- using functions multiple times; for example, `aes(x = var1) + aes(y = var2)` rather than `aes(x = var1, y = var2)`
- using base R functions and tidyverse functions; for other packages, using the `::` style to call them
- writing out arguments (no shortcuts): `aes(x = gdppercap)`, not `aes(gdppercap)`
- ordering ggplot commands so that reactivity is obvious; scale adjustments to aesthetics might also be near the aesthetic declaration

---

Here, I contrast the usual plotting method with slow ggplotting:

Usual approach:

```r
ggplot(my_data, aes(var1, y = var2, col = var3)) +
  geom_point() +
  ggtitle("My Title") +
  labs(x = "the x label", y = "the y label", col = "legend title")
```

Using slow ggplotting:

```r
ggplot(data = my_data) +
  aes(x = var1) +
  labs(x = "the x label") +
  aes(y = var2) +
  labs(y = "the y label") +
  geom_point() +
  aes(col = var3) +
  labs(col = "legend title") +
  labs(title = "My Title")
```

---

# Set up

Okay. Let's load the `reveal for xaringan` functions for "flipbooking" and the `tidyverse`.

```r
source(file = "https://raw.githubusercontent.com/EvaMaeRey/little_flipbooks_library/master/xaringan_reveal_parentheses_balanced.R")
```

And load the tidyverse.

```r
library(tidyverse)
```

And load the data from the TidyTuesday GitHub page.

```r
nobel_winners <- readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-05-14/nobel_winners.csv")
```

---

# Where we are going:

We'll create this plot. I have **echo** set to FALSE in the code chunk options here so that you don't see the code, and **eval** set to TRUE so that the plot output is produced. The code chunk is given the name "nobel", and this name is used in the `apply_reveal()` function, which breaks the code down by wrangling and plot statements.

![](tidytuesday_minimal_example_files/figure-html/nobel-1.png)<!-- -->

---

# How do we get there?

In the next slide, we'll walk through the code that produces this plot, and the output along the way. We use the code `apply_reveal("nobel")` in-line to access the code from the code chunk called *nobel*.

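To make the idea of breaking a chunk down statement by statement more concrete, here is a rough sketch in base R. It is not the actual `apply_reveal()` implementation: `partial_builds()` is a hypothetical helper written only for illustration, and it assumes that every line of the chunk is one complete step (so it does not attempt the two-line `mutate()` call used in this flipbook).

```r
# Hypothetical helper, NOT the real apply_reveal(): given the lines of a
# pipeline, return one partial build per line, each ending cleanly.
partial_builds <- function(code_lines) {
  lapply(seq_along(code_lines), function(i) {
    shown <- code_lines[seq_len(i)]
    # Drop a dangling connector (%>%, + or ->) from the last shown line,
    # keeping any trailing comment so the partial code still reads well.
    shown[i] <- sub("\\s*(%>%|\\+|->)(\\s*#.*)?\\s*$", "\\2", shown[i])
    shown
  })
}

# Illustrative input: a three-step plot on the built-in `cars` data.
steps <- c(
  "ggplot(data = cars) +",
  "  aes(x = speed) + # x axis",
  "  geom_point()"
)

# Print each cumulative build, separated by a blank line.
for (build in partial_builds(steps)) {
  cat(build, sep = "\n")
  cat("\n")
}
```

Each element of the returned list could then be shown on its own slide next to the output it produces, which is the effect the flipbook slides that follow achieve.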
---
class: split-40
count: false

.column[.content[

```r
*nobel_winners
```

]]
.column[.content[

```
# A tibble: 969 x 18
   prize_year category prize motivation prize_share laureate_id
        <dbl> <chr>    <chr> <chr>      <chr>             <dbl>
 1       1901 Chemist… The … "\"in rec… 1/1                 160
 2       1901 Literat… The … "\"in spe… 1/1                 569
 3       1901 Medicine The … "\"for hi… 1/1                 293
 4       1901 Peace    The … <NA>       1/2                 462
 5       1901 Peace    The … <NA>       1/2                 463
 6       1901 Physics  The … "\"in rec… 1/1                   1
 7       1902 Chemist… The … "\"in rec… 1/1                 161
 8       1902 Literat… The … "\"the gr… 1/1                 571
 9       1902 Medicine The … "\"for hi… 1/1                 294
10       1902 Peace    The … <NA>       1/2                 464
# … with 959 more rows, and 12 more variables: laureate_type <chr>,
#   full_name <chr>, birth_date <date>, birth_city <chr>,
#   birth_country <chr>, gender <chr>, organization_name <chr>,
#   organization_city <chr>, organization_country <chr>,
#   death_date <date>, death_city <chr>, death_country <chr>
```

]]

---
class: split-40
count: false

.column[.content[

```r
nobel_winners %>%
* mutate(age_at_win = prize_year -
*          lubridate::year(birth_date))
```

]]
.column[.content[

```
# A tibble: 969 x 19
   prize_year category prize motivation prize_share laureate_id
        <dbl> <chr>    <chr> <chr>      <chr>             <dbl>
 1       1901 Chemist… The … "\"in rec… 1/1                 160
 2       1901 Literat… The … "\"in spe… 1/1                 569
 3       1901 Medicine The … "\"for hi… 1/1                 293
 4       1901 Peace    The … <NA>       1/2                 462
 5       1901 Peace    The … <NA>       1/2                 463
 6       1901 Physics  The … "\"in rec… 1/1                   1
 7       1902 Chemist… The … "\"in rec… 1/1                 161
 8       1902 Literat… The … "\"the gr… 1/1                 571
 9       1902 Medicine The … "\"for hi… 1/1                 294
10       1902 Peace    The … <NA>       1/2                 464
# … with 959 more rows, and 13 more variables: laureate_type <chr>,
#   full_name <chr>, birth_date <date>, birth_city <chr>,
#   birth_country <chr>, gender <chr>, organization_name <chr>,
#   organization_city <chr>, organization_country <chr>,
#   death_date <date>, death_city <chr>, death_country <chr>,
#   age_at_win <dbl>
```

]]

---
class: split-40
count: false

.column[.content[

```r
nobel_winners %>%
  mutate(age_at_win = prize_year -
           lubridate::year(birth_date)) %>%
*ggplot()
```

]]
.column[.content[

![](tidytuesday_minimal_example_files/figure-html/output_nobel_4-1.png)<!-- -->

]]

---
class: split-40
count: false

.column[.content[

```r
nobel_winners %>%
  mutate(age_at_win = prize_year -
           lubridate::year(birth_date)) %>%
ggplot() +
* aes(x = prize_year) # x axis
```

]]
.column[.content[

![](tidytuesday_minimal_example_files/figure-html/output_nobel_5-1.png)<!-- -->

]]

---
class: split-40
count: false

.column[.content[

```r
nobel_winners %>%
  mutate(age_at_win = prize_year -
           lubridate::year(birth_date)) %>%
ggplot() +
  aes(x = prize_year) + # x axis
* aes(y = age_at_win) # y axis
```

]]
.column[.content[

![](tidytuesday_minimal_example_files/figure-html/output_nobel_6-1.png)<!-- -->

]]

---
class: split-40
count: false

.column[.content[

```r
nobel_winners %>%
  mutate(age_at_win = prize_year -
           lubridate::year(birth_date)) %>%
ggplot() +
  aes(x = prize_year) + # x axis
  aes(y = age_at_win) + # y axis
* geom_point()
```

]]
.column[.content[

![](tidytuesday_minimal_example_files/figure-html/output_nobel_7-1.png)<!-- -->

]]

---
class: split-40
count: false

.column[.content[

```r
nobel_winners %>%
  mutate(age_at_win = prize_year -
           lubridate::year(birth_date)) %>%
ggplot() +
  aes(x = prize_year) + # x axis
  aes(y = age_at_win) + # y axis
  geom_point() +
* geom_smooth() # loess smoothing
```

]]
.column[.content[

![](tidytuesday_minimal_example_files/figure-html/output_nobel_8-1.png)<!-- -->

]]

---
class: split-40
count: false

.column[.content[

```r
nobel_winners %>%
  mutate(age_at_win = prize_year -
           lubridate::year(birth_date)) %>%
ggplot() +
  aes(x = prize_year) + # x axis
  aes(y = age_at_win) + # y axis
  geom_point() +
  geom_smooth() + # loess smoothing
* theme_minimal()
```

]]
.column[.content[

![](tidytuesday_minimal_example_files/figure-html/output_nobel_9-1.png)<!-- -->

]]

---
class: split-40
count: false

.column[.content[

```r
nobel_winners %>%
  mutate(age_at_win = prize_year -
           lubridate::year(birth_date)) %>%
ggplot() +
  aes(x = prize_year) + # x axis
  aes(y = age_at_win) + # y axis
  geom_point() +
  geom_smooth() + # loess smoothing
  theme_minimal() +
* labs(x = "Year of prize")
```

]]
.column[.content[

![](tidytuesday_minimal_example_files/figure-html/output_nobel_10-1.png)<!-- -->

]]

---
class: split-40
count: false

.column[.content[

```r
nobel_winners %>%
  mutate(age_at_win = prize_year -
           lubridate::year(birth_date)) %>%
ggplot() +
  aes(x = prize_year) + # x axis
  aes(y = age_at_win) + # y axis
  geom_point() +
  geom_smooth() + # loess smoothing
  theme_minimal() +
  labs(x = "Year of prize") +
* labs(y = "Age at Win")
```

]]
.column[.content[

![](tidytuesday_minimal_example_files/figure-html/output_nobel_11-1.png)<!-- -->

]]

---
class: split-40
count: false

.column[.content[

```r
nobel_winners %>%
  mutate(age_at_win = prize_year -
           lubridate::year(birth_date)) %>%
ggplot() +
  aes(x = prize_year) + # x axis
  aes(y = age_at_win) + # y axis
  geom_point() +
  geom_smooth() + # loess smoothing
  theme_minimal() +
  labs(x = "Year of prize") +
  labs(y = "Age at Win") +
* labs(caption = "Vis: Gina Reynolds for TidyTuesday")
```

]]
.column[.content[

![](tidytuesday_minimal_example_files/figure-html/output_nobel_12-1.png)<!-- -->

]]

---
class: split-40
count: false

.column[.content[

```r
nobel_winners %>%
  mutate(age_at_win = prize_year -
           lubridate::year(birth_date)) %>%
ggplot() +
  aes(x = prize_year) + # x axis
  aes(y = age_at_win) + # y axis
  geom_point() +
  geom_smooth() + # loess smoothing
  theme_minimal() +
  labs(x = "Year of prize") +
  labs(y = "Age at Win") +
  labs(caption = "Vis: Gina Reynolds for TidyTuesday") +
* labs(title = "Nobel Prize award year vs. age of winner")
```

]]
.column[.content[

![](tidytuesday_minimal_example_files/figure-html/output_nobel_13-1.png)<!-- -->

]]

---
class: split-40
count: false

.column[.content[

```r
nobel_winners %>%
  mutate(age_at_win = prize_year -
           lubridate::year(birth_date)) %>%
ggplot() +
  aes(x = prize_year) + # x axis
  aes(y = age_at_win) + # y axis
  geom_point() +
  geom_smooth() + # loess smoothing
  theme_minimal() +
  labs(x = "Year of prize") +
  labs(y = "Age at Win") +
  labs(caption = "Vis: Gina Reynolds for TidyTuesday") +
  labs(title = "Nobel Prize award year vs. age of winner") +
* labs(subtitle = "Data: \"A dataset of publication records for Nobel laureates\" \nLi, Jichao; Yin, Yian; Fortunato, Santo; Wang Dashun, 2018 ")
```

]]
.column[.content[

![](tidytuesday_minimal_example_files/figure-html/output_nobel_14-1.png)<!-- -->

]]

---

# A second approach: Separate the data manipulation from the plotting.

If you'd like, you can also save your manipulated data and then plot it: use `%>%` to build your pipeline, then use the reverse (rightward) assignment operator `->` to store the result. An example follows.

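Before the full walk-through, here is a minimal sketch of the pipe-then-assign pattern. It assumes `dplyr` is installed, and `toy_winners` is a made-up stand-in with hypothetical values (including a `birth_year` column that the real data does not have), not the actual `nobel_winners` data.

```r
library(dplyr)

# Toy stand-in for nobel_winners: hypothetical values for illustration only.
toy_winners <- tibble(
  prize_year = c(1950, 1975, 2000),
  birth_year = c(1900, 1930, 1940)
)

# Build the pipeline first, then store the result at the end with `->`.
toy_winners %>%
  mutate(age_at_win = prize_year - birth_year) ->
  toy_winners_w_age

# The saved object can now start a separate plotting pipeline.
toy_winners_w_age
```

Putting the assignment at the end keeps the code in reading order: the data comes first, then the transformation, then the name under which the result is stored.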
---

---
class: split-40
count: false

.column[.content[

```r
*nobel_winners
```

]]
.column[.content[

```
# A tibble: 969 x 18
   prize_year category prize motivation prize_share laureate_id
        <dbl> <chr>    <chr> <chr>      <chr>             <dbl>
 1       1901 Chemist… The … "\"in rec… 1/1                 160
 2       1901 Literat… The … "\"in spe… 1/1                 569
 3       1901 Medicine The … "\"for hi… 1/1                 293
 4       1901 Peace    The … <NA>       1/2                 462
 5       1901 Peace    The … <NA>       1/2                 463
 6       1901 Physics  The … "\"in rec… 1/1                   1
 7       1902 Chemist… The … "\"in rec… 1/1                 161
 8       1902 Literat… The … "\"the gr… 1/1                 571
 9       1902 Medicine The … "\"for hi… 1/1                 294
10       1902 Peace    The … <NA>       1/2                 464
# … with 959 more rows, and 12 more variables: laureate_type <chr>,
#   full_name <chr>, birth_date <date>, birth_city <chr>,
#   birth_country <chr>, gender <chr>, organization_name <chr>,
#   organization_city <chr>, organization_country <chr>,
#   death_date <date>, death_city <chr>, death_country <chr>
```

]]

---
class: split-40
count: false

.column[.content[

```r
nobel_winners %>%
* mutate(age_at_win = prize_year -
*          lubridate::year(birth_date))
```

]]
.column[.content[

```
# A tibble: 969 x 19
   prize_year category prize motivation prize_share laureate_id
        <dbl> <chr>    <chr> <chr>      <chr>             <dbl>
 1       1901 Chemist… The … "\"in rec… 1/1                 160
 2       1901 Literat… The … "\"in spe… 1/1                 569
 3       1901 Medicine The … "\"for hi… 1/1                 293
 4       1901 Peace    The … <NA>       1/2                 462
 5       1901 Peace    The … <NA>       1/2                 463
 6       1901 Physics  The … "\"in rec… 1/1                   1
 7       1902 Chemist… The … "\"in rec… 1/1                 161
 8       1902 Literat… The … "\"the gr… 1/1                 571
 9       1902 Medicine The … "\"for hi… 1/1                 294
10       1902 Peace    The … <NA>       1/2                 464
# … with 959 more rows, and 13 more variables: laureate_type <chr>,
#   full_name <chr>, birth_date <date>, birth_city <chr>,
#   birth_country <chr>, gender <chr>, organization_name <chr>,
#   organization_city <chr>, organization_country <chr>,
#   death_date <date>, death_city <chr>, death_country <chr>,
#   age_at_win <dbl>
```

]]

---
class: split-40
count: false

.column[.content[

```r
nobel_winners %>%
  mutate(age_at_win = prize_year -
           lubridate::year(birth_date)) ->
* nobel_winners_w_age
```

]]
.column[.content[

]]

---

# Plotting the transformed data (and *not* revealing your plot in advance -- *just* showing the build).

Now you can use the transformed data to start the plot. Note that you might also prefer not to show a preview of the finished plot in advance. I show an example of this below. I don't evaluate the code chunk (i.e. I've set eval to FALSE) and I don't echo it (i.e. echo is set to FALSE). This means that the code chunk itself won't yield any output (code or plot) to be put on a slide, so I don't need the dashes (\-\-\-) to separate the code chunk from the `apply_reveal()` statement. This differs from the previous set-ups, where a slide separator directly preceded the `apply_reveal()` call.

---
class: split-40
count: false

.column[.content[

```r
*ggplot(data = nobel_winners_w_age)
```

]]
.column[.content[

![](tidytuesday_minimal_example_files/figure-html/output_plot_nobel_again_1-1.png)<!-- -->

]]

---
class: split-40
count: false

.column[.content[

```r
ggplot(data = nobel_winners_w_age) +
* aes(x = prize_year) # x axis
```

]]
.column[.content[

![](tidytuesday_minimal_example_files/figure-html/output_plot_nobel_again_2-1.png)<!-- -->

]]

---
class: split-40
count: false

.column[.content[

```r
ggplot(data = nobel_winners_w_age) +
  aes(x = prize_year) + # x axis
* aes(y = age_at_win) # y axis
```

]]
.column[.content[

![](tidytuesday_minimal_example_files/figure-html/output_plot_nobel_again_3-1.png)<!-- -->

]]

---
class: split-40
count: false

.column[.content[

```r
ggplot(data = nobel_winners_w_age) +
  aes(x = prize_year) + # x axis
  aes(y = age_at_win) + # y axis
* geom_point()
```

]]
.column[.content[

![](tidytuesday_minimal_example_files/figure-html/output_plot_nobel_again_4-1.png)<!-- -->

]]

---
class: split-40
count: false

.column[.content[

```r
ggplot(data = nobel_winners_w_age) +
  aes(x = prize_year) + # x axis
  aes(y = age_at_win) + # y axis
  geom_point() +
* geom_smooth() # loess smoothing
```

]]
.column[.content[

![](tidytuesday_minimal_example_files/figure-html/output_plot_nobel_again_5-1.png)<!-- -->

]]

---
class: split-40
count: false

.column[.content[

```r
ggplot(data = nobel_winners_w_age) +
  aes(x = prize_year) + # x axis
  aes(y = age_at_win) + # y axis
  geom_point() +
  geom_smooth() + # loess smoothing
* theme_minimal()
```

]]
.column[.content[

![](tidytuesday_minimal_example_files/figure-html/output_plot_nobel_again_6-1.png)<!-- -->

]]

---
class: split-40
count: false

.column[.content[

```r
ggplot(data = nobel_winners_w_age) +
  aes(x = prize_year) + # x axis
  aes(y = age_at_win) + # y axis
  geom_point() +
  geom_smooth() + # loess smoothing
  theme_minimal() +
* labs(x = "Year of prize")
```

]]
.column[.content[

![](tidytuesday_minimal_example_files/figure-html/output_plot_nobel_again_7-1.png)<!-- -->

]]

---
class: split-40
count: false

.column[.content[

```r
ggplot(data = nobel_winners_w_age) +
  aes(x = prize_year) + # x axis
  aes(y = age_at_win) + # y axis
  geom_point() +
  geom_smooth() + # loess smoothing
  theme_minimal() +
  labs(x = "Year of prize") +
* labs(y = "Age at Win")
```

]]
.column[.content[

![](tidytuesday_minimal_example_files/figure-html/output_plot_nobel_again_8-1.png)<!-- -->

]]

---
class: split-40
count: false

.column[.content[

```r
ggplot(data = nobel_winners_w_age) +
  aes(x = prize_year) + # x axis
  aes(y = age_at_win) + # y axis
  geom_point() +
  geom_smooth() + # loess smoothing
  theme_minimal() +
  labs(x = "Year of prize") +
  labs(y = "Age at Win") +
* labs(x = "Year of prize")
```

]]
.column[.content[

![](tidytuesday_minimal_example_files/figure-html/output_plot_nobel_again_9-1.png)<!-- -->

]]

---
class: split-40
count: false

.column[.content[

```r
ggplot(data = nobel_winners_w_age) +
  aes(x = prize_year) + # x axis
  aes(y = age_at_win) + # y axis
  geom_point() +
  geom_smooth() + # loess smoothing
  theme_minimal() +
  labs(x = "Year of prize") +
  labs(y = "Age at Win") +
  labs(x = "Year of prize") +
* labs(y = "Age at Win")
```

]]
.column[.content[

![](tidytuesday_minimal_example_files/figure-html/output_plot_nobel_again_10-1.png)<!-- -->

]]

---
class: split-40
count: false

.column[.content[

```r
ggplot(data = nobel_winners_w_age) +
  aes(x = prize_year) + # x axis
  aes(y = age_at_win) + # y axis
  geom_point() +
  geom_smooth() + # loess smoothing
  theme_minimal() +
  labs(x = "Year of prize") +
  labs(y = "Age at Win") +
  labs(x = "Year of prize") +
  labs(y = "Age at Win") +
* labs(caption = "Vis: Gina Reynolds for TidyTuesday")
```

]]
.column[.content[

![](tidytuesday_minimal_example_files/figure-html/output_plot_nobel_again_11-1.png)<!-- -->

]]

---
class: split-40
count: false

.column[.content[

```r
ggplot(data = nobel_winners_w_age) +
  aes(x = prize_year) + # x axis
  aes(y = age_at_win) + # y axis
  geom_point() +
  geom_smooth() + # loess smoothing
  theme_minimal() +
  labs(x = "Year of prize") +
  labs(y = "Age at Win") +
  labs(x = "Year of prize") +
  labs(y = "Age at Win") +
  labs(caption = "Vis: Gina Reynolds for TidyTuesday") +
* labs(title = "Nobel Prize award year vs. age of winner")
```

]]
.column[.content[

![](tidytuesday_minimal_example_files/figure-html/output_plot_nobel_again_12-1.png)<!-- -->

]]

---
class: split-40
count: false

.column[.content[

```r
ggplot(data = nobel_winners_w_age) +
  aes(x = prize_year) + # x axis
  aes(y = age_at_win) + # y axis
  geom_point() +
  geom_smooth() + # loess smoothing
  theme_minimal() +
  labs(x = "Year of prize") +
  labs(y = "Age at Win") +
  labs(x = "Year of prize") +
  labs(y = "Age at Win") +
  labs(caption = "Vis: Gina Reynolds for TidyTuesday") +
  labs(title = "Nobel Prize award year vs. age of winner") +
* labs(subtitle = "Data: \"A dataset of publication records for Nobel laureates\" \nLi, Jichao; Yin, Yian; Fortunato, Santo; Wang Dashun, 2018 ")
```

]]
.column[.content[

![](tidytuesday_minimal_example_files/figure-html/output_plot_nobel_again_13-1.png)<!-- -->

]]

<style type="text/css">
.remark-code{line-height: 1.5; font-size: 80%}
</style>