class: center, middle, inverse, title-slide

# A ggplot2 grammar guide
### Gina Reynolds, July 2019

---

A data visualization:

- is composed of geometric shapes

--

- that take on aesthetics

--

- that represent variables

--

- from a data set

---

We can say this another way -- a more classic definition too:

A statistical graphic

--

maps variables of a dataset

--

to aesthetic properties

--

of geometric objects

---

# The ggplot2 grammar follows this second order

Let's see a simple example of this. (First watch the Hans Rosling clip - )

---
class: split-40
count: false
.column[.content[
```r
# specify the data for the plot
*ggplot(data = gapminder_2002)
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/overview_auto_1_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
# specify the data for the plot
ggplot(data = gapminder_2002) +
  # the x position will represent
  # the gdp per capita variable
* aes(x = gdpPercap) # x position
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/overview_auto_2_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
# specify the data for the plot
ggplot(data = gapminder_2002) +
  # the x position will represent
  # the gdp per capita variable
  aes(x = gdpPercap) + # x position
  # the y position will represent
  # the lifeExp variable
* aes(y = lifeExp)
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/overview_auto_3_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
# specify the data for the plot
ggplot(data = gapminder_2002) +
  # the x position will represent
  # the gdp per capita variable
  aes(x = gdpPercap) + # x position
  # the y position will represent
  # the lifeExp variable
  aes(y = lifeExp) +
  # the point geometric shape
  # takes on the positions according
  # to the specified mapping
  # i.e. the representation
* geom_point()
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/overview_auto_4_output-1.png" width="100%" />
]]

---

Let's look at each of these steps one at a time.

---

# The declarative mood: declaring data

Data is a "first class citizen" in the 'Grammar of Graphics' and `tidyverse` framework. Therefore, the first step to creating a plot, not surprisingly, is declaring data. Let's do it.

---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) # That's it
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/declare_data_1_1_output-1.png" width="100%" />
]]

---

# Pipe data into ggplot()

An alternative to the above syntax is to "pipe" the data into the ggplot function, as shown below. The pipe operator, which is made available in R by loading the `tidyverse` package, pulls the object that precedes it into the function. This second option allows us to glimpse the raw data that we will use in our plot.

```r
# Option 2
gapminder_2002 %>% # data piped into
  ggplot() # ggplot function initiating plot
```

Let's see how this works. Then we'll move on to the second step in this grammar lesson --- the interrogative mood --- aesthetic representation.

---
class: split-40
count: false
.column[.content[
```r
# Option 2
*gapminder_2002 # data piped into
```
]]
.column[.content[
```
# A tibble: 142 x 6
   country     continent  year lifeExp       pop gdpPercap
   <fct>       <fct>     <int>   <dbl>     <int>     <dbl>
 1 Afghanistan Asia       2002    42.1  25268405      727.
 2 Albania     Europe     2002    75.7   3508512     4604.
 3 Algeria     Africa     2002    71.0  31287142     5288.
 4 Angola      Africa     2002    41.0  10866106     2773.
 5 Argentina   Americas   2002    74.3  38331121     8798.
 6 Australia   Oceania    2002    80.4  19546792    30688.
 7 Austria     Europe     2002    79.0   8148312    32418.
 8 Bahrain     Asia       2002    74.8    656397    23404.
 9 Bangladesh  Asia       2002    62.0 135656790     1136.
10 Belgium     Europe     2002    78.3  10311970    30486.
# … with 132 more rows
```
]]

---
class: split-40
count: false
.column[.content[
```r
# Option 2
gapminder_2002 %>% # data piped into
* ggplot() # ggplot function initiating plot
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/declare_data_option_2_auto_2_output-1.png" width="100%" />
]]

---
name: aes

# The interrogative mood: aesthetic mapping

<!-- > aes(color = temperature) translates to "Please represent 'temperature' for me, color aesthetic!" -->

Aesthetics --- like position, color, and size --- represent variables, and they do a lot of work for us in data visualization. Distributions of values --- which can be painstaking to digest in their "raw" form in a table of numbers or categories --- are easy to communicate when represented by position, color, size and so on.

The term 'mapping' in 'aesthetic mapping' refers to the fact that variables are 'mapped' to aesthetics -- i.e. they are represented by aesthetics.

---

# A main pool of aesthetics

Let's think about the various aesthetics that can be used to communicate distributions.

---

# A main pool of aesthetics

Here's a key set of aesthetics that can be used to communicate about distribution among categories or the spread in a variable.

<img src="https://serialmentor.com/dataviz/aesthetic_mapping_files/figure-html/common-aesthetics-1.png" width="80%" />

Note: This figure is from Wilke's [Fundamentals of Data Visualization](https://serialmentor.com/dataviz/aesthetic-mapping.html#aesthetics-and-types-of-data).

<!-- Consumers of data visualization understand the importance of aesthetic mapping intuitively. But some may feel overwhelmed by the terminology. To overcome this in ggplot, we might personify things a bit and pronounce `aes()` --- the function used for specifying aesthetic mapping in ggplot2 --- as "ask". 
-->

<!-- "Mapping" in the phrase "aesthetic mapping" can be understood as representation, and aesthetics are that general category of things we visually differentiate: color, position, size, shape, etc. -->

<!-- So the statement `aes(x = gdpPercap)`, can be translated to 'asking the x position to represent the variable gdpPercap.' The statement `aes(color = continent)` can be understood as 'asking the color aesthetic to represent the variable continent.' -->

<!-- --- -->

<!-- People don't always "get" ggplot2 right away. -->

<!-- One of the hurdles is aes() statements -- the aesthetic mapping statements. -->

<!-- I try to say "aesthetic mapping" when talking about aes() w/ newcomers, saying "'mapping' as in representation." Then translate into plainer language, "What variable are we asking the aesthetic (color, x-position, shape, etc.) to represent?" aes() is "asking" -- *asking* very nicely for a specific aesthetic to do us a favor. So -->

<!-- --- -->

<!-- ## Whom to "ask"? -->

<!-- It is probably a good idea to start by speaking in general terms about the variety of aesthetics that might represent values at once. You can also talk about the appropriateness of the various aesthetics for representing different value types - like numerical, categorical or ordinal data. -->

---

# Think "ask" when you see aes()

As discussed above, aes() refers to "aesthetic mapping", where "mapping" is meant in terms of representation. What's inside aes() addresses the question, "What variable are we asking the aesthetic (color, x-position, shape, etc.) to represent?"

aes() is "asking" politely for a specific aesthetic to do us the favor of representation. For example, `aes(color = age)` can be translated to English as, "Please, color aesthetic, represent the variable `age` for me."

It is aesthetic mapping, too, that helps us "interrogate" our data; mapped aesthetics quickly communicate distributions within variables that we may be curious about.
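
Here is a minimal sketch of what such an "ask" looks like as an R object, assuming only that `ggplot2` is installed. The object names `ask_color` and `ask_position` are made up for illustration; `continent`, `gdpPercap` and `lifeExp` are the gapminder columns used throughout these slides.

```r
library(ggplot2)

# aes() only records the request; no data is involved yet
ask_color    <- aes(color = continent)
ask_position <- aes(x = gdpPercap, y = lifeExp)

# each entry is named for the aesthetic being asked;
# ggplot2 stores "color" under its British spelling, "colour"
names(ask_color)
names(ask_position)
```

The request stays pending until a plot with data and a geom picks it up, which is why aes() can be written as a statement of its own.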
So it might be helpful to think "ask" when you see aes().

---

# Requesting aesthetic representation

Let's see how we request an aesthetic to represent a variable from our dataset using ggplot. We'll first look at the x position (horizontal) and y position (vertical).

---
class: split-40
count: false
.column[.content[
```r
*gapminder_2002
```
]]
.column[.content[
```
# A tibble: 142 x 6
   country     continent  year lifeExp       pop gdpPercap
   <fct>       <fct>     <int>   <dbl>     <int>     <dbl>
 1 Afghanistan Asia       2002    42.1  25268405      727.
 2 Albania     Europe     2002    75.7   3508512     4604.
 3 Algeria     Africa     2002    71.0  31287142     5288.
 4 Angola      Africa     2002    41.0  10866106     2773.
 5 Argentina   Americas   2002    74.3  38331121     8798.
 6 Australia   Oceania    2002    80.4  19546792    30688.
 7 Austria     Europe     2002    79.0   8148312    32418.
 8 Bahrain     Asia       2002    74.8    656397    23404.
 9 Bangladesh  Asia       2002    62.0 135656790     1136.
10 Belgium     Europe     2002    78.3  10311970    30486.
# … with 132 more rows
```
]]

---
class: split-40
count: false
.column[.content[
```r
gapminder_2002 %>%
* ggplot()
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/x_and_y_auto_2_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
gapminder_2002 %>%
  ggplot() +
  # x position represents gdpPercap variable
* aes(x = gdpPercap)
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/x_and_y_auto_3_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
gapminder_2002 %>%
  ggplot() +
  # x position represents gdpPercap variable
  aes(x = gdpPercap) +
  # y position represents life expectancy
* aes(y = lifeExp)
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/x_and_y_auto_4_output-1.png" width="100%" />
]]

---

# A complete sentence: data + aes + geom

The aes() statement says that we want the variable `gdpPercap` represented by the x position, and the variable `lifeExp` represented by the y position.
The x and y scales give us a clue that this info is registered. But we don't really have any insight about our data yet. Why? Because aesthetics are taken on by geometric objects. We still need to declare what geometric object will take on the aesthetics.

Let's see how, when we add the geometric object "point", the x and y positions for each row of data are taken on by that object.

---
class: split-40
count: false
.column[.content[
```r
*gapminder_2002
```
]]
.column[.content[
```
# A tibble: 142 x 6
   country     continent  year lifeExp       pop gdpPercap
   <fct>       <fct>     <int>   <dbl>     <int>     <dbl>
 1 Afghanistan Asia       2002    42.1  25268405      727.
 2 Albania     Europe     2002    75.7   3508512     4604.
 3 Algeria     Africa     2002    71.0  31287142     5288.
 4 Angola      Africa     2002    41.0  10866106     2773.
 5 Argentina   Americas   2002    74.3  38331121     8798.
 6 Australia   Oceania    2002    80.4  19546792    30688.
 7 Austria     Europe     2002    79.0   8148312    32418.
 8 Bahrain     Asia       2002    74.8    656397    23404.
 9 Bangladesh  Asia       2002    62.0 135656790     1136.
10 Belgium     Europe     2002    78.3  10311970    30486.
# … with 132 more rows
```
]]

---
class: split-40
count: false
.column[.content[
```r
gapminder_2002 %>%
* ggplot()
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/geom_for_aes_auto_2_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
gapminder_2002 %>%
  ggplot() +
* aes(x = gdpPercap) # x position
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/geom_for_aes_auto_3_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
gapminder_2002 %>%
  ggplot() +
  aes(x = gdpPercap) + # x position
* aes(y = lifeExp) # y position
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/geom_for_aes_auto_4_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
gapminder_2002 %>%
  ggplot() +
  aes(x = gdpPercap) + # x position
  aes(y = lifeExp) + # y position
  # points take on x and y position
* geom_point()
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/geom_for_aes_auto_5_output-1.png" width="100%" />
]]

---

## Exploring more aesthetic mappings

Let's not get distracted by geoms (we'll come back to these). Aesthetic mapping does so much work for us in data viz. So, let's explore a bunch of the *aesthetics* that might represent *variables* in our data.

---

So far, we have stated the x and y positions - these are *required* aesthetics for the "point" geometric object. But more are *optional.*

In the next example, we'll do all of the required aesthetic mapping (x and y position) and then also use other allowable aesthetics for `geom_point()` (color, shape, size, alpha).

We'll also see that double or triple "mapping" is allowed -- multiple aesthetics may represent the same variable.
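
One way to check which aesthetics are required and which are merely allowed is to ask the geom itself. The sketch below assumes only that `ggplot2` is installed; `gapminder_sample` is a hypothetical five-row stand-in for `gapminder_2002`, typed in from the (rounded) values printed on the earlier slides, and `double_mapped` is a made-up name.

```r
library(ggplot2)

# aesthetics that geom_point() cannot draw without
GeomPoint$required_aes

# every aesthetic the point geom understands, required or optional
GeomPoint$aesthetics()

# hypothetical stand-in for gapminder_2002 (rounded values from the slides)
gapminder_sample <- data.frame(
  country   = c("Afghanistan", "Albania", "Algeria", "Argentina", "Australia"),
  continent = c("Asia", "Europe", "Africa", "Americas", "Oceania"),
  lifeExp   = c(42.1, 75.7, 71.0, 74.3, 80.4),
  pop       = c(25268405, 3508512, 31287142, 38331121, 19546792),
  gdpPercap = c(727, 4604, 5288, 8798, 30688)
)

# double mapping: color and shape both represent continent
double_mapped <- ggplot(data = gapminder_sample) +
  aes(x = gdpPercap) +
  aes(y = lifeExp) +
  geom_point() +
  aes(color = continent) +
  aes(shape = continent)
```

Printing `double_mapped` draws the plot; with real data you would swap `gapminder_sample` for `gapminder_2002`.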
<!-- Note to teachers: Depending on your data set you might run out of unique variables to map to aesthetics, but mapping variables to multiple aesthetics can expose students to more aesthetic mappings - even though this might not be desirable in actual work product. And go ahead and do a lot of double mapping of the same variables in the learning phase. -->

---
class: split-40
count: false
.column[.content[
```r
*ggplot(data = gapminder_2002)
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/aesthetics_point_auto_1_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) +
* aes(x = gdpPercap) # x position
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/aesthetics_point_auto_2_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) +
  aes(x = gdpPercap) + # x position
* aes(y = lifeExp) # y position
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/aesthetics_point_auto_3_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) +
  aes(x = gdpPercap) + # x position
  aes(y = lifeExp) + # y position
* geom_point() # above aes are required for point
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/aesthetics_point_auto_4_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) +
  aes(x = gdpPercap) + # x position
  aes(y = lifeExp) + # y position
  geom_point() + # above aes are required for point
* aes(color = continent) # color
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/aesthetics_point_auto_5_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) +
  aes(x = gdpPercap) + # x position
  aes(y = lifeExp) + # y position
  geom_point() + # above aes are required for point
  aes(color = continent) + # color
* aes(shape = continent) # shape
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/aesthetics_point_auto_6_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) +
  aes(x = gdpPercap) + # x position
  aes(y = lifeExp) + # y position
  geom_point() + # above aes are required for point
  aes(color = continent) + # color
  aes(shape = continent) + # shape
* aes(size = pop) # size
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/aesthetics_point_auto_7_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) +
  aes(x = gdpPercap) + # x position
  aes(y = lifeExp) + # y position
  geom_point() + # above aes are required for point
  aes(color = continent) + # color
  aes(shape = continent) + # shape
  aes(size = pop) + # size
* aes(alpha = lifeExp) # transparency
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/aesthetics_point_auto_8_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) +
  aes(x = gdpPercap) + # x position
  aes(y = lifeExp) + # y position
  geom_point() + # above aes are required for point
  aes(color = continent) + # color
  aes(shape = continent) + # shape
  aes(size = pop) + # size
  aes(alpha = lifeExp) + # transparency
* aes(color = lifeExp) # overwriting color's representation
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/aesthetics_point_auto_9_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) +
  aes(x = gdpPercap) + # x position
  aes(y = lifeExp) + # y position
  geom_point() + # above aes are required for point
  aes(color = continent)
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/aesthetics_point1_rotate_1_output-1.png"
width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) +
  aes(x = gdpPercap) + # x position
  aes(y = lifeExp) + # y position
  geom_point() + # above aes are required for point
* aes(shape = continent)
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/aesthetics_point1_rotate_2_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) +
  aes(x = gdpPercap) + # x position
  aes(y = lifeExp) + # y position
  geom_point() + # above aes are required for point
* aes(size = pop)
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/aesthetics_point1_rotate_3_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) +
  aes(x = gdpPercap) + # x position
  aes(y = lifeExp) + # y position
  geom_point() + # above aes are required for point
* aes(alpha = lifeExp)
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/aesthetics_point1_rotate_4_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) +
  aes(x = gdpPercap) + # x position
  aes(y = lifeExp) + # y position
  geom_point() + # above aes are required for point
* aes(color = lifeExp)
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/aesthetics_point1_rotate_5_output-1.png" width="100%" />
]]

---

# A few more aesthetics

Other aesthetics can be explored by changing the geometric object to one that allows additional aesthetics, like *fill* and *line type*; one such "geom" is geom_col(), which creates a column geometry.

Based on the plot that follows, how do the fill and color aesthetics differ?
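
The frames that follow use a table called `continent_aggregate`, which these slides do not define. A minimal sketch, assuming it simply holds the per-continent country counts printed in the next frame (they add up to the 142 rows of `gapminder_2002`); the object name `columns_plot` is made up for illustration.

```r
library(ggplot2)

# assumed contents of continent_aggregate, typed in from the printed tibble
continent_aggregate <- data.frame(
  continent     = factor(c("Africa", "Americas", "Asia", "Europe", "Oceania")),
  country_count = c(52L, 25L, 33L, 30L, 2L)
)

# the counts cover every row of gapminder_2002
sum(continent_aggregate$country_count)

# columns take on x and y, then color and fill are both asked
# to represent continent
columns_plot <- ggplot(data = continent_aggregate) +
  aes(x = continent) +
  aes(y = country_count) +
  geom_col() +
  aes(color = continent) +
  aes(fill = continent)
```

Printing `columns_plot` draws the chart, so you can compare it against the flipbook frames while answering the question above.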
[Or I'm on the overview track, take me to the next session](#scales)

---
class: split-40
count: false
.column[.content[
```r
*continent_aggregate
```
]]
.column[.content[
```
# A tibble: 5 x 2
  continent country_count
  <fct>             <int>
1 Africa               52
2 Americas             25
3 Asia                 33
4 Europe               30
5 Oceania               2
```
]]

---
class: split-40
count: false
.column[.content[
```r
continent_aggregate %>%
* ggplot()
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/aesthetics_area_auto_2_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
continent_aggregate %>%
  ggplot() +
* aes(x = continent)
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/aesthetics_area_auto_3_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
continent_aggregate %>%
  ggplot() +
  aes(x = continent) +
* aes(y = country_count)
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/aesthetics_area_auto_4_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
continent_aggregate %>%
  ggplot() +
  aes(x = continent) +
  aes(y = country_count) +
* geom_col()
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/aesthetics_area_auto_5_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
continent_aggregate %>%
  ggplot() +
  aes(x = continent) +
  aes(y = country_count) +
  geom_col() +
* aes(color = continent)
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/aesthetics_area_auto_6_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
continent_aggregate %>%
  ggplot() +
  aes(x = continent) +
  aes(y = country_count) +
  geom_col() +
  aes(color = continent) +
* aes(fill = continent) # fill color for areas
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/aesthetics_area_auto_7_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
continent_aggregate %>%
  ggplot() +
  aes(x = continent) +
  aes(y = country_count) +
  geom_col() +
  aes(color = continent) +
  aes(fill = continent) + # fill color for areas
* aes(alpha = country_count)
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/aesthetics_area_auto_8_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
continent_aggregate %>%
  ggplot() +
  aes(x = continent) +
  aes(y = country_count) +
  geom_col() +
  aes(color = continent) +
  aes(fill = continent) + # fill color for areas
  aes(alpha = country_count) +
* aes(linetype = continent)
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/aesthetics_area_auto_9_output-1.png" width="100%" />
]]

---

## Questions:

- What are *required aesthetics* vs. optional?
- What are the names of seven different aesthetics that we used above?
- What is the difference between the *fill* and *color* aesthetic?
- Look at the *help* for geom_text. What are the *required* aesthetics?

---
name: geoms

# Nouns: Geometric objects

So far, we have seen a couple of geometric objects: geom_point() and geom_col(). We won't cover many more here. Why? The celebrated language teacher, Michel Thomas, argued that building up vocabulary should not be a major focus in becoming fluent in a foreign language. I'm of the same mind.

Geometric objects --- our "nouns" --- are abundant. But we can use rather few of them and still work towards a profound working understanding of the ggplot2 grammar. With that work done, exchanging one geom for another, like a language learner exchanging "dog" for "canine", is a rather easy business. But there are some grammatical things that we'll come back to, as well.

We'll see a few more geometric objects along the way as we move forward. Then we'll come back to this topic squarely later on.
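
To make the "dog" for "canine" exchange concrete, here is a minimal sketch, assuming only that `ggplot2` is installed. `gapminder_sample` is a hypothetical five-row stand-in for `gapminder_2002` (rounded values from the printed tibble), and `base_plot`, `with_points` and `with_line` are made-up names.

```r
library(ggplot2)

# hypothetical stand-in for gapminder_2002 (rounded values from the slides)
gapminder_sample <- data.frame(
  country   = c("Afghanistan", "Albania", "Algeria", "Argentina", "Australia"),
  continent = c("Asia", "Europe", "Africa", "Americas", "Oceania"),
  lifeExp   = c(42.1, 75.7, 71.0, 74.3, 80.4),
  pop       = c(25268405, 3508512, 31287142, 38331121, 19546792),
  gdpPercap = c(727, 4604, 5288, 8798, 30688)
)

# data and aesthetic mapping are declared once ...
base_plot <- ggplot(data = gapminder_sample) +
  aes(x = gdpPercap) +
  aes(y = lifeExp)

# ... and the noun is exchanged without touching the rest of the sentence
with_points <- base_plot + geom_point()
with_line   <- base_plot + geom_line()
```

Only the last word of each "sentence" changes; the data declaration and the asks stay the same.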
---
name: local

# The conditional mood: geom specific data, aesthetic mapping, and unmapped aesthetics

The conditional mood is about "if". "If I see flowers, then I pick them."

In ggplot, we also observe such conditionality. Data and aesthetic mappings may be tied to *specific* geometric objects. Aesthetics that don't do any representation (unmapped aesthetics) may also be specified on a geom-by-geom basis. These specifications are "local" to a geom, rather than globally defined like the data declarations and aesthetic mapping statements we have seen before.

---

## Part i. Going *local* with data

Let's look at tying some data to a specific geometric object in the next example. In this example you'll also see how a 'variable' can be created *on the fly*.

---
class: split-40
count: false
.column[.content[
```r
*ggplot(data = gapminder_2002) # global data
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/local_data_auto_1_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) + # global data
* aes(x = gdpPercap)
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/local_data_auto_2_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) + # global data
  aes(x = gdpPercap) +
* aes(y = lifeExp)
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/local_data_auto_3_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) + # global data
  aes(x = gdpPercap) +
  aes(y = lifeExp) +
* geom_point()
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/local_data_auto_4_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) + # global data
  aes(x = gdpPercap) +
  aes(y = lifeExp) +
  geom_point() +
* aes(color = continent)
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/local_data_auto_5_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) + # global data
  aes(x = gdpPercap) +
  aes(y = lifeExp) +
  geom_point() +
  aes(color = continent) +
  # xend and yend are required for geom_segment
  # like creating a column with a single value, 0
* aes(xend = 0)
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/local_data_auto_6_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) + # global data
  aes(x = gdpPercap) +
  aes(y = lifeExp) +
  geom_point() +
  aes(color = continent) +
  # xend and yend are required for geom_segment
  # like creating a column with a single value, 0
  aes(xend = 0) +
  # like creating a column with a single value, 0
* aes(yend = 0)
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/local_data_auto_7_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) + # global data
  aes(x = gdpPercap) +
  aes(y = lifeExp) +
  geom_point() +
  aes(color = continent) +
  # xend and yend are required for geom_segment
  # like creating a column with a single value, 0
  aes(xend = 0) +
  # like creating a column with a single value, 0
  aes(yend = 0) +
  # geom specific data
* geom_segment(
*   data = gapminder_2002_europe,
*   aes(size = gdpPercap,
*       alpha = pop),
*   color = "orange"
* )
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/local_data_auto_8_output-1.png" width="100%" />
]]

---

## Part ii. geom specific aesthetic representation

So far we have seen aesthetic "mapping" (representation) applied globally --- where aes() is an independent statement. In this case aesthetic representation is applied to all geoms.
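
A minimal sketch of the global case, assuming only that `ggplot2` is installed. `gapminder_sample` is a hypothetical five-row stand-in for `gapminder_2002` (rounded values from the printed tibble), `global_plot` is a made-up name, and geom_rug() is used here only as a convenient second layer.

```r
library(ggplot2)

# hypothetical stand-in for gapminder_2002 (rounded values from the slides)
gapminder_sample <- data.frame(
  country   = c("Afghanistan", "Albania", "Algeria", "Argentina", "Australia"),
  continent = c("Asia", "Europe", "Africa", "Americas", "Oceania"),
  lifeExp   = c(42.1, 75.7, 71.0, 74.3, 80.4),
  pop       = c(25268405, 3508512, 31287142, 38331121, 19546792),
  gdpPercap = c(727, 4604, 5288, 8798, 30688)
)

global_plot <- ggplot(data = gapminder_sample) +
  aes(x = gdpPercap) +
  aes(y = lifeExp) +
  aes(color = continent) + # global: an independent statement
  geom_point() +
  geom_rug()

# the plot-level mapping holds all three asks
names(global_plot$mapping)

# neither layer declares a mapping of its own; both inherit the global one
vapply(global_plot$layers, function(layer) layer$inherit.aes, logical(1))
```

Both the points and the rug marks end up colored by continent, even though neither geom mentions color.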
However, we can be specific about which geoms should take on the aesthetic representation, if we use aes() within the geom_*() statement. This local aesthetic mapping declaration will overwrite any global mapping of the same aesthetic for that geom. An example is shown below.

---
class: split-40
count: false
.column[.content[
```r
*ggplot(data = gapminder_2002_europe)
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/geom_specific_aes_mapping_auto_1_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002_europe) +
  # global aesthetics
* aes(x = gdpPercap)
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/geom_specific_aes_mapping_auto_2_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002_europe) +
  # global aesthetics
  aes(x = gdpPercap) +
* aes(y = lifeExp)
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/geom_specific_aes_mapping_auto_3_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002_europe) +
  # global aesthetics
  aes(x = gdpPercap) +
  aes(y = lifeExp) +
* aes(xend = 0) # required aes for segment
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/geom_specific_aes_mapping_auto_4_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002_europe) +
  # global aesthetics
  aes(x = gdpPercap) +
  aes(y = lifeExp) +
  aes(xend = 0) + # required aes for segment
* aes(yend = 0) # required aes for segment
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/geom_specific_aes_mapping_auto_5_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002_europe) +
  # global aesthetics
  aes(x = gdpPercap) +
  aes(y = lifeExp) +
  aes(xend = 0) + # required aes for segment
  aes(yend = 0) + # required aes for segment
* geom_segment()
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/geom_specific_aes_mapping_auto_6_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002_europe) +
  # global aesthetics
  aes(x = gdpPercap) +
  aes(y = lifeExp) +
  aes(xend = 0) + # required aes for segment
  aes(yend = 0) + # required aes for segment
  geom_segment() +
  # geom specific aesthetics
* geom_point(aes(color = gdpPercap,
*                size = pop))
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/geom_specific_aes_mapping_auto_7_output-1.png" width="100%" />
]]

---

## Part iii. Being a dictator -- unmapped aesthetics (Imperative mood)

*Mapped* aesthetics contrast with unmapped, across-the-board, aesthetics for a geometric object. geom_point(color = "blue") is an imperative -- not an ask. A dictator move. "Do this everywhere."

It is good to show a plot with two of the same geom layer, one with mapped aesthetics and the other without.
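
A minimal sketch of the ask next to the order, assuming only that `ggplot2` is installed. `gapminder_sample` is a hypothetical five-row stand-in for `gapminder_2002` (rounded values from the printed tibble), and `two_layers` is a made-up name.

```r
library(ggplot2)

# hypothetical stand-in for gapminder_2002 (rounded values from the slides)
gapminder_sample <- data.frame(
  country   = c("Afghanistan", "Albania", "Algeria", "Argentina", "Australia"),
  continent = c("Asia", "Europe", "Africa", "Americas", "Oceania"),
  lifeExp   = c(42.1, 75.7, 71.0, 74.3, 80.4),
  pop       = c(25268405, 3508512, 31287142, 38331121, 19546792),
  gdpPercap = c(727, 4604, 5288, 8798, 30688)
)

two_layers <- ggplot(data = gapminder_sample) +
  aes(x = gdpPercap) +
  aes(y = lifeExp) +
  # an ask: color represents continent (size is just a fixed value)
  geom_point(aes(color = continent), size = 4) +
  # an order: every point is blue, no variable involved
  geom_point(color = "blue")

# the mapped layer stores a mapping ...
names(two_layers$layers[[1]]$mapping)
# ... while the unmapped layer stores a fixed value
two_layers$layers[[2]]$aes_params$colour
```

Note that a color legend appears only for the first layer: a legend explains a representation, and the blue points represent nothing.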
---
class: split-40
count: false
.column[.content[
```r
*ggplot(data = gapminder_2002)
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/unmapped_aes_auto_1_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) +
* aes(x = gdpPercap)
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/unmapped_aes_auto_2_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) +
  aes(x = gdpPercap) +
* aes(y = lifeExp)
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/unmapped_aes_auto_3_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) +
  aes(x = gdpPercap) +
  aes(y = lifeExp) +
* geom_point()
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/unmapped_aes_auto_4_output-1.png" width="100%" />
]]

---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) +
  aes(x = gdpPercap) +
  aes(y = lifeExp) +
  geom_point() +
  # Another geometric layer with aesthetics
  # that don't do representation:
* geom_point(
*   color = "plum4",
*   size = 8,
*   shape = 21
* )
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/unmapped_aes_auto_5_output-1.png" width="100%" />
]]

---

# Combinations of local and global data, aesthetics

Now you know about *global* and *local* data, aesthetics representing variables, and overwriting defaults for aesthetics doing no variable representation. Let's think about combining these.

---

Look at the code that follows. What are your expectations for the result? Move forward in the presentation to check if your expectations match the actual result.
```r ggplot(data = gapminder_2002) + # global data aes(x = gdpPercap) + aes(y = lifeExp) + geom_point(size = 5, alpha = .7) + aes(color = continent) + # xend and yend are required for geom_segment # like creating a column with a single value, 0 aes(xend = 0) + # like creating a column with a single value, 0 aes(yend = 0) + # geom specific data geom_segment( data = gapminder_2002_europe, #BREAK2 aes(size = gdpPercap, #BREAK3 alpha = pop), #BREAK3 color = "orange" #BREAK4 ) ``` --- class: split-40 count: false .column[.content[ ```r *ggplot(data = gapminder_2002) # global data ``` ]] .column[.content[ <img src="ggplot2_grammar_guide_files/figure-html/all_conditional_auto_1_output-1.png" width="100%" /> ]] --- class: split-40 count: false .column[.content[ ```r ggplot(data = gapminder_2002) + # global data * aes(x = gdpPercap) ``` ]] .column[.content[ <img src="ggplot2_grammar_guide_files/figure-html/all_conditional_auto_2_output-1.png" width="100%" /> ]] --- class: split-40 count: false .column[.content[ ```r ggplot(data = gapminder_2002) + # global data aes(x = gdpPercap) + * aes(y = lifeExp) ``` ]] .column[.content[ <img src="ggplot2_grammar_guide_files/figure-html/all_conditional_auto_3_output-1.png" width="100%" /> ]] --- class: split-40 count: false .column[.content[ ```r ggplot(data = gapminder_2002) + # global data aes(x = gdpPercap) + aes(y = lifeExp) + * geom_point(size = 5, * alpha = .7) ``` ]] .column[.content[ <img src="ggplot2_grammar_guide_files/figure-html/all_conditional_auto_4_output-1.png" width="100%" /> ]] --- class: split-40 count: false .column[.content[ ```r ggplot(data = gapminder_2002) + # global data aes(x = gdpPercap) + aes(y = lifeExp) + geom_point(size = 5, alpha = .7) + * aes(color = continent) ``` ]] .column[.content[ <img src="ggplot2_grammar_guide_files/figure-html/all_conditional_auto_5_output-1.png" width="100%" /> ]] --- class: split-40 count: false .column[.content[ ```r ggplot(data = gapminder_2002) + # global data aes(x = 
gdpPercap) +
  aes(y = lifeExp) +
  geom_point(size = 5,
             alpha = .7) +
  aes(color = continent) +
  # xend and yend are required for geom_segment
  # like creating a column with a single value, 0
* aes(xend = 0)
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/all_conditional_auto_6_output-1.png" width="100%" />
]]
---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) + # global data
  aes(x = gdpPercap) +
  aes(y = lifeExp) +
  geom_point(size = 5,
             alpha = .7) +
  aes(color = continent) +
  # xend and yend are required for geom_segment
  # like creating a column with a single value, 0
  aes(xend = 0) +
  # like creating a column with a single value, 0
* aes(yend = 0)
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/all_conditional_auto_7_output-1.png" width="100%" />
]]
---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) + # global data
  aes(x = gdpPercap) +
  aes(y = lifeExp) +
  geom_point(size = 5,
             alpha = .7) +
  aes(color = continent) +
  # xend and yend are required for geom_segment
  # like creating a column with a single value, 0
  aes(xend = 0) +
  # like creating a column with a single value, 0
  aes(yend = 0) +
  # geom specific data
* geom_segment(
*   data = gapminder_2002_europe,
*   aes(size = gdpPercap,
*       alpha = pop),
*   color = "orange"
* )
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/all_conditional_auto_8_output-1.png" width="100%" />
]]
---
name: annotate

# Exclamations and Interjections: Annotate using "geoms" not tied to data

---
class: split-40
count: false
.column[.content[
```r
*ggplot(data = gapminder_2002)
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/annotate_auto_1_output-1.png" width="100%" />
]]
---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) +
* aes(x = gdpPercap)
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/annotate_auto_2_output-1.png"
width="100%" />
]]
---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) +
  aes(x = gdpPercap) +
* aes(y = lifeExp)
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/annotate_auto_3_output-1.png" width="100%" />
]]
---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) +
  aes(x = gdpPercap) +
  aes(y = lifeExp) +
* geom_point()
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/annotate_auto_4_output-1.png" width="100%" />
]]
---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) +
  aes(x = gdpPercap) +
  aes(y = lifeExp) +
  geom_point() +
* annotate(geom = "point",
*          x = 10000,
*          y = 60,
*          color = "red")
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/annotate_auto_5_output-1.png" width="100%" />
]]
---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) +
  aes(x = gdpPercap) +
  aes(y = lifeExp) +
  geom_point() +
  annotate(geom = "point",
           x = 10000,
           y = 60,
           color = "red") +
* annotate(geom = "text",
*          x = c(3000, 10000, 40000),
*          y = 80,
*          label = "Hello",
*          color = "blue")
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/annotate_auto_6_output-1.png" width="100%" />
]]
---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) +
  aes(x = gdpPercap) +
  aes(y = lifeExp) +
  geom_point() +
  annotate(geom = "point",
           x = 10000,
           y = 60,
           color = "red") +
  annotate(geom = "text",
           x = c(3000, 10000, 40000),
           y = 80,
           label = "Hello",
           color = "blue") +
* annotate(geom = "curve",
*          x = 3000,
*          y = 79,
*          xend = 10000 - 500,
*          yend = 60 + .3,
*          color = "green",
*          arrow = arrow(angle = 20))
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/annotate_auto_7_output-1.png" width="100%" />
]]
---

## Convenience annotation geoms

Some convenience functions have also been written for annotation using the geom_*() syntax.
- geom_abline()
- geom_hline()
- geom_vline()

---
class: split-40
count: false
.column[.content[
```r
*ggplot(data = gapminder_2002)
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/annotation_geoms_auto_1_output-1.png" width="100%" />
]]
---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) +
* aes(x = gdpPercap)
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/annotation_geoms_auto_2_output-1.png" width="100%" />
]]
---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) +
  aes(x = gdpPercap) +
* aes(y = lifeExp)
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/annotation_geoms_auto_3_output-1.png" width="100%" />
]]
---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) +
  aes(x = gdpPercap) +
  aes(y = lifeExp) +
* geom_point()
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/annotation_geoms_auto_4_output-1.png" width="100%" />
]]
---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) +
  aes(x = gdpPercap) +
  aes(y = lifeExp) +
  geom_point() +
* geom_abline(slope = .01, intercept = 0)
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/annotation_geoms_auto_5_output-1.png" width="100%" />
]]
---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) +
  aes(x = gdpPercap) +
  aes(y = lifeExp) +
  geom_point() +
  geom_abline(slope = .01, intercept = 0) +
* geom_hline(yintercept = 70,
*            linetype = "dotted")
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/annotation_geoms_auto_6_output-1.png" width="100%" />
]]
---
class: split-40
count: false
.column[.content[
```r
ggplot(data = gapminder_2002) +
  aes(x = gdpPercap) +
  aes(y = lifeExp) +
  geom_point() +
  geom_abline(slope = .01, intercept = 0) +
  geom_hline(yintercept = 70,
             linetype = "dotted") +
* geom_vline(xintercept = c(1000,
10000),
*            linetype = "dashed")
```
]]
.column[.content[
<img src="ggplot2_grammar_guide_files/figure-html/annotation_geoms_auto_7_output-1.png" width="100%" />
]]
---

Next up:

<style type="text/css">
.remark-code{line-height: 1.5; font-size: 70%}
</style>

<!-- ```{r, dpi=400, eval = F} -->
<!-- gapminder %>% -->
<!-- select(continent) %>% -->
<!-- ggplot() + -->
<!-- aes(x = 1) + -->
<!-- aes(fill = continent) + -->
<!-- geom_bar(color = "white", size = .2) + -->
<!-- coord_polar(theta = "y") + -->
<!-- theme_void() + -->
<!-- scale_fill_viridis_d() + -->
<!-- theme(rect = element_rect(fill = "grey", -->
<!-- color = "grey", -->
<!-- linetype = "solid", -->
<!-- size = 0)) -->
<!-- ggsave("pie.svg", dpi = 320,device = "svg") -->
<!-- ``` -->
<!-- --- -->
<!-- Or at least an organized twitter thread!? Another idea for you: pull aes() out of ggplot(). Do you do it? Lot's of reasons to do it! Downside is currently most examples make use of nested approach. @grrrck does this too, and esquisse -->
<!-- ggplot(data = my_data) + -->
<!-- aes(x = my_var) -->
<!-- Is it possible to do this with multiple geoms? I usually specify aes within the geom like -->
<!-- ggplot(data) + -->
<!-- geom_point(aes(x, y)) + -->
<!-- geom_line(aes(x, y, group = id)) -->
<!-- I prefer this approach because it's explicit which aesthetics are bound to which geoms -->
<!-- My blog post, (which no one has probably ever read) exactly on this topic! https://evangelinereynolds.netlify.com/post/mapping-aesthetics/ … In general I'd say go global. I think in general, most folks don't have a bunch of conflicts for aesthetics geom by geom (though occasionally yes?). Let me know what you think! -->
<!-- To change the data used for a plot, use the %+% operator! Oh!!!
-->
<!-- # Stat_* -->
<!-- ## Univariate discrete -->
<!-- ```{r univariate_discrete, eval = F, echo = F} -->
<!-- ggplot(gapminder_2002) + -->
<!-- aes(x = continent) + -->
<!-- stat_count() + -->
<!-- geom_bar() # convenience geom -->
<!-- # default counting -->
<!-- ``` -->
<!-- --- -->
<!-- r chunk_reveal("univariate_discrete")` -->
<!-- --- -->
<!-- ```{r} -->
<!-- ggplot(data = gapminder_2002) + -->
<!-- aes(x = continent) + -->
<!-- aes(y = lifeExp) + -->
<!-- geom_point(alpha = .1) + -->
<!-- stat_summary( -->
<!-- fun.ymin = min, -->
<!-- fun.ymax = max, -->
<!-- fun.y = median -->
<!-- ) -->
<!-- ``` -->
<!-- --- -->
<!-- ```{r} -->
<!-- gapminder_2002 %>% -->
<!-- mutate(seventy_plus = lifeExp > 60) %>% -->
<!-- ggplot() + -->
<!-- aes(x = continent) + -->
<!-- aes(fill = seventy_plus) + -->
<!-- geom_bar(alpha = .2) -->
<!-- ``` -->
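---

The expectations from the earlier slides that combine global and local data and aesthetics can also be checked without rendering the plot. What follows is a minimal sketch under stated assumptions, not the deck's own code: `gapminder_2002_mini` and `mini_europe` are hypothetical names, the five rows are copied from the tibble printed earlier in the deck, the Europe filter is only a guess at how `gapminder_2002_europe` was built, and the `size = gdpPercap` mapping is left out to keep the sketch short. `layer_data()` is the ggplot2 helper that returns the data a layer actually draws.

```r
library(ggplot2)

# Hypothetical mini data set: five rows copied from the tibble shown earlier
gapminder_2002_mini <- data.frame(
  country   = c("Afghanistan", "Albania", "Algeria", "Austria", "Belgium"),
  continent = c("Asia", "Europe", "Africa", "Europe", "Europe"),
  lifeExp   = c(42.1, 75.7, 71.0, 79.0, 78.3),
  pop       = c(25268405, 3508512, 31287142, 8148312, 10311970),
  gdpPercap = c(727, 4604, 5288, 32418, 30486)
)

# Assumption: the geom-specific data is simply the European rows
mini_europe <- subset(gapminder_2002_mini, continent == "Europe")

p <- ggplot(data = gapminder_2002_mini) + # global data
  aes(x = gdpPercap) +                    # global mappings
  aes(y = lifeExp) +
  geom_point(size = 5,                    # fixed values, no representation
             alpha = .7) +
  aes(color = continent) +
  aes(xend = 0) +
  aes(yend = 0) +
  geom_segment(
    data = mini_europe,                   # local (geom-specific) data
    aes(alpha = pop),                     # local mapping
    color = "orange"                      # fixed value replaces the color mapping
  )

# Data as each layer will draw it: layer 1 is the points, layer 2 the segments
points_built   <- layer_data(p, 1)
segments_built <- layer_data(p, 2)

nrow(points_built)                  # rows drawn from the global data
nrow(segments_built)                # rows drawn from the local data
unique(segments_built$colour)       # fixed color used by the segments
length(unique(points_built$colour)) # distinct colors representing continent
```

If global and local data behave as described in the slides, the point layer has one row per row of the mini data set, the segment layer has one row per European country, and the segments carry the fixed color rather than the continent mapping.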