Recipe 4: geom_cat_lm(), geom_cat_fitted(), and geom_cat_residuals()

Example recipe #4: geom_cat_lm()

In this next recipe, we use panel-wise computation again to visualize a linear model that is estimated using both a continuous and a categorical variable, i.e. lm(y ~ x + cat). This may feel a bit like geom_smooth(method = lm) + aes(group = cat). However, since geom_smooth() does group-wise computation, the data is broken up before model estimation when a discrete variable is mapped, as in aes(color = sex), meaning that a separate model is estimated for each category. Let’s see how we might visualize a single model that includes a categorical variable.
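To make the distinction concrete, here is a minimal sketch that contrasts the two approaches outside of ggplot2. It assumes the palmerpenguins package is installed and uses base R only; the object names by_species and single_model are illustrative and not part of the recipe.

```r
# Sketch: group-wise fits vs. one model with a categorical term.
# Assumes the palmerpenguins package is installed.
penguins <- na.omit(palmerpenguins::penguins)

# Group-wise (what happens once the data are split by species):
# each species gets its own intercept AND its own slope.
by_species <- sapply(
  split(penguins, penguins$species),
  function(d) coef(lm(bill_length_mm ~ bill_depth_mm, data = d))
)
by_species

# Single model, lm(y ~ x + cat): species shifts the intercept,
# but all species share one slope for bill_depth_mm.
single_model <- lm(bill_length_mm ~ bill_depth_mm + species, data = penguins)
coef(single_model)
```

The group-wise result has one column of coefficients per species, while the single model reports one common slope plus intercept offsets for the non-reference species. The parallel lines implied by that common slope are what geom_cat_lm() is meant to draw.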

Our first goal is to be able to specify a plot with a newly created geom_cat_lm() (and we’ll also look at defining geom_cat_fitted() and geom_cat_residuals()):

penguins |> 
  ggplot() + 
  aes(x = bill_depth_mm, 
      y = bill_length_mm,
      cat = species) +
  geom_point() + 
  geom_cat_lm()

Let’s get started!

Step 0: use base ggplot2 to get the job done

It’s a good idea to look at how you’d get things done without a Stat extension first, just using ‘base’ ggplot2. The computational moves you make here can serve as a reference for building our extension function.

library(tidyverse)
penguins <- remove_missing(palmerpenguins::penguins)

model <- lm(formula = bill_length_mm ~ bill_depth_mm + 
              species, 
            data = penguins) 

penguins_w_fitted <- penguins |> 
  mutate(fitted = model$fitted.values)

penguins |> 
  ggplot() + 
  aes(x = bill_depth_mm, 
      y = bill_length_mm,
      group = species) +
  geom_point() + 
  geom_line(data = penguins_w_fitted,
             aes(y = fitted),
             color = "maroon4")

Use ggplot2::layer_data() to inspect the render-ready data that the plot holds internally. Your Stat will help prep data to look something like this.

layer_data(plot = last_plot(), 
           i = 2) # the fitted y (not the raw data y) is of interest
           y    x group PANEL flipped_aes  colour linewidth linetype alpha
137 34.83336 15.5     1     1       FALSE maroon4       0.5        1    NA
119 35.39398 15.9     1     1       FALSE maroon4       0.5        1    NA
97  35.53414 16.0     1     1       FALSE maroon4       0.5        1    NA
73  35.67430 16.1     1     1       FALSE maroon4       0.5        1    NA
93  35.67430 16.1     1     1       FALSE maroon4       0.5        1    NA
61  35.81445 16.2     1     1       FALSE maroon4       0.5        1    NA
105 36.23492 16.5     1     1       FALSE maroon4       0.5        1    NA
133 36.23492 16.5     1     1       FALSE maroon4       0.5        1    NA
53  36.37508 16.6     1     1       FALSE maroon4       0.5        1    NA
63  36.37508 16.6     1     1       FALSE maroon4       0.5        1    NA
26  36.51523 16.7     1     1       FALSE maroon4       0.5        1    NA
71  36.65539 16.8     1     1       FALSE maroon4       0.5        1    NA
139 36.65539 16.8     1     1       FALSE maroon4       0.5        1    NA
40  36.79555 16.9     1     1       FALSE maroon4       0.5        1    NA
55  36.79555 16.9     1     1       FALSE maroon4       0.5        1    NA
30  36.93570 17.0     1     1       FALSE maroon4       0.5        1    NA
57  36.93570 17.0     1     1       FALSE maroon4       0.5        1    NA
103 36.93570 17.0     1     1       FALSE maroon4       0.5        1    NA
111 36.93570 17.0     1     1       FALSE maroon4       0.5        1    NA
113 36.93570 17.0     1     1       FALSE maroon4       0.5        1    NA
117 36.93570 17.0     1     1       FALSE maroon4       0.5        1    NA
138 36.93570 17.0     1     1       FALSE maroon4       0.5        1    NA
59  37.07586 17.1     1     1       FALSE maroon4       0.5        1    NA
87  37.07586 17.1     1     1       FALSE maroon4       0.5        1    NA
123 37.07586 17.1     1     1       FALSE maroon4       0.5        1    NA
135 37.07586 17.1     1     1       FALSE maroon4       0.5        1    NA
145 37.07586 17.1     1     1       FALSE maroon4       0.5        1    NA
20  37.21602 17.2     1     1       FALSE maroon4       0.5        1    NA
67  37.21602 17.2     1     1       FALSE maroon4       0.5        1    NA
75  37.21602 17.2     1     1       FALSE maroon4       0.5        1    NA
101 37.21602 17.2     1     1       FALSE maroon4       0.5        1    NA
115 37.21602 17.2     1     1       FALSE maroon4       0.5        1    NA
136 37.21602 17.2     1     1       FALSE maroon4       0.5        1    NA
89  37.35617 17.3     1     1       FALSE maroon4       0.5        1    NA
2   37.49633 17.4     1     1       FALSE maroon4       0.5        1    NA
51  37.63648 17.5     1     1       FALSE maroon4       0.5        1    NA
69  37.63648 17.5     1     1       FALSE maroon4       0.5        1    NA
130 37.63648 17.5     1     1       FALSE maroon4       0.5        1    NA
131 37.63648 17.5     1     1       FALSE maroon4       0.5        1    NA
8   37.77664 17.6     1     1       FALSE maroon4       0.5        1    NA
76  37.77664 17.6     1     1       FALSE maroon4       0.5        1    NA
121 37.77664 17.6     1     1       FALSE maroon4       0.5        1    NA
129 37.77664 17.6     1     1       FALSE maroon4       0.5        1    NA
45  37.91680 17.7     1     1       FALSE maroon4       0.5        1    NA
107 37.91680 17.7     1     1       FALSE maroon4       0.5        1    NA
6   38.05695 17.8     1     1       FALSE maroon4       0.5        1    NA
11  38.05695 17.8     1     1       FALSE maroon4       0.5        1    NA
28  38.05695 17.8     1     1       FALSE maroon4       0.5        1    NA
79  38.05695 17.8     1     1       FALSE maroon4       0.5        1    NA
143 38.05695 17.8     1     1       FALSE maroon4       0.5        1    NA
23  38.19711 17.9     1     1       FALSE maroon4       0.5        1    NA
43  38.19711 17.9     1     1       FALSE maroon4       0.5        1    NA
47  38.19711 17.9     1     1       FALSE maroon4       0.5        1    NA
95  38.19711 17.9     1     1       FALSE maroon4       0.5        1    NA
125 38.19711 17.9     1     1       FALSE maroon4       0.5        1    NA
134 38.19711 17.9     1     1       FALSE maroon4       0.5        1    NA
3   38.33727 18.0     1     1       FALSE maroon4       0.5        1    NA
36  38.33727 18.0     1     1       FALSE maroon4       0.5        1    NA
60  38.33727 18.0     1     1       FALSE maroon4       0.5        1    NA
85  38.33727 18.0     1     1       FALSE maroon4       0.5        1    NA
124 38.33727 18.0     1     1       FALSE maroon4       0.5        1    NA
19  38.47742 18.1     1     1       FALSE maroon4       0.5        1    NA
27  38.47742 18.1     1     1       FALSE maroon4       0.5        1    NA
49  38.47742 18.1     1     1       FALSE maroon4       0.5        1    NA
86  38.47742 18.1     1     1       FALSE maroon4       0.5        1    NA
88  38.47742 18.1     1     1       FALSE maroon4       0.5        1    NA
144 38.47742 18.1     1     1       FALSE maroon4       0.5        1    NA
58  38.61758 18.2     1     1       FALSE maroon4       0.5        1    NA
16  38.75773 18.3     1     1       FALSE maroon4       0.5        1    NA
110 38.75773 18.3     1     1       FALSE maroon4       0.5        1    NA
122 38.75773 18.3     1     1       FALSE maroon4       0.5        1    NA
14  38.89789 18.4     1     1       FALSE maroon4       0.5        1    NA
37  38.89789 18.4     1     1       FALSE maroon4       0.5        1    NA
66  38.89789 18.4     1     1       FALSE maroon4       0.5        1    NA
142 38.89789 18.4     1     1       FALSE maroon4       0.5        1    NA
33  39.03805 18.5     1     1       FALSE maroon4       0.5        1    NA
38  39.03805 18.5     1     1       FALSE maroon4       0.5        1    NA
70  39.03805 18.5     1     1       FALSE maroon4       0.5        1    NA
92  39.03805 18.5     1     1       FALSE maroon4       0.5        1    NA
94  39.03805 18.5     1     1       FALSE maroon4       0.5        1    NA
118 39.03805 18.5     1     1       FALSE maroon4       0.5        1    NA
127 39.03805 18.5     1     1       FALSE maroon4       0.5        1    NA
128 39.03805 18.5     1     1       FALSE maroon4       0.5        1    NA
146 39.03805 18.5     1     1       FALSE maroon4       0.5        1    NA
22  39.17820 18.6     1     1       FALSE maroon4       0.5        1    NA
24  39.17820 18.6     1     1       FALSE maroon4       0.5        1    NA
50  39.17820 18.6     1     1       FALSE maroon4       0.5        1    NA
82  39.17820 18.6     1     1       FALSE maroon4       0.5        1    NA
91  39.17820 18.6     1     1       FALSE maroon4       0.5        1    NA
99  39.17820 18.6     1     1       FALSE maroon4       0.5        1    NA
114 39.17820 18.6     1     1       FALSE maroon4       0.5        1    NA
141 39.17820 18.6     1     1       FALSE maroon4       0.5        1    NA
1   39.31836 18.7     1     1       FALSE maroon4       0.5        1    NA
17  39.31836 18.7     1     1       FALSE maroon4       0.5        1    NA
140 39.31836 18.7     1     1       FALSE maroon4       0.5        1    NA
41  39.45852 18.8     1     1       FALSE maroon4       0.5        1    NA
52  39.45852 18.8     1     1       FALSE maroon4       0.5        1    NA
77  39.45852 18.8     1     1       FALSE maroon4       0.5        1    NA
84  39.45852 18.8     1     1       FALSE maroon4       0.5        1    NA
21  39.59867 18.9     1     1       FALSE maroon4       0.5        1    NA
25  39.59867 18.9     1     1       FALSE maroon4       0.5        1    NA
29  39.59867 18.9     1     1       FALSE maroon4       0.5        1    NA
46  39.59867 18.9     1     1       FALSE maroon4       0.5        1    NA
68  39.59867 18.9     1     1       FALSE maroon4       0.5        1    NA
90  39.59867 18.9     1     1       FALSE maroon4       0.5        1    NA
100 39.59867 18.9     1     1       FALSE maroon4       0.5        1    NA
12  39.73883 19.0     1     1       FALSE maroon4       0.5        1    NA
42  39.73883 19.0     1     1       FALSE maroon4       0.5        1    NA
65  39.73883 19.0     1     1       FALSE maroon4       0.5        1    NA
104 39.73883 19.0     1     1       FALSE maroon4       0.5        1    NA
120 39.73883 19.0     1     1       FALSE maroon4       0.5        1    NA
35  39.87898 19.1     1     1       FALSE maroon4       0.5        1    NA
54  39.87898 19.1     1     1       FALSE maroon4       0.5        1    NA
62  39.87898 19.1     1     1       FALSE maroon4       0.5        1    NA
74  39.87898 19.1     1     1       FALSE maroon4       0.5        1    NA
18  40.01914 19.2     1     1       FALSE maroon4       0.5        1    NA
83  40.01914 19.2     1     1       FALSE maroon4       0.5        1    NA
126 40.01914 19.2     1     1       FALSE maroon4       0.5        1    NA
4   40.15930 19.3     1     1       FALSE maroon4       0.5        1    NA
34  40.15930 19.3     1     1       FALSE maroon4       0.5        1    NA
64  40.29945 19.4     1     1       FALSE maroon4       0.5        1    NA
72  40.29945 19.4     1     1       FALSE maroon4       0.5        1    NA
78  40.29945 19.4     1     1       FALSE maroon4       0.5        1    NA
48  40.43961 19.5     1     1       FALSE maroon4       0.5        1    NA
81  40.43961 19.5     1     1       FALSE maroon4       0.5        1    NA
108 40.43961 19.5     1     1       FALSE maroon4       0.5        1    NA
7   40.57977 19.6     1     1       FALSE maroon4       0.5        1    NA
39  40.71992 19.7     1     1       FALSE maroon4       0.5        1    NA
116 40.86008 19.8     1     1       FALSE maroon4       0.5        1    NA
32  41.14039 20.0     1     1       FALSE maroon4       0.5        1    NA
96  41.14039 20.0     1     1       FALSE maroon4       0.5        1    NA
98  41.14039 20.0     1     1       FALSE maroon4       0.5        1    NA
102 41.14039 20.0     1     1       FALSE maroon4       0.5        1    NA
132 41.28055 20.1     1     1       FALSE maroon4       0.5        1    NA
80  41.56086 20.3     1     1       FALSE maroon4       0.5        1    NA
106 41.56086 20.3     1     1       FALSE maroon4       0.5        1    NA
112 41.84117 20.5     1     1       FALSE maroon4       0.5        1    NA
5   41.98133 20.6     1     1       FALSE maroon4       0.5        1    NA
13  42.12149 20.7     1     1       FALSE maroon4       0.5        1    NA
109 42.12149 20.7     1     1       FALSE maroon4       0.5        1    NA
10  42.68211 21.1     1     1       FALSE maroon4       0.5        1    NA
31  42.68211 21.1     1     1       FALSE maroon4       0.5        1    NA
56  42.68211 21.1     1     1       FALSE maroon4       0.5        1    NA
9   42.82227 21.2     1     1       FALSE maroon4       0.5        1    NA
44  42.82227 21.2     1     1       FALSE maroon4       0.5        1    NA
15  43.24274 21.5     1     1       FALSE maroon4       0.5        1    NA
316 46.00184 16.4     2     1       FALSE maroon4       0.5        1    NA
327 46.14200 16.5     2     1       FALSE maroon4       0.5        1    NA
288 46.28215 16.6     2     1       FALSE maroon4       0.5        1    NA
296 46.28215 16.6     2     1       FALSE maroon4       0.5        1    NA
304 46.28215 16.6     2     1       FALSE maroon4       0.5        1    NA
322 46.28215 16.6     2     1       FALSE maroon4       0.5        1    NA
298 46.42231 16.7     2     1       FALSE maroon4       0.5        1    NA
301 46.56247 16.8     2     1       FALSE maroon4       0.5        1    NA
309 46.84278 17.0     2     1       FALSE maroon4       0.5        1    NA
328 46.84278 17.0     2     1       FALSE maroon4       0.5        1    NA
280 46.98294 17.1     2     1       FALSE maroon4       0.5        1    NA
278 47.26325 17.3     2     1       FALSE maroon4       0.5        1    NA
286 47.26325 17.3     2     1       FALSE maroon4       0.5        1    NA
315 47.26325 17.3     2     1       FALSE maroon4       0.5        1    NA
318 47.26325 17.3     2     1       FALSE maroon4       0.5        1    NA
320 47.26325 17.3     2     1       FALSE maroon4       0.5        1    NA
287 47.54356 17.5     2     1       FALSE maroon4       0.5        1    NA
307 47.54356 17.5     2     1       FALSE maroon4       0.5        1    NA
271 47.96403 17.8     2     1       FALSE maroon4       0.5        1    NA
276 47.96403 17.8     2     1       FALSE maroon4       0.5        1    NA
283 47.96403 17.8     2     1       FALSE maroon4       0.5        1    NA
294 47.96403 17.8     2     1       FALSE maroon4       0.5        1    NA
266 48.10419 17.9     2     1       FALSE maroon4       0.5        1    NA
290 48.10419 17.9     2     1       FALSE maroon4       0.5        1    NA
310 48.10419 17.9     2     1       FALSE maroon4       0.5        1    NA
312 48.10419 17.9     2     1       FALSE maroon4       0.5        1    NA
279 48.38450 18.1     2     1       FALSE maroon4       0.5        1    NA
330 48.38450 18.1     2     1       FALSE maroon4       0.5        1    NA
272 48.52466 18.2     2     1       FALSE maroon4       0.5        1    NA
273 48.52466 18.2     2     1       FALSE maroon4       0.5        1    NA
285 48.52466 18.2     2     1       FALSE maroon4       0.5        1    NA
331 48.52466 18.2     2     1       FALSE maroon4       0.5        1    NA
302 48.66481 18.3     2     1       FALSE maroon4       0.5        1    NA
292 48.80497 18.4     2     1       FALSE maroon4       0.5        1    NA
311 48.94512 18.5     2     1       FALSE maroon4       0.5        1    NA
284 49.08528 18.6     2     1       FALSE maroon4       0.5        1    NA
300 49.08528 18.6     2     1       FALSE maroon4       0.5        1    NA
269 49.22544 18.7     2     1       FALSE maroon4       0.5        1    NA
314 49.22544 18.7     2     1       FALSE maroon4       0.5        1    NA
333 49.22544 18.7     2     1       FALSE maroon4       0.5        1    NA
299 49.36559 18.8     2     1       FALSE maroon4       0.5        1    NA
321 49.36559 18.8     2     1       FALSE maroon4       0.5        1    NA
324 49.36559 18.8     2     1       FALSE maroon4       0.5        1    NA
274 49.50575 18.9     2     1       FALSE maroon4       0.5        1    NA
291 49.64591 19.0     2     1       FALSE maroon4       0.5        1    NA
293 49.64591 19.0     2     1       FALSE maroon4       0.5        1    NA
317 49.64591 19.0     2     1       FALSE maroon4       0.5        1    NA
332 49.64591 19.0     2     1       FALSE maroon4       0.5        1    NA
308 49.78606 19.1     2     1       FALSE maroon4       0.5        1    NA
268 49.92622 19.2     2     1       FALSE maroon4       0.5        1    NA
289 50.20653 19.4     2     1       FALSE maroon4       0.5        1    NA
325 50.20653 19.4     2     1       FALSE maroon4       0.5        1    NA
267 50.34669 19.5     2     1       FALSE maroon4       0.5        1    NA
306 50.34669 19.5     2     1       FALSE maroon4       0.5        1    NA
326 50.34669 19.5     2     1       FALSE maroon4       0.5        1    NA
281 50.48684 19.6     2     1       FALSE maroon4       0.5        1    NA
313 50.48684 19.6     2     1       FALSE maroon4       0.5        1    NA
319 50.62700 19.7     2     1       FALSE maroon4       0.5        1    NA
270 50.76716 19.8     2     1       FALSE maroon4       0.5        1    NA
329 50.76716 19.8     2     1       FALSE maroon4       0.5        1    NA
275 50.90731 19.9     2     1       FALSE maroon4       0.5        1    NA
305 50.90731 19.9     2     1       FALSE maroon4       0.5        1    NA
323 50.90731 19.9     2     1       FALSE maroon4       0.5        1    NA
282 51.04747 20.0     2     1       FALSE maroon4       0.5        1    NA
295 51.04747 20.0     2     1       FALSE maroon4       0.5        1    NA
277 51.46794 20.3     2     1       FALSE maroon4       0.5        1    NA
303 52.02856 20.7     2     1       FALSE maroon4       0.5        1    NA
297 52.16872 20.8     2     1       FALSE maroon4       0.5        1    NA
171 44.90981 13.1     3     1       FALSE maroon4       0.5        1    NA
147 45.04997 13.2     3     1       FALSE maroon4       0.5        1    NA
194 45.19012 13.3     3     1       FALSE maroon4       0.5        1    NA
155 45.33028 13.4     3     1       FALSE maroon4       0.5        1    NA
152 45.47043 13.5     3     1       FALSE maroon4       0.5        1    NA
163 45.47043 13.5     3     1       FALSE maroon4       0.5        1    NA
184 45.61059 13.6     3     1       FALSE maroon4       0.5        1    NA
157 45.75075 13.7     3     1       FALSE maroon4       0.5        1    NA
159 45.75075 13.7     3     1       FALSE maroon4       0.5        1    NA
182 45.75075 13.7     3     1       FALSE maroon4       0.5        1    NA
186 45.75075 13.7     3     1       FALSE maroon4       0.5        1    NA
188 45.75075 13.7     3     1       FALSE maroon4       0.5        1    NA
261 45.75075 13.7     3     1       FALSE maroon4       0.5        1    NA
206 45.89090 13.8     3     1       FALSE maroon4       0.5        1    NA
223 45.89090 13.8     3     1       FALSE maroon4       0.5        1    NA
229 45.89090 13.8     3     1       FALSE maroon4       0.5        1    NA
191 46.03106 13.9     3     1       FALSE maroon4       0.5        1    NA
192 46.03106 13.9     3     1       FALSE maroon4       0.5        1    NA
202 46.03106 13.9     3     1       FALSE maroon4       0.5        1    NA
208 46.03106 13.9     3     1       FALSE maroon4       0.5        1    NA
233 46.17122 14.0     3     1       FALSE maroon4       0.5        1    NA
252 46.17122 14.0     3     1       FALSE maroon4       0.5        1    NA
149 46.31137 14.1     3     1       FALSE maroon4       0.5        1    NA
197 46.31137 14.1     3     1       FALSE maroon4       0.5        1    NA
258 46.31137 14.1     3     1       FALSE maroon4       0.5        1    NA
177 46.45153 14.2     3     1       FALSE maroon4       0.5        1    NA
196 46.45153 14.2     3     1       FALSE maroon4       0.5        1    NA
210 46.45153 14.2     3     1       FALSE maroon4       0.5        1    NA
213 46.45153 14.2     3     1       FALSE maroon4       0.5        1    NA
221 46.45153 14.2     3     1       FALSE maroon4       0.5        1    NA
232 46.45153 14.2     3     1       FALSE maroon4       0.5        1    NA
167 46.59168 14.3     3     1       FALSE maroon4       0.5        1    NA
174 46.59168 14.3     3     1       FALSE maroon4       0.5        1    NA
262 46.59168 14.3     3     1       FALSE maroon4       0.5        1    NA
198 46.73184 14.4     3     1       FALSE maroon4       0.5        1    NA
200 46.73184 14.4     3     1       FALSE maroon4       0.5        1    NA
231 46.73184 14.4     3     1       FALSE maroon4       0.5        1    NA
243 46.73184 14.4     3     1       FALSE maroon4       0.5        1    NA
151 46.87200 14.5     3     1       FALSE maroon4       0.5        1    NA
165 46.87200 14.5     3     1       FALSE maroon4       0.5        1    NA
168 46.87200 14.5     3     1       FALSE maroon4       0.5        1    NA
169 46.87200 14.5     3     1       FALSE maroon4       0.5        1    NA
178 46.87200 14.5     3     1       FALSE maroon4       0.5        1    NA
204 46.87200 14.5     3     1       FALSE maroon4       0.5        1    NA
225 46.87200 14.5     3     1       FALSE maroon4       0.5        1    NA
237 46.87200 14.5     3     1       FALSE maroon4       0.5        1    NA
153 47.01215 14.6     3     1       FALSE maroon4       0.5        1    NA
160 47.01215 14.6     3     1       FALSE maroon4       0.5        1    NA
161 47.01215 14.6     3     1       FALSE maroon4       0.5        1    NA
227 47.01215 14.6     3     1       FALSE maroon4       0.5        1    NA
242 47.01215 14.6     3     1       FALSE maroon4       0.5        1    NA
239 47.15231 14.7     3     1       FALSE maroon4       0.5        1    NA
250 47.15231 14.7     3     1       FALSE maroon4       0.5        1    NA
180 47.29247 14.8     3     1       FALSE maroon4       0.5        1    NA
218 47.29247 14.8     3     1       FALSE maroon4       0.5        1    NA
264 47.29247 14.8     3     1       FALSE maroon4       0.5        1    NA
207 47.43262 14.9     3     1       FALSE maroon4       0.5        1    NA
173 47.57278 15.0     3     1       FALSE maroon4       0.5        1    NA
189 47.57278 15.0     3     1       FALSE maroon4       0.5        1    NA
199 47.57278 15.0     3     1       FALSE maroon4       0.5        1    NA
203 47.57278 15.0     3     1       FALSE maroon4       0.5        1    NA
214 47.57278 15.0     3     1       FALSE maroon4       0.5        1    NA
215 47.57278 15.0     3     1       FALSE maroon4       0.5        1    NA
219 47.57278 15.0     3     1       FALSE maroon4       0.5        1    NA
235 47.57278 15.0     3     1       FALSE maroon4       0.5        1    NA
245 47.57278 15.0     3     1       FALSE maroon4       0.5        1    NA
248 47.57278 15.0     3     1       FALSE maroon4       0.5        1    NA
166 47.71293 15.1     3     1       FALSE maroon4       0.5        1    NA
172 47.71293 15.1     3     1       FALSE maroon4       0.5        1    NA
253 47.71293 15.1     3     1       FALSE maroon4       0.5        1    NA
150 47.85309 15.2     3     1       FALSE maroon4       0.5        1    NA
164 47.85309 15.2     3     1       FALSE maroon4       0.5        1    NA
254 47.85309 15.2     3     1       FALSE maroon4       0.5        1    NA
256 47.85309 15.2     3     1       FALSE maroon4       0.5        1    NA
154 47.99325 15.3     3     1       FALSE maroon4       0.5        1    NA
175 47.99325 15.3     3     1       FALSE maroon4       0.5        1    NA
176 47.99325 15.3     3     1       FALSE maroon4       0.5        1    NA
205 47.99325 15.3     3     1       FALSE maroon4       0.5        1    NA
156 48.13340 15.4     3     1       FALSE maroon4       0.5        1    NA
201 48.13340 15.4     3     1       FALSE maroon4       0.5        1    NA
247 48.27356 15.5     3     1       FALSE maroon4       0.5        1    NA
216 48.41372 15.6     3     1       FALSE maroon4       0.5        1    NA
217 48.41372 15.6     3     1       FALSE maroon4       0.5        1    NA
226 48.41372 15.6     3     1       FALSE maroon4       0.5        1    NA
162 48.55387 15.7     3     1       FALSE maroon4       0.5        1    NA
185 48.55387 15.7     3     1       FALSE maroon4       0.5        1    NA
209 48.55387 15.7     3     1       FALSE maroon4       0.5        1    NA
240 48.55387 15.7     3     1       FALSE maroon4       0.5        1    NA
263 48.55387 15.7     3     1       FALSE maroon4       0.5        1    NA
170 48.69403 15.8     3     1       FALSE maroon4       0.5        1    NA
195 48.69403 15.8     3     1       FALSE maroon4       0.5        1    NA
241 48.69403 15.8     3     1       FALSE maroon4       0.5        1    NA
251 48.69403 15.8     3     1       FALSE maroon4       0.5        1    NA
190 48.83419 15.9     3     1       FALSE maroon4       0.5        1    NA
193 48.83419 15.9     3     1       FALSE maroon4       0.5        1    NA
228 48.83419 15.9     3     1       FALSE maroon4       0.5        1    NA
255 48.83419 15.9     3     1       FALSE maroon4       0.5        1    NA
187 48.97434 16.0     3     1       FALSE maroon4       0.5        1    NA
220 48.97434 16.0     3     1       FALSE maroon4       0.5        1    NA
259 48.97434 16.0     3     1       FALSE maroon4       0.5        1    NA
158 49.11450 16.1     3     1       FALSE maroon4       0.5        1    NA
238 49.11450 16.1     3     1       FALSE maroon4       0.5        1    NA
249 49.11450 16.1     3     1       FALSE maroon4       0.5        1    NA
265 49.11450 16.1     3     1       FALSE maroon4       0.5        1    NA
212 49.25465 16.2     3     1       FALSE maroon4       0.5        1    NA
260 49.25465 16.2     3     1       FALSE maroon4       0.5        1    NA
148 49.39481 16.3     3     1       FALSE maroon4       0.5        1    NA
181 49.39481 16.3     3     1       FALSE maroon4       0.5        1    NA
222 49.39481 16.3     3     1       FALSE maroon4       0.5        1    NA
257 49.39481 16.3     3     1       FALSE maroon4       0.5        1    NA
224 49.53497 16.4     3     1       FALSE maroon4       0.5        1    NA
244 49.67512 16.5     3     1       FALSE maroon4       0.5        1    NA
211 50.09559 16.8     3     1       FALSE maroon4       0.5        1    NA
179 50.37590 17.0     3     1       FALSE maroon4       0.5        1    NA
234 50.37590 17.0     3     1       FALSE maroon4       0.5        1    NA
246 50.37590 17.0     3     1       FALSE maroon4       0.5        1    NA
236 50.51606 17.1     3     1       FALSE maroon4       0.5        1    NA
183 50.79637 17.3     3     1       FALSE maroon4       0.5        1    NA
230 50.79637 17.3     3     1       FALSE maroon4       0.5        1    NA
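Rather than reading the full printout, you can summarize what the layer data contains. The sketch below rebuilds the Step 0 plot (using base R in place of mutate(), and assuming ggplot2 and palmerpenguins are installed) and checks that the fitted line has the same slope in every group. The names p, fitted_layer, and slopes are illustrative.

```r
# Sketch: confirm the fitted lines in layer 2 are parallel across groups.
library(ggplot2)
penguins <- na.omit(palmerpenguins::penguins)

model <- lm(bill_length_mm ~ bill_depth_mm + species, data = penguins)

penguins_w_fitted <- penguins
penguins_w_fitted$fitted <- unname(fitted(model))

p <- ggplot(penguins) +
  aes(x = bill_depth_mm, y = bill_length_mm, group = species) +
  geom_point() +
  geom_line(data = penguins_w_fitted, aes(y = fitted))

# Layer 2 is the geom_line() layer; its y column holds the fitted values
fitted_layer <- layer_data(p, i = 2)

# Slope of the fitted line within each group (species)
slopes <- sapply(
  split(fitted_layer, fitted_layer$group),
  function(d) unname(coef(lm(y ~ x, data = d))[2])
)
slopes

# The common slope estimated by the single model, for comparison
unname(coef(model)["bill_depth_mm"])
```

Each group's slope should agree with the model's bill_depth_mm coefficient, which is the property our Stat needs to reproduce.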

Step 1: Define compute. Test.

Now you are ready to begin building your extension function. The first step is to define the compute that should be done under-the-hood when your function is used. We’ll define this in a function called compute_panel_cat_lm(). The data input will look similar to the plot data. You will also need to include a scales argument, which ggplot2 uses internally.

compute_panel_cat_lm <- function(data, scales){

  model <- lm(formula = y ~ x + cat, data = data)
  
  data |> 
    mutate(y = model$fitted.values)
  
}
Note: You may have noticed …
  1. … the scales argument in the compute definition, which is used internally in ggplot2. While it won’t be used in your test (up next), you do need it so that the computation will work in the ggplot2 setting.

  2. … that the compute function can only be used with data that has variables named x, y, and cat. These aesthetic variable names, relevant for building the plot, are generally not found in the raw data inputs for the plot.

Test compute.

## Test compute. 
penguins |>
  select(x = bill_depth_mm, 
         y = bill_length_mm,
         cat = species) |>
  compute_panel_cat_lm()
# A tibble: 333 × 3
       x     y cat   
   <dbl> <dbl> <fct> 
 1  18.7  39.3 Adelie
 2  17.4  37.5 Adelie
 3  18    38.3 Adelie
 4  19.3  40.2 Adelie
 5  20.6  42.0 Adelie
 6  17.8  38.1 Adelie
 7  19.6  40.6 Adelie
 8  17.6  37.8 Adelie
 9  21.2  42.8 Adelie
10  21.1  42.7 Adelie
# ℹ 323 more rows
Note: You may have noticed …

… that we prepare the data to have columns with names x, y, and cat before testing compute_panel_cat_lm(). Computation will fail if these names are not present, given our function definition. Internally in a plot, columns are named based on the aesthetic mapping, e.g. aes(x = bill_depth_mm, y = bill_length_mm).
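The following sketch illustrates that failure mode. It restates the compute function in base R (so that it runs without dplyr), which is an adaptation rather than the definition used in the recipe, and assumes palmerpenguins is installed. The names raw_attempt and renamed are illustrative.

```r
# Sketch: the compute function needs columns named x, y, and cat.
penguins <- na.omit(palmerpenguins::penguins)

# Base-R restatement of compute_panel_cat_lm() (adaptation for this example)
compute_panel_cat_lm <- function(data, scales) {
  model <- lm(formula = y ~ x + cat, data = data)
  data$y <- unname(fitted(model))  # replace observed y with fitted y
  data
}

# Raw column names: there is no y (or x) column, so lm() cannot find its variables
raw_attempt <- tryCatch(
  compute_panel_cat_lm(penguins),
  error = function(e) conditionMessage(e)
)
raw_attempt

# Columns renamed to the aesthetic names: the computation succeeds
renamed <- data.frame(
  x   = penguins$bill_depth_mm,
  y   = penguins$bill_length_mm,
  cat = penguins$species
)
head(compute_panel_cat_lm(renamed), 3)
```

Inside a plot, ggplot2 does this renaming for you based on aes(), which is why the Stat never sees the original column names.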

Step 2: Define new Stat. Test.

Next, we use the ggplot2::ggproto function which allows you to define a new Stat object - which will let us do computation under the hood while building our plot.

Define Stat.

StatCatLm <- ggplot2::ggproto(`_class` = "StatCatLm",
                                  `_inherit` = ggplot2::Stat,
                                  required_aes = c("x", "y", "cat"),
                                  compute_panel = compute_panel_cat_lm)
Note: You may have noticed …
  1. … that the naming convention for the ggproto object is CamelCase. The new class should also be named the same, i.e. "StatCatLm".

  2. … that we inherit from the ‘Stat’ class. In fact, your ggproto object is a subclass and you aren’t fully defining it. You simplify the definition by inheriting class properties from ggplot2::Stat.

  3. … that the compute_panel_cat_lm function is used to define our Stat’s compute_panel element. This means that data will be transformed by our compute definition panel-wise, i.e. once per facet panel, even if groups are specified, so the model sees all categories within a panel at once.

  4. … that setting required_aes to ‘x’, ‘y’, and ‘cat’ is consistent with the compute requirements. The compute assumes data to be a data frame with columns x, y, and cat. If your data doesn’t have x, y, and cat, your compute will fail. Specifying required_aes in your Stat can improve your user interface because standard ggplot2 error messages will be issued when required aesthetics are not specified, e.g. ‘stat_cat_lm() requires the following missing aesthetics: x.’
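As a small illustration of the required_aes point, the sketch below builds a plot that omits the cat aesthetic and captures the resulting error. It assumes ggplot2 and palmerpenguins are installed, and restates the compute function in base R so the block is self-contained. The exact wording of the error message can differ between ggplot2 versions, so it is printed rather than assumed.

```r
# Sketch: required_aes produces an error when an aesthetic is missing.
library(ggplot2)
penguins <- na.omit(palmerpenguins::penguins)

# Base-R restatement of the compute function (adaptation for this example)
compute_panel_cat_lm <- function(data, scales) {
  model <- lm(formula = y ~ x + cat, data = data)
  data$y <- unname(fitted(model))
  data
}

StatCatLm <- ggproto("StatCatLm", Stat,
                     required_aes = c("x", "y", "cat"),
                     compute_panel = compute_panel_cat_lm)

# 'cat' is deliberately left out of the mapping
p_missing <- ggplot(penguins) +
  aes(x = bill_depth_mm, y = bill_length_mm) +
  geom_line(stat = StatCatLm)

# The check runs when the plot is built (printing a plot also triggers a build)
build_result <- tryCatch(ggplot_build(p_missing), error = function(e) e)
inherits(build_result, "error")
conditionMessage(build_result)
```

Without required_aes, the same plot would still fail, but with a less helpful message coming from lm() rather than from ggplot2.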

Test Stat.

You can test out your Stat by using it in ggplot2 geom_*() functions.

penguins |> 
  ggplot() + 
  aes(x = bill_depth_mm,
      y = bill_length_mm,
      cat = species) + 
  geom_point() + 
  geom_point(stat = StatCatLm) +
  geom_line(stat = StatCatLm) + 
  labs(title = "Testing StatCatLm")

Note: You may have noticed …

… that we don’t use “cat_lm” as the stat argument, which would be more consistent with base ggplot2 documentation. However, if you prefer, you can refer to your newly created Stat this way when testing, i.e. geom_point(stat = "cat_lm", size = 7).

Test panel-wise behavior

last_plot() + 
  aes(color = species) + 
  facet_wrap(facets = vars(sex))
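One way to confirm the panel-wise behavior numerically is to inspect the layer data of a faceted plot. The sketch below (assuming ggplot2 and palmerpenguins are installed, with a base-R restatement of the compute function) computes the slope of the fitted line for each panel-by-group piece. The names ld, pieces, slopes, and panel_id are illustrative.

```r
# Sketch: one model per panel, so slopes match within a panel
# and may differ between panels.
library(ggplot2)
penguins <- na.omit(palmerpenguins::penguins)

# Base-R restatement of the compute function (adaptation for this example)
compute_panel_cat_lm <- function(data, scales) {
  model <- lm(formula = y ~ x + cat, data = data)
  data$y <- unname(fitted(model))
  data
}

StatCatLm <- ggproto("StatCatLm", Stat,
                     required_aes = c("x", "y", "cat"),
                     compute_panel = compute_panel_cat_lm)

p <- ggplot(penguins) +
  aes(x = bill_depth_mm, y = bill_length_mm,
      cat = species, color = species) +
  geom_line(stat = StatCatLm) +
  facet_wrap(facets = vars(sex))

ld <- layer_data(p, i = 1)

# Split the computed data into panel-by-group pieces and get each slope
pieces <- split(ld, list(ld$PANEL, ld$group), drop = TRUE)
slopes <- sapply(pieces, function(d) unname(coef(lm(y ~ x, data = d))[2]))
panel_id <- sapply(pieces, function(d) as.character(d$PANEL[1]))

# Slopes by panel: identical within a panel
tapply(slopes, panel_id, function(s) round(s, 4))
```

Within each sex panel the species lines share a slope, because a single model was fit to all of that panel's data; the two panels are fit independently.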

You might be thinking, what we’ve done would already be pretty useful to me. Can I just use my Stat as-is within geom_*() functions?

The short answer is ‘yes’! If you just want to use the Stat yourself locally in a script, there might not be much reason to go on to Step 3, user-facing functions. But if you have a wider audience in mind, e.g. colleagues within an organization or users of an open-source package, a more succinct expression of the functionality you deliver will probably be useful, i.e. write the user-facing functions.

Instead of using a geom_*() function, you might prefer to use the layer() function in your testing step. Occasionally, it’s necessary to go this route; for example, geom_vline() contains no stat argument, but you can use GeomVline in layer(). If you are teaching this content, using layer() may help you better connect this step with the next, defining the user-facing functions.

A test of StatCatLm using this method follows. You can see it is a little more verbose, as the position argument must be supplied explicitly, and fixed aesthetics such as color must be passed through the params list.

penguins |> 
  ggplot() + 
  aes(x = bill_depth_mm,
      y = bill_length_mm,
      cat = species) + 
  geom_point() + 
  layer(geom = GeomLine, 
        stat = StatCatLm, 
        position = "identity", 
        params = list(color = "blue")) + 
  labs(title = "Testing StatCatLm with layer() function")

Step 3: Define user-facing functions. Test.

In this next section, we define user-facing functions. Doing so is a bit of a mouthful, but see the ‘Pro tip: Use stat_identity definition as a template in this step …’ that follows.

stat_cat_lm <- function(mapping = NULL, data = NULL, geom = "line", position = "identity", 
    ..., show.legend = NA, inherit.aes = TRUE) {
    layer(data = data, mapping = mapping, stat = StatCatLm, 
        geom = geom, position = position, show.legend = show.legend, 
        inherit.aes = inherit.aes, params = list(na.rm = FALSE, 
            ...))
}
Note: You may have noticed…
  1. … that the stat_*() function name derives from the Stat object’s name, but is snake case. So if I wanted a StatBigCircle-based stat_*() function, I’d create stat_big_circle().

  2. … that StatCatLm is used to define the new layer function, so the computation that defines it, which is to fit lm(y ~ x + cat) and replace y with the fitted values, will be in play before the layer is rendered.

  3. … that "line" is specified as the default for the geom argument in the function. This means that ggplot2::GeomLine will be used in the layer unless otherwise specified by the user.

stat_cat_lm <- make_constructor(StatCatLm, geom = "line")

Define geom_*() function

Because users are more accustomed to using layers that have the ‘geom’ prefix, you might also define a geom_*() function with identical properties via aliasing.

geom_cat_lm <- stat_cat_lm

It is more conventional to write out scaffolding code that is nearly identical to the stat_*() definition, but has the geom fixed and the stat flexible.
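A minimal sketch of what that written-out scaffolding might look like is given here. It assumes StatCatLm is defined as in Step 2 (it is repeated so the block runs on its own) and relies on ggplot2 resolving the string "cat_lm" to StatCatLm; the argument list simply mirrors the stat_cat_lm() definition above and is not the only reasonable choice.

```r
library(ggplot2)
penguins <- remove_missing(palmerpenguins::penguins)

# Assumed to match the compute and Stat defined earlier in this recipe
compute_panel_cat_lm <- function(data, scales) {
  model <- lm(formula = y ~ x + cat, data = data)
  data$y <- model$fitted.values
  data
}

StatCatLm <- ggproto("StatCatLm", Stat,
                     required_aes = c("x", "y", "cat"),
                     compute_panel = compute_panel_cat_lm)

# Written-out geom_*(): the geom is fixed to GeomLine, the stat is an argument
geom_cat_lm <- function(mapping = NULL, data = NULL,
                        stat = "cat_lm", position = "identity",
                        ..., na.rm = FALSE, show.legend = NA,
                        inherit.aes = TRUE) {
  layer(data = data, mapping = mapping, stat = stat,
        geom = GeomLine, position = position,
        show.legend = show.legend, inherit.aes = inherit.aes,
        params = list(na.rm = na.rm, ...))
}

p_cat_lm <- ggplot(penguins) +
  aes(x = bill_depth_mm, y = bill_length_mm, cat = species) +
  geom_point() +
  geom_cat_lm(color = "maroon4")

# Inspect the render-ready data for the fitted-line layer
fitted_layer <- layer_data(p_cat_lm, i = 2)
head(fitted_layer)
```

Compared with aliasing, this version lets a user swap in a different stat while always drawing with lines.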

But soon, in the next ggplot2 release, we can use make_constructor(), which is just about as easy as aliasing and delivers the fixed-geom, flexible-stat convention, as follows:

geom_cat_lm <- make_constructor(GeomLine, stat = "cat_lm")

Test/Enjoy functions

Below we use the new function geom_cat_lm() and contrast it with geom_smooth(): geom_cat_lm() draws parallel lines from a single model, while geom_smooth() fits each group separately, so its slopes are not parallel.

penguins |> 
  ggplot() + 
  aes(x = bill_depth_mm, 
      y = bill_length_mm,
      cat = species) +
  geom_point() + 
  geom_cat_lm(color = "maroon4") +
  geom_smooth(method = "lm", 
              linewidth = .2) 
`geom_smooth()` using formula = 'y ~ x'

And check out conditionality

penguins |> 
  ggplot() + 
  aes(x = bill_depth_mm, 
      y = bill_length_mm,
      cat = species) +
  geom_point() + 
  geom_cat_lm(color = "maroon4") + 
  facet_wrap(facets = vars(sex))

Note that because panel-wise (facet-wise) computation is specified, there are, in fact, two separately estimated models, one for female and one for male. If the model is to be computed across all of the data, it’s worth considering layer-wise computation, i.e. specifying the compute_layer slot (not yet covered in these tutorials).
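To make the two-model point concrete, here is a small base-R sketch that mimics what the panel-wise compute does under facet_wrap(facets = vars(sex)): it splits the data by sex and fits the same formula to each piece. The name models_by_sex is illustrative, and na.omit() stands in for remove_missing() so that the block does not need ggplot2.

```r
penguins <- na.omit(palmerpenguins::penguins)

# One model per facet panel, mirroring what panel-wise compute does
models_by_sex <- lapply(split(penguins, penguins$sex), function(panel_data) {
  lm(bill_length_mm ~ bill_depth_mm + species, data = panel_data)
})

# Each model sees only its own panel's rows, so the coefficients differ
sapply(models_by_sex, coef)
```

Each column of the resulting matrix holds the intercept, slope, and species offsets for one panel.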

Done! Time for a review.

Here is a quick review of the functions and ggproto objects we’ve covered, dropping tests and discussion.

Note: Review
library(tidyverse)

# Step 1. Define compute
compute_panel_cat_lm <- function(data, scales){
  model <- lm(formula = y ~ x + cat, data = data)
  data |> 
    mutate(y = model$fitted.values)
}


# Step 2. Define Stat
StatCatLm <- ggproto(`_class` = "StatCatLm",
                     `_inherit` = Stat,
                     required_aes = c("x", "y", "cat"),
                     compute_panel = compute_panel_cat_lm)

# Step 3. Define user-facing functions

## define stat_*()
stat_cat_lm <- function(mapping = NULL, data = NULL, 
                         geom = "line", position = "identity", 
                         ..., show.legend = NA, inherit.aes = TRUE) 
{
    layer(data = data, mapping = mapping, stat = StatCatLm, 
        geom = geom, position = position, show.legend = show.legend, 
        inherit.aes = inherit.aes, params = rlang::list2(na.rm = FALSE, 
            ...))
}

## define geom_*()
geom_cat_lm <- stat_cat_lm

Your Turn: Write geom_cat_fitted() and geom_cat_residuals()

Using the geom_cat_lm() Recipe #4 as a reference, try to create a geom_cat_fitted() and a geom_cat_residuals() that draw, respectively, fitted values and segments between observed and fitted values for a linear model with a categorical variable.

Hint: consider what aesthetics are required for segments. We’ll give you Step 0 this time…

Step 0: use base ggplot2 to get the job done

Step 1: Write compute. Test.

Step 2: Write Stat.

Step 3: Write user-facing functions.

Congratulations!

If you’ve finished all four recipes, you should have a good feel for writing Stats, and stat_*() and geom_*() functions.