# Introduction to the Bootstrap
### Gina Reynolds, June 2019

---

# Introduction

This is a minimal example that demonstrates how to create a flipbook with data from #TidyTuesday. It walks through data wrangling and plotting pipelines built with the Tidyverse. The functions that make this possible are the work of Emi Tanaka, Garrick Aden-Buie, and myself. They are built for Xaringan, an R Markdown format for creating presentation slides, and they rely on the function `knitr:::knit_code$get()`.

The code to create the flipbook is an .Rmd that you can download [**here**](https://raw.githubusercontent.com/EvaMaeRey/little_flipbooks_library/master/tidytuesday_minimal_example/tidytuesday_minimal_example.Rmd).

---

Interested in more flipbooks? Check out:

- [the ggplot flipbook](https://evamaerey.github.io/ggplot_flipbook/ggplot_flipbook_xaringan.html)
- [The Tidyverse in Action](https://evamaerey.github.io/tidyverse_in_action/tidyverse_in_action.html)

For more about Xaringan:

- [Xaringan presentation slides](https://slides.yihui.name/xaringan/)

The sequential workflow of the Tidyverse is well suited to the incremental display of pipelines and ggplot statements:

- [www.tidyverse.org](https://www.tidyverse.org/)

---

# Set up

Okay. Let's load the `reveal for xaringan` functions for "flipbooking".

```r
library(flipbookr)
```

And load the tidyverse.

```r
library(tidyverse)
```

---
class: split-40
count: false
.column[.content[

```r
*set.seed(2019)
```
]]
.column[.content[

]]
---
class: split-40
count: false
.column[.content[

```r
set.seed(2019)  
*tibble(x = rnorm(30))
```
]]
.column[.content[

```
  # A tibble: 30 x 1
          x
      <dbl>
   1  0.739
   2 -0.515
   3 -1.64 
   4  0.916
   5 -1.27 
   6  0.738
   7 -0.783
   8  0.509
   9 -1.49 
  10 -0.319
  # … with 20 more rows
```
]]
---
class: split-40
count: false
.column[.content[

```r
set.seed(2019)  
tibble(x = rnorm(30)) %>%  
* mutate(y = x * .4 + rnorm(30))
```
]]
.column[.content[

```
  # A tibble: 30 x 2
          x       y
      <dbl>   <dbl>
   1  0.739 -0.639 
   2 -0.515  0.424 
   3 -1.64   0.105 
   4  0.916 -0.145 
   5 -1.27   0.495 
   6  0.738 -0.0881
   7 -0.783 -0.790 
   8  0.509  0.458 
   9 -1.49  -1.76  
  10 -0.319 -0.566 
  # … with 20 more rows
```
]]
---
class: split-40
count: false
.column[.content[

```r
set.seed(2019)  
tibble(x = rnorm(30)) %>%  
  mutate(y = x * .4 + rnorm(30)) %>%  
* mutate(id = 1:n())
```
]]
.column[.content[

```
  # A tibble: 30 x 3
          x       y    id
      <dbl>   <dbl> <int>
   1  0.739 -0.639      1
   2 -0.515  0.424      2
   3 -1.64   0.105      3
   4  0.916 -0.145      4
   5 -1.27   0.495      5
   6  0.738 -0.0881     6
   7 -0.783 -0.790      7
   8  0.509  0.458      8
   9 -1.49  -1.76       9
  10 -0.319 -0.566     10
  # … with 20 more rows
```
]]
---
class: split-40
count: false
.column[.content[

```r
set.seed(2019)  
tibble(x = rnorm(30)) %>%  
  mutate(y = x * .4 + rnorm(30)) %>%  
  mutate(id = 1:n()) ->  
*my_data
```
]]
.column[.content[

]]
---
class: split-40
count: false
.column[.content[

```r
set.seed(2019)  
tibble(x = rnorm(30)) %>%  
  mutate(y = x * .4 + rnorm(30)) %>%  
  mutate(id = 1:n()) ->  
my_data  
*ggplot(data = my_data)
```
]]
.column[.content[
<img src="bootstrap_files/figure-html/sample_auto_6_output-1.png" width="100%" />
]]
---
class: split-40
count: false
.column[.content[

```r
set.seed(2019)  
tibble(x = rnorm(30)) %>%  
  mutate(y = x * .4 + rnorm(30)) %>%  
  mutate(id = 1:n()) ->  
my_data  
ggplot(data = my_data) +  
* aes(x = x)
```
]]
.column[.content[
<img src="bootstrap_files/figure-html/sample_auto_7_output-1.png" width="100%" />
]]
---
class: split-40
count: false
.column[.content[

```r
set.seed(2019)  
tibble(x = rnorm(30)) %>%  
  mutate(y = x * .4 + rnorm(30)) %>%  
  mutate(id = 1:n()) ->  
my_data  
ggplot(data = my_data) +  
  aes(x = x) +  
* aes(y = y)
```
]]
.column[.content[
<img src="bootstrap_files/figure-html/sample_auto_8_output-1.png" width="100%" />
]]
---
class: split-40
count: false
.column[.content[

```r
set.seed(2019)  
tibble(x = rnorm(30)) %>%  
  mutate(y = x * .4 + rnorm(30)) %>%  
  mutate(id = 1:n()) ->  
my_data  
ggplot(data = my_data) +  
  aes(x = x) +  
  aes(y = y) +  
* geom_point()
```
]]
.column[.content[
<img src="bootstrap_files/figure-html/sample_auto_9_output-1.png" width="100%" />
]]
---
class: split-40
count: false
.column[.content[

```r
set.seed(2019)  
tibble(x = rnorm(30)) %>%  
  mutate(y = x * .4 + rnorm(30)) %>%  
  mutate(id = 1:n()) ->  
my_data  
ggplot(data = my_data) +  
  aes(x = x) +  
  aes(y = y) +  
  geom_point() +  
* geom_smooth(method = lm, se = F)
```
]]
.column[.content[
<img src="bootstrap_files/figure-html/sample_auto_10_output-1.png" width="100%" />
]]
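---

The data built above are simulated as `y = 0.4 * x + noise`, so a linear fit should recover a slope in the neighbourhood of 0.4 (only roughly, with 30 points). A minimal base-R sketch of that check follows; `sim` and `fit` are illustrative names that do not appear elsewhere in this flipbook.

```r
# Assumption: `sim` and `fit` are hypothetical names used only for this sketch.
set.seed(2019)
x <- rnorm(30)
y <- x * 0.4 + rnorm(30)       # true slope 0.4, true intercept 0
sim <- data.frame(x = x, y = y)

fit <- lm(y ~ x, data = sim)   # the same model geom_smooth(method = lm) draws
coef(fit)                      # estimated intercept and slope
```

The gap between the estimated slope and 0.4 is sampling variability, which is what the bootstrap in the following slides tries to quantify.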

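---
class: split-40
count: false
.column[.content[

```r
*my_data
```
]]
.column[.content[

]]
---
class: split-40
count: false
.column[.content[
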
```r
my_data %>%  
* crossing(boot_sample_id = 1:100)
```
]]
.column[.content[

```
  # A tibble: 3,000 x 4
         x     y    id boot_sample_id
     <dbl> <dbl> <int>          <int>
   1 -2.26 -1.23    20              1
   2 -2.26 -1.23    20              2
   3 -2.26 -1.23    20              3
   4 -2.26 -1.23    20              4
   5 -2.26 -1.23    20              5
   6 -2.26 -1.23    20              6
   7 -2.26 -1.23    20              7
   8 -2.26 -1.23    20              8
   9 -2.26 -1.23    20              9
  10 -2.26 -1.23    20             10
  # … with 2,990 more rows
```
]]
---
class: split-40
count: false
.column[.content[

```r
my_data %>%  
  crossing(boot_sample_id = 1:100) %>%  
* arrange(boot_sample_id)
```
]]
.column[.content[

```
  # A tibble: 3,000 x 4
          x      y    id boot_sample_id
      <dbl>  <dbl> <int>          <int>
   1 -2.26  -1.23     20              1
   2 -1.80  -1.51     27              1
   3 -1.77  -0.568    18              1
   4 -1.64   0.105     3              1
   5 -1.62   0.173    30              1
   6 -1.49  -1.76      9              1
   7 -1.27   0.495     5              1
   8 -1.12  -0.355    13              1
   9 -0.808 -1.18     25              1
  10 -0.783 -0.790     7              1
  # … with 2,990 more rows
```
]]
---
class: split-40
count: false
.column[.content[

```r
my_data %>%  
  crossing(boot_sample_id = 1:100) %>%  
  arrange(boot_sample_id) %>%  
* group_by(boot_sample_id)
```
]]
.column[.content[

```
  # A tibble: 3,000 x 4
  # Groups:   boot_sample_id [100]
          x      y    id boot_sample_id
      <dbl>  <dbl> <int>          <int>
   1 -2.26  -1.23     20              1
   2 -1.80  -1.51     27              1
   3 -1.77  -0.568    18              1
   4 -1.64   0.105     3              1
   5 -1.62   0.173    30              1
   6 -1.49  -1.76      9              1
   7 -1.27   0.495     5              1
   8 -1.12  -0.355    13              1
   9 -0.808 -1.18     25              1
  10 -0.783 -0.790     7              1
  # … with 2,990 more rows
```
]]
---
class: split-40
count: false
.column[.content[

```r
my_data %>%  
  crossing(boot_sample_id = 1:100) %>%  
  arrange(boot_sample_id) %>%  
  group_by(boot_sample_id) %>%  
* sample_frac(size = 1,
*             replace = TRUE)
```
]]
.column[.content[

```
  # A tibble: 3,000 x 4
  # Groups:   boot_sample_id [100]
          x        y    id boot_sample_id
      <dbl>    <dbl> <int>          <int>
   1  0.738 -0.0881      6              1
   2 -0.515  0.424       2              1
   3  0.878  1.21       17              1
   4  0.371  0.354      16              1
   5 -0.512 -0.733      26              1
   6 -0.783 -0.790       7              1
   7  2.64   1.08       29              1
   8  0.867  0.00896    23              1
   9  0.316  0.767      15              1
  10  0.738 -0.0881      6              1
  # … with 2,990 more rows
```
]]
---
class: split-40
count: false
.column[.content[

```r
my_data %>%  
  crossing(boot_sample_id = 1:100) %>%  
  arrange(boot_sample_id) %>%  
  group_by(boot_sample_id) %>%  
  sample_frac(size = 1,  
              replace = TRUE) %>%  
* arrange(boot_sample_id, id)
```
]]
.column[.content[

```
  # A tibble: 3,000 x 4
  # Groups:   boot_sample_id [100]
          x       y    id boot_sample_id
      <dbl>   <dbl> <int>          <int>
   1 -1.64   0.105      3              1
   2  0.738 -0.0881     6              1
   3  0.738 -0.0881     6              1
   4  0.738 -0.0881     6              1
   5 -0.783 -0.790      7              1
   6 -0.783 -0.790      7              1
   7 -0.783 -0.790      7              1
   8 -1.49  -1.76       9              1
   9 -1.49  -1.76       9              1
  10 -0.319 -0.566     10              1
  # … with 2,990 more rows
```
]]
---
class: split-40
count: false
.column[.content[

```r
my_data %>%  
  crossing(boot_sample_id = 1:100) %>%  
  arrange(boot_sample_id) %>%  
  group_by(boot_sample_id) %>%  
  sample_frac(size = 1,  
              replace = TRUE) %>%  
  arrange(boot_sample_id, id) %>%  
* group_by(id, boot_sample_id)
```
]]
.column[.content[

```
  # A tibble: 3,000 x 4
  # Groups:   id, boot_sample_id [1,916]
          x       y    id boot_sample_id
      <dbl>   <dbl> <int>          <int>
   1  0.739 -0.639      1              1
   2 -0.515  0.424      2              1
   3 -0.515  0.424      2              1
   4 -1.64   0.105      3              1
   5 -1.64   0.105      3              1
   6  0.916 -0.145      4              1
   7  0.738 -0.0881     6              1
   8 -0.783 -0.790      7              1
   9  0.509  0.458      8              1
  10  0.509  0.458      8              1
  # … with 2,990 more rows
```
]]
---
class: split-40
count: false
.column[.content[

```r
my_data %>%  
  crossing(boot_sample_id = 1:100) %>%  
  arrange(boot_sample_id) %>%  
  group_by(boot_sample_id) %>%  
  sample_frac(size = 1,  
              replace = TRUE) %>%  
  arrange(boot_sample_id, id) %>%  
  group_by(id, boot_sample_id) %>%  
* mutate(times_sampled = n())
```
]]
.column[.content[

```
  # A tibble: 3,000 x 5
  # Groups:   id, boot_sample_id [1,889]
          x      y    id boot_sample_id times_sampled
      <dbl>  <dbl> <int>          <int>         <int>
   1 -1.64   0.105     3              1             1
   2  0.916 -0.145     4              1             3
   3  0.916 -0.145     4              1             3
   4  0.916 -0.145     4              1             3
   5 -1.27   0.495     5              1             2
   6 -1.27   0.495     5              1             2
   7  0.509  0.458     8              1             3
   8  0.509  0.458     8              1             3
   9  0.509  0.458     8              1             3
  10 -1.12  -0.355    13              1             1
  # … with 2,990 more rows
```
]]
---
class: split-40
count: false
.column[.content[

```r
my_data %>%  
  crossing(boot_sample_id = 1:100) %>%  
  arrange(boot_sample_id) %>%  
  group_by(boot_sample_id) %>%  
  sample_frac(size = 1,  
              replace = TRUE) %>%  
  arrange(boot_sample_id, id) %>%  
  group_by(id, boot_sample_id) %>%  
  mutate(times_sampled = n()) ->  
*boot_samples
```
]]
.column[.content[

]]
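---

The pipeline above draws 100 resamples at once by crossing the data with `boot_sample_id` and calling `sample_frac(size = 1, replace = TRUE)` within each group. For intuition, a single bootstrap resample can be sketched in base R as below. This is an illustration under assumptions, not the flipbook's own code: `toy`, `idx`, and `one_resample` are hypothetical names.

```r
# Assumption: `toy` stands in for my_data; all names here are illustrative.
set.seed(2019)
toy <- data.frame(id = 1:30, x = rnorm(30))
n <- nrow(toy)

# Draw n row indices with replacement: some rows repeat, others are left out
idx <- sample(n, size = n, replace = TRUE)
one_resample <- toy[idx, ]

# How often each original row was drawn (rows never drawn get a count of 0)
times_sampled <- tabulate(idx, nbins = n)
sum(times_sampled)   # equals n, because the resample has as many rows as the data
```

Sampling with replacement at the full sample size is what makes each resample differ from the original data while keeping the same number of rows.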

---

# Ensemble plot

---
class: split-40
count: false
.column[.content[

```r
*boot_samples
```
]]
.column[.content[

```
  # A tibble: 3,000 x 5
  # Groups:   id, boot_sample_id [1,899]
          x      y    id boot_sample_id times_sampled
      <dbl>  <dbl> <int>          <int>         <int>
   1  0.739 -0.639     1              1             1
   2  0.916 -0.145     4              1             2
   3  0.916 -0.145     4              1             2
   4 -1.27   0.495     5              1             3
   5 -1.27   0.495     5              1             3
   6 -1.27   0.495     5              1             3
   7  0.509  0.458     8              1             2
   8  0.509  0.458     8              1             2
   9 -1.49  -1.76      9              1             3
  10 -1.49  -1.76      9              1             3
  # … with 2,990 more rows
```
]]
---
class: split-40
count: false
.column[.content[

```r
boot_samples %>%  
* filter(boot_sample_id <= 12)
```
]]
.column[.content[

```
  # A tibble: 360 x 5
  # Groups:   id, boot_sample_id [226]
          x      y    id boot_sample_id times_sampled
      <dbl>  <dbl> <int>          <int>         <int>
   1  0.739 -0.639     1              1             1
   2  0.916 -0.145     4              1             2
   3  0.916 -0.145     4              1             2
   4 -1.27   0.495     5              1             3
   5 -1.27   0.495     5              1             3
   6 -1.27   0.495     5              1             3
   7  0.509  0.458     8              1             2
   8  0.509  0.458     8              1             2
   9 -1.49  -1.76      9              1             3
  10 -1.49  -1.76      9              1             3
  # … with 350 more rows
```
]]
---
class: split-40
count: false
.column[.content[

```r
boot_samples %>%  
  filter(boot_sample_id <= 12) %>%  
* ggplot()
```
]]
.column[.content[
<img src="bootstrap_files/figure-html/plotboot_auto_3_output-1.png" width="100%" />
]]
---
class: split-40
count: false
.column[.content[

```r
boot_samples %>%  
  filter(boot_sample_id <= 12) %>%  
  ggplot() +  
* aes(x = x)
```
]]
.column[.content[
<img src="bootstrap_files/figure-html/plotboot_auto_4_output-1.png" width="100%" />
]]
---
class: split-40
count: false
.column[.content[

```r
boot_samples %>%  
  filter(boot_sample_id <= 12) %>%  
  ggplot() +  
  aes(x = x) +  
* aes(y = y)
```
]]
.column[.content[
<img src="bootstrap_files/figure-html/plotboot_auto_5_output-1.png" width="100%" />
]]
---
class: split-40
count: false
.column[.content[

```r
boot_samples %>%  
  filter(boot_sample_id <= 12) %>%  
  ggplot() +  
  aes(x = x) +  
  aes(y = y) +  
* geom_point(size = 3, col = "magenta")
```
]]
.column[.content[
<img src="bootstrap_files/figure-html/plotboot_auto_6_output-1.png" width="100%" />
]]
---
class: split-40
count: false
.column[.content[

```r
boot_samples %>%  
  filter(boot_sample_id <= 12) %>%  
  ggplot() +  
  aes(x = x) +  
  aes(y = y) +  
  geom_point(size = 3, col = "magenta") +  
* facet_wrap(~ paste("boot sample", boot_sample_id))
```
]]
.column[.content[
<img src="bootstrap_files/figure-html/plotboot_auto_7_output-1.png" width="100%" />
]]
---
class: split-40
count: false
.column[.content[

```r
boot_samples %>%  
  filter(boot_sample_id <= 12) %>%  
  ggplot() +  
  aes(x = x) +  
  aes(y = y) +  
  geom_point(size = 3, col = "magenta") +  
  facet_wrap(~ paste("boot sample", boot_sample_id)) +  
* aes(label = paste0(times_sampled, "X"))
```
]]
.column[.content[
<img src="bootstrap_files/figure-html/plotboot_auto_8_output-1.png" width="100%" />
]]
---
class: split-40
count: false
.column[.content[

```r
boot_samples %>%  
  filter(boot_sample_id <= 12) %>%  
  ggplot() +  
  aes(x = x) +  
  aes(y = y) +  
  geom_point(size = 3, col = "magenta") +  
  facet_wrap(~ paste("boot sample", boot_sample_id)) +  
  aes(label = paste0(times_sampled, "X")) +  
* geom_text(size = 2, col = "grey", aes(alpha = NULL))
```
]]
.column[.content[
<img src="bootstrap_files/figure-html/plotboot_auto_9_output-1.png" width="100%" />
]]
---
class: split-40
count: false
.column[.content[

```r
boot_samples %>%  
  filter(boot_sample_id <= 12) %>%  
  ggplot() +  
  aes(x = x) +  
  aes(y = y) +  
  geom_point(size = 3, col = "magenta") +  
  facet_wrap(~ paste("boot sample", boot_sample_id)) +  
  aes(label = paste0(times_sampled, "X")) +  
  geom_text(size = 2, col = "grey", aes(alpha = NULL)) +  
* aes(group = boot_sample_id)
```
]]
.column[.content[
<img src="bootstrap_files/figure-html/plotboot_auto_10_output-1.png" width="100%" />
]]
---
class: split-40
count: false
.column[.content[

```r
boot_samples %>%  
  filter(boot_sample_id <= 12) %>%  
  ggplot() +  
  aes(x = x) +  
  aes(y = y) +  
  geom_point(size = 3, col = "magenta") +  
  facet_wrap(~ paste("boot sample", boot_sample_id)) +  
  aes(label = paste0(times_sampled, "X")) +  
  geom_text(size = 2, col = "grey", aes(alpha = NULL)) +  
  aes(group = boot_sample_id) +  
* aes(alpha = times_sampled)
```
]]
.column[.content[
<img src="bootstrap_files/figure-html/plotboot_auto_11_output-1.png" width="100%" />
]]
---
class: split-40
count: false
.column[.content[

```r
boot_samples %>%  
  filter(boot_sample_id <= 12) %>%  
  ggplot() +  
  aes(x = x) +  
  aes(y = y) +  
  geom_point(size = 3, col = "magenta") +  
  facet_wrap(~ paste("boot sample", boot_sample_id)) +  
  aes(label = paste0(times_sampled, "X")) +  
  geom_text(size = 2, col = "grey", aes(alpha = NULL)) +  
  aes(group = boot_sample_id) +  
  aes(alpha = times_sampled) +  
* geom_smooth(method = lm, se = F)
```
]]
.column[.content[
<img src="bootstrap_files/figure-html/plotboot_auto_12_output-1.png" width="100%" />
]]
---
class: split-40
count: false
.column[.content[

```r
boot_samples %>%  
  filter(boot_sample_id <= 12) %>%  
  ggplot() +  
  aes(x = x) +  
  aes(y = y) +  
  geom_point(size = 3, col = "magenta") +  
  facet_wrap(~ paste("boot sample", boot_sample_id)) +  
  aes(label = paste0(times_sampled, "X")) +  
  geom_text(size = 2, col = "grey", aes(alpha = NULL)) +  
  aes(group = boot_sample_id) +  
  aes(alpha = times_sampled) +  
  geom_smooth(method = lm, se = F) +  
* ggdark::dark_mode()
```
]]
.column[.content[
<img src="bootstrap_files/figure-html/plotboot_auto_13_output-1.png" width="100%" />
]]
---
class: split-40
count: false
.column[.content[

```r
boot_samples %>%  
  filter(boot_sample_id <= 12) %>%  
  ggplot() +  
  aes(x = x) +  
  aes(y = y) +  
  geom_point(size = 3, col = "magenta") +  
  facet_wrap(~ paste("boot sample", boot_sample_id)) +  
  aes(label = paste0(times_sampled, "X")) +  
  geom_text(size = 2, col = "grey", aes(alpha = NULL)) +  
  aes(group = boot_sample_id) +  
  aes(alpha = times_sampled) +  
  geom_smooth(method = lm, se = F) +  
  ggdark::dark_mode() +  
* theme(legend.position = "none")
```
]]
.column[.content[
<img src="bootstrap_files/figure-html/plotboot_auto_14_output-1.png" width="100%" />
]]

---

```r
boot_model <- function(df){
  lm(y ~ x, data = df)
}
```
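
---

The helper above fits the same regression, `y ~ x`, to whatever data frame it receives. A quick sketch of calling it on a single data frame, assuming a hypothetical `toy` data set with columns `x` and `y`:

```r
# Same helper as on the previous slide, repeated so this sketch runs on its own
boot_model <- function(df) {
  lm(y ~ x, data = df)
}

# Assumption: `toy` and `toy_fit` are illustrative names, not flipbook objects
set.seed(2019)
toy <- data.frame(x = rnorm(30))
toy$y <- toy$x * 0.4 + rnorm(30)

toy_fit <- boot_model(toy)
class(toy_fit)   # an "lm" object
coef(toy_fit)    # intercept and slope for this one data frame
```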

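---
class: split-40
count: false
.column[.content[

```r
*boot_samples
```
]]
.column[.content[

]]
---
class: split-40
count: false
.column[.content[
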
```r
boot_samples %>%  
* group_by(boot_sample_id)
```
]]
.column[.content[

```
  # A tibble: 3,000 x 5
  # Groups:   boot_sample_id [100]
          x      y    id boot_sample_id times_sampled
      <dbl>  <dbl> <int>          <int>         <int>
   1  0.739 -0.639     1              1             1
   2  0.916 -0.145     4              1             2
   3  0.916 -0.145     4              1             2
   4 -1.27   0.495     5              1             3
   5 -1.27   0.495     5              1             3
   6 -1.27   0.495     5              1             3
   7  0.509  0.458     8              1             2
   8  0.509  0.458     8              1             2
   9 -1.49  -1.76      9              1             3
  10 -1.49  -1.76      9              1             3
  # … with 2,990 more rows
```
]]
---
class: split-40
count: false
.column[.content[

```r
boot_samples %>%  
  group_by(boot_sample_id) %>%  
* nest()
```
]]
.column[.content[

```
  # A tibble: 100 x 2
  # Groups:   boot_sample_id [100]
     boot_sample_id data             
              <int> <list>           
   1              1 <tibble [30 × 4]>
   2              2 <tibble [30 × 4]>
   3              3 <tibble [30 × 4]>
   4              4 <tibble [30 × 4]>
   5              5 <tibble [30 × 4]>
   6              6 <tibble [30 × 4]>
   7              7 <tibble [30 × 4]>
   8              8 <tibble [30 × 4]>
   9              9 <tibble [30 × 4]>
  10             10 <tibble [30 × 4]>
  # … with 90 more rows
```
]]
---
class: split-40
count: false
.column[.content[

```r
boot_samples %>%  
  group_by(boot_sample_id) %>%  
  nest() %>%  
* mutate(model = map(data, boot_model))   # model results are summarised in tidy dataframes using broom
```
]]
.column[.content[

```
  # A tibble: 100 x 3
  # Groups:   boot_sample_id [100]
     boot_sample_id data              model 
              <int> <list>            <list>
   1              1 <tibble [30 × 4]> <lm>  
   2              2 <tibble [30 × 4]> <lm>  
   3              3 <tibble [30 × 4]> <lm>  
   4              4 <tibble [30 × 4]> <lm>  
   5              5 <tibble [30 × 4]> <lm>  
   6              6 <tibble [30 × 4]> <lm>  
   7              7 <tibble [30 × 4]> <lm>  
   8              8 <tibble [30 × 4]> <lm>  
   9              9 <tibble [30 × 4]> <lm>  
  10             10 <tibble [30 × 4]> <lm>  
  # … with 90 more rows
```
]]
---
class: split-40
count: false
.column[.content[

```r
boot_samples %>%  
  group_by(boot_sample_id) %>%  
  nest() %>%  
  mutate(model = map(data, boot_model)) %>%  # model results are summarised in tidy dataframes using broom
* mutate(glance  = map(model, broom::glance),
*        augment = map(model, broom::augment),
*        tidy    = map(model, broom::tidy))
```
]]
.column[.content[

```
  # A tibble: 100 x 6
  # Groups:   boot_sample_id [100]
     boot_sample_id data          model  glance         augment       tidy        
              <int> <list>        <list> <list>         <list>        <list>      
   1              1 <tibble [30 … <lm>   <tibble [1 × … <tibble [30 … <tibble [2 …
   2              2 <tibble [30 … <lm>   <tibble [1 × … <tibble [30 … <tibble [2 …
   3              3 <tibble [30 … <lm>   <tibble [1 × … <tibble [30 … <tibble [2 …
   4              4 <tibble [30 … <lm>   <tibble [1 × … <tibble [30 … <tibble [2 …
   5              5 <tibble [30 … <lm>   <tibble [1 × … <tibble [30 … <tibble [2 …
   6              6 <tibble [30 … <lm>   <tibble [1 × … <tibble [30 … <tibble [2 …
   7              7 <tibble [30 … <lm>   <tibble [1 × … <tibble [30 … <tibble [2 …
   8              8 <tibble [30 … <lm>   <tibble [1 × … <tibble [30 … <tibble [2 …
   9              9 <tibble [30 … <lm>   <tibble [1 × … <tibble [30 … <tibble [2 …
  10             10 <tibble [30 … <lm>   <tibble [1 × … <tibble [30 … <tibble [2 …
  # … with 90 more rows
```
]]
---
class: split-40
count: false
.column[.content[

```r
boot_samples %>%  
  group_by(boot_sample_id) %>%  
  nest() %>%  
  mutate(model = map(data, boot_model)) %>%  # model results are summarised in tidy dataframes using broom
  mutate(glance  = map(model, broom::glance),  
         augment = map(model, broom::augment),  
         tidy    = map(model, broom::tidy)) ->  
*boot_samples_models
```
]]
.column[.content[

]]
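---

The `nest()` plus `map()` steps above amount to "one data frame per resample, one model per data frame". The same idea can be sketched in base R with `split()` and `lapply()`. This is a swapped-in technique for illustration, not the flipbook's method, and `toy`, `resamples`, `models`, and `slopes` are hypothetical names.

```r
# Assumption: `toy` stands in for my_data; only 5 resamples to keep output short
set.seed(2019)
toy <- data.frame(x = rnorm(30))
toy$y <- toy$x * 0.4 + rnorm(30)

# Stack 5 resamples, each tagged with a resample id
resamples <- do.call(rbind, lapply(1:5, function(b) {
  idx <- sample(nrow(toy), replace = TRUE)
  cbind(toy[idx, ], boot_sample_id = b)
}))

# split() plays the role of nest(); lapply() plays the role of map()
models <- lapply(split(resamples, resamples$boot_sample_id),
                 function(df) lm(y ~ x, data = df))

# Pull the slope out of each fitted model
slopes <- sapply(models, function(m) coef(m)[["x"]])
slopes
```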

---

# Slope coefficient confidence interval

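---
class: split-40
count: false
.column[.content[

```r
*boot_samples_models
```
]]
.column[.content[

]]
---
class: split-40
count: false
.column[.content[
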
```r
boot_samples_models %>%  
* unnest(tidy)
```
]]
.column[.content[

```
  # A tibble: 200 x 10
  # Groups:   boot_sample_id [100]
     boot_sample_id data  model glance augment term  estimate std.error statistic
              <int> <lis> <lis> <list> <list>  <chr>    <dbl>     <dbl>     <dbl>
   1              1 <tib… <lm>  <tibb… <tibbl… (Int… -0.0613      0.180   -0.340 
   2              1 <tib… <lm>  <tibb… <tibbl… x      0.560       0.171    3.28  
   3              2 <tib… <lm>  <tibb… <tibbl… (Int…  0.0738      0.139    0.532 
   4              2 <tib… <lm>  <tibb… <tibbl… x      0.549       0.107    5.14  
   5              3 <tib… <lm>  <tibb… <tibbl… (Int…  0.0708      0.132    0.538 
   6              3 <tib… <lm>  <tibb… <tibbl… x      0.410       0.121    3.39  
   7              4 <tib… <lm>  <tibb… <tibbl… (Int…  0.00706     0.143    0.0495
   8              4 <tib… <lm>  <tibb… <tibbl… x      0.403       0.109    3.70  
   9              5 <tib… <lm>  <tibb… <tibbl… (Int… -0.0901      0.134   -0.674 
  10              5 <tib… <lm>  <tibb… <tibbl… x      0.601       0.135    4.45  
  # … with 190 more rows, and 1 more variable: p.value <dbl>
```
]]
---
class: split-40
count: false
.column[.content[

```r
boot_samples_models %>%  
  unnest(tidy) %>%  
* filter(term == "x")
```
]]
.column[.content[

```
  # A tibble: 100 x 10
  # Groups:   boot_sample_id [100]
     boot_sample_id data  model glance augment term  estimate std.error statistic
              <int> <lis> <lis> <list> <list>  <chr>    <dbl>     <dbl>     <dbl>
   1              1 <tib… <lm>  <tibb… <tibbl… x        0.560     0.171      3.28
   2              2 <tib… <lm>  <tibb… <tibbl… x        0.549     0.107      5.14
   3              3 <tib… <lm>  <tibb… <tibbl… x        0.410     0.121      3.39
   4              4 <tib… <lm>  <tibb… <tibbl… x        0.403     0.109      3.70
   5              5 <tib… <lm>  <tibb… <tibbl… x        0.601     0.135      4.45
   6              6 <tib… <lm>  <tibb… <tibbl… x        0.444     0.126      3.53
   7              7 <tib… <lm>  <tibb… <tibbl… x        0.473     0.158      3.00
   8              8 <tib… <lm>  <tibb… <tibbl… x        0.281     0.163      1.72
   9              9 <tib… <lm>  <tibb… <tibbl… x        0.599     0.115      5.19
  10             10 <tib… <lm>  <tibb… <tibbl… x        0.353     0.114      3.10
  # … with 90 more rows, and 1 more variable: p.value <dbl>
```
]]
---
class: split-40
count: false
.column[.content[

```r
boot_samples_models %>%  
  unnest(tidy) %>%  
  filter(term == "x") ->  
*x_estimates
```
]]
.column[.content[

]]
---
class: split-40
count: false
.column[.content[

```r
boot_samples_models %>%  
  unnest(tidy) %>%  
  filter(term == "x") ->  
x_estimates  
*ggplot(data = x_estimates)
```
]]
.column[.content[
<img src="bootstrap_files/figure-html/confint_auto_5_output-1.png" width="100%" />
]]
---
class: split-40
count: false
.column[.content[

```r
boot_samples_models %>%  
  unnest(tidy) %>%  
  filter(term == "x") ->  
x_estimates  
ggplot(data = x_estimates) +  
* aes(x = estimate)
```
]]
.column[.content[
<img src="bootstrap_files/figure-html/confint_auto_6_output-1.png" width="100%" />
]]
---
class: split-40
count: false
.column[.content[

```r
boot_samples_models %>%  
  unnest(tidy) %>%  
  filter(term == "x") ->  
x_estimates  
ggplot(data = x_estimates) +  
  aes(x = estimate) +  
* geom_histogram()
```
]]
.column[.content[
<img src="bootstrap_files/figure-html/confint_auto_7_output-1.png" width="100%" />
]]
---
class: split-40
count: false
.column[.content[

```r
boot_samples_models %>%  
  unnest(tidy) %>%  
  filter(term == "x") ->  
x_estimates  
ggplot(data = x_estimates) +  
  aes(x = estimate) +  
  geom_histogram() +  
* geom_rug()
```
]]
.column[.content[
<img src="bootstrap_files/figure-html/confint_auto_8_output-1.png" width="100%" />
]]
---
class: split-40
count: false
.column[.content[

```r
boot_samples_models %>%  
  unnest(tidy) %>%  
  filter(term == "x") ->  
x_estimates  
ggplot(data = x_estimates) +  
  aes(x = estimate) +  
  geom_histogram() +  
  geom_rug() +  
* geom_vline(xintercept =
*              quantile(x_estimates$estimate,
*                       probs = c(.05, .95)),
*            col = "red",
*            linetype = "dashed")
```
]]
.column[.content[
<img src="bootstrap_files/figure-html/confint_auto_9_output-1.png" width="100%" />
]]
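---

The dashed lines above sit at the 5th and 95th percentiles of the bootstrap slope estimates, so together they bound a 90% percentile interval; `probs = c(.025, .975)` would give a 95% interval instead. A self-contained base-R sketch of the same calculation, with `toy` and `slopes` as illustrative names rather than objects from this flipbook:

```r
# Assumption: `toy` stands in for my_data; names are hypothetical
set.seed(2019)
toy <- data.frame(x = rnorm(30))
toy$y <- toy$x * 0.4 + rnorm(30)

# Refit the regression on 100 resamples and keep the slope each time
slopes <- replicate(100, {
  idx <- sample(nrow(toy), replace = TRUE)
  coef(lm(y ~ x, data = toy[idx, ]))[["x"]]
})

# Percentile interval: middle 90% of the bootstrap slope estimates
ci_90 <- quantile(slopes, probs = c(0.05, 0.95))
ci_90
```

With only 100 resamples the interval endpoints are themselves noisy; more resamples would stabilise them.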

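---
class: split-40
count: false
.column[.content[

```r
*boot_samples_models
```
]]
.column[.content[

]]
---
class: split-40
count: false
.column[.content[

```r
boot_samples_models %>%  
* unnest(tidy)
```
]]
.column[.content[

]]
---
class: split-40
count: false
.column[.content[
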
```r
boot_samples_models %>%  
  unnest(tidy) %>%  
* select(boot_sample_id, estimate, term)
```
]]
.column[.content[

```
  # A tibble: 200 x 3
  # Groups:   boot_sample_id [100]
     boot_sample_id estimate term       
              <int>    <dbl> <chr>      
   1              1 -0.0613  (Intercept)
   2              1  0.560   x          
   3              2  0.0738  (Intercept)
   4              2  0.549   x          
   5              3  0.0708  (Intercept)
   6              3  0.410   x          
   7              4  0.00706 (Intercept)
   8              4  0.403   x          
   9              5 -0.0901  (Intercept)
  10              5  0.601   x          
  # … with 190 more rows
```
]]
---
class: split-40
count: false
.column[.content[

```r
boot_samples_models %>%  
  unnest(tidy) %>%  
  select(boot_sample_id, estimate, term) %>%  
* spread(key = term, value = estimate)
```
]]
.column[.content[

```
  # A tibble: 100 x 3
  # Groups:   boot_sample_id [100]
     boot_sample_id `(Intercept)`     x
              <int>         <dbl> <dbl>
   1              1      -0.0613  0.560
   2              2       0.0738  0.549
   3              3       0.0708  0.410
   4              4       0.00706 0.403
   5              5      -0.0901  0.601
   6              6      -0.131   0.444
   7              7      -0.0508  0.473
   8              8       0.191   0.281
   9              9      -0.102   0.599
  10             10       0.105   0.353
  # … with 90 more rows
```
]]
---
class: split-40
count: false
.column[.content[

```r
boot_samples_models %>%  
  unnest(tidy) %>%  
  select(boot_sample_id, estimate, term) %>%  
  spread(key = term, value = estimate) %>%  
* ggplot()
```
]]
.column[.content[
<img src="bootstrap_files/figure-html/confintplus_auto_5_output-1.png" width="100%" />
]]
---
class: split-40
count: false
.column[.content[

```r
boot_samples_models %>%  
  unnest(tidy) %>%  
  select(boot_sample_id, estimate, term) %>%  
  spread(key = term, value = estimate) %>%  
  ggplot() +  
* aes(x = x)
```
]]
.column[.content[
<img src="bootstrap_files/figure-html/confintplus_auto_6_output-1.png" width="100%" />
]]
---
class: split-40
count: false
.column[.content[

```r
boot_samples_models %>%  
  unnest(tidy) %>%  
  select(boot_sample_id, estimate, term) %>%  
  spread(key = term, value = estimate) %>%  
  ggplot() +  
  aes(x = x) +  
* aes(y = `(Intercept)`)
```
]]
.column[.content[
<img src="bootstrap_files/figure-html/confintplus_auto_7_output-1.png" width="100%" />
]]
---
class: split-40
count: false
.column[.content[

```r
boot_samples_models %>%  
  unnest(tidy) %>%  
  select(boot_sample_id, estimate, term) %>%  
  spread(key = term, value = estimate) %>%  
  ggplot() +  
  aes(x = x) +  
  aes(y = `(Intercept)`) +  
* geom_point()
```
]]
.column[.content[
<img src="bootstrap_files/figure-html/confintplus_auto_8_output-1.png" width="100%" />
]]
---
class: split-40
count: false
.column[.content[

```r
boot_samples_models %>%  
  unnest(tidy) %>%  
  select(boot_sample_id, estimate, term) %>%  
  spread(key = term, value = estimate) %>%  
  ggplot() +  
  aes(x = x) +  
  aes(y = `(Intercept)`) +  
  geom_point() +  
* geom_density_2d()
```
]]
.column[.content[
<img src="bootstrap_files/figure-html/confintplus_auto_9_output-1.png" width="100%" />
]]
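---

The scatterplot above shows how the bootstrap intercept and slope estimates vary together. One way to put a number on that relationship is the correlation between the two sets of estimates. The sketch below is a base-R illustration under assumptions: `toy` and `est` are hypothetical names, and the value it prints will not match the flipbook's own resamples.

```r
# Assumption: `toy` stands in for my_data; names are illustrative
set.seed(2019)
toy <- data.frame(x = rnorm(30))
toy$y <- toy$x * 0.4 + rnorm(30)

# One row per resample, one column per coefficient
est <- t(replicate(100, {
  idx <- sample(nrow(toy), replace = TRUE)
  coef(lm(y ~ x, data = toy[idx, ]))
}))

colnames(est)                            # "(Intercept)" and "x"
cor(est[, "(Intercept)"], est[, "x"])    # joint variation of the two estimates
```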

---