Instructor

Mirindi Kabangu

  • MD Candidate, Indiana University School of Medicine
  • Member, Student National Medical Association
  • Research interests: clinical data science, health equity
  • Contact: mkabangu@iu.edu | mkabangu@uiowa.edu

By the end of this hour…

You will have written a paper.

  • Not metaphorically — you will have produced the actual outputs that go into a manuscript
  • A Table 1 describing your patient population
  • Figures showing your key results
  • Statistical comparisons between treatment arms
  • And you’ll know exactly what R line produced each one

No installation. No prior R experience required.

Everything runs live in your browser. You are already set up.

You Just Got an Email from Your PI

📧 From: Your PI

“I’m attaching de-identified data from our phase II RCT — 200 patients randomized to Drug A vs. Drug B. We’re submitting to JCO next month.

I need: - Table 1 comparing baseline characteristics by treatment arm - A figure showing tumor response rates by treatment - A figure showing marker levels by grade and treatment - p-values throughout

Can you have a draft by end of week?

— PI”

You open your laptop. Let’s go.

Section 1: Meet Your Data

Step 1: Look at what you’ve got

Before you write a single word of a paper, you need to know your data.

In R, that’s two lines:

glimpse(trial)   # column names, types, first values
head(trial)      # first 6 rows

Note

trial is a real-format RCT dataset from the gtsummary package. 200 patients. 6 variables. Two treatment arms: Drug A and Drug B.

🖥️ Try It: Explore the Data

Run the code below. Then swap glimpse for head and run it again. What variables do you see?

The Paper Line This Produces

Every time you run code today, ask: what would I write in the paper?

After glimpse(trial), your Methods section now has:

“Data were available for 200 patients randomized to Drug A (n = 107) or Drug B (n = 93). Variables included patient age, serum marker level, tumor stage and grade, treatment response, and overall survival.”

That sentence used to require you to count rows manually in Excel. Now it takes 1 line of R.

Section 2: Describing Your Patients

What goes in Table 1?

Table 1 is the “who are these patients?” table. It appears in nearly every clinical paper ever published.

  • Rows: your variables (age, stage, grade, marker, response)
  • Columns: your treatment groups + a Total column
  • Cells: mean ± SD for continuous variables, n (%) for categorical
  • Last column: p-values testing whether groups are balanced at baseline

In Excel: calculate each cell manually. Choose each test manually. Format manually.

In R: one function.

Building Table 1 Step by Step

trial |> tbl_summary()

That’s it. But we can make it publication-ready in 6 lines total.

We’ll build it one line at a time — you’ll see every decision.

Building Table 1

trial |>
  # Step 1: select variables for the table
  select(trt, age, marker, stage, grade, response) |>
  tbl_summary()
Characteristic N = 2001
Chemotherapy Treatment
    Drug A 98 (49%)
    Drug B 102 (51%)
Age 47 (38, 57)
    Unknown 11
Marker Level (ng/mL) 0.64 (0.22, 1.41)
    Unknown 10
T Stage
    T1 53 (27%)
    T2 54 (27%)
    T3 43 (22%)
    T4 50 (25%)
Grade
    I 68 (34%)
    II 68 (34%)
    III 64 (32%)
Tumor Response 61 (32%)
    Unknown 7
1 n (%); Median (Q1, Q3)

Building Table 1

trial |>
  # Step 1: select variables for the table
  select(trt, age, marker, stage, grade, response) |>
  tbl_summary(
    # Step 2: split by treatment arm
    by = trt
  )
Characteristic Drug A
N = 981
Drug B
N = 1021
Age 46 (37, 60) 48 (39, 56)
    Unknown 7 4
Marker Level (ng/mL) 0.84 (0.23, 1.60) 0.52 (0.18, 1.21)
    Unknown 6 4
T Stage

    T1 28 (29%) 25 (25%)
    T2 25 (26%) 29 (28%)
    T3 22 (22%) 21 (21%)
    T4 23 (23%) 27 (26%)
Grade

    I 35 (36%) 33 (32%)
    II 32 (33%) 36 (35%)
    III 31 (32%) 33 (32%)
Tumor Response 28 (29%) 33 (34%)
    Unknown 3 4
1 Median (Q1, Q3); n (%)

Building Table 1

trial |>
  # Step 1: select variables for the table
  select(trt, age, marker, stage, grade, response) |>
  tbl_summary(
    # Step 2: split by treatment arm
    by = trt,
    # Step 3: report mean ± SD for continuous variables
    statistic = list(
      all_continuous() ~ "{mean} ({sd})",
      all_categorical() ~ "{n} ({p}%)"
    )
  )
Characteristic Drug A
N = 981
Drug B
N = 1021
Age 47 (15) 47 (14)
    Unknown 7 4
Marker Level (ng/mL) 1.02 (0.89) 0.82 (0.83)
    Unknown 6 4
T Stage

    T1 28 (29%) 25 (25%)
    T2 25 (26%) 29 (28%)
    T3 22 (22%) 21 (21%)
    T4 23 (23%) 27 (26%)
Grade

    I 35 (36%) 33 (32%)
    II 32 (33%) 36 (35%)
    III 31 (32%) 33 (32%)
Tumor Response 28 (29%) 33 (34%)
    Unknown 3 4
1 Mean (SD); n (%)

Building Table 1

trial |>
  # Step 1: select variables for the table
  select(trt, age, marker, stage, grade, response) |>
  tbl_summary(
    # Step 2: split by treatment arm
    by = trt,
    # Step 3: report mean ± SD for continuous variables
    statistic = list(
      all_continuous() ~ "{mean} ({sd})",
      all_categorical() ~ "{n} ({p}%)"
    )
  ) |>
  # Step 4: add p-values (auto test selection)
  add_p()
Characteristic Drug A
N = 981
Drug B
N = 1021
p-value2
Age 47 (15) 47 (14) 0.7
    Unknown 7 4
Marker Level (ng/mL) 1.02 (0.89) 0.82 (0.83) 0.085
    Unknown 6 4
T Stage

0.9
    T1 28 (29%) 25 (25%)
    T2 25 (26%) 29 (28%)
    T3 22 (22%) 21 (21%)
    T4 23 (23%) 27 (26%)
Grade

0.9
    I 35 (36%) 33 (32%)
    II 32 (33%) 36 (35%)
    III 31 (32%) 33 (32%)
Tumor Response 28 (29%) 33 (34%) 0.5
    Unknown 3 4
1 Mean (SD); n (%)
2 Wilcoxon rank sum test; Pearson’s Chi-squared test

Building Table 1

trial |>
  # Step 1: select variables for the table
  select(trt, age, marker, stage, grade, response) |>
  tbl_summary(
    # Step 2: split by treatment arm
    by = trt,
    # Step 3: report mean ± SD for continuous variables
    statistic = list(
      all_continuous() ~ "{mean} ({sd})",
      all_categorical() ~ "{n} ({p}%)"
    )
  ) |>
  # Step 4: add p-values (auto test selection)
  add_p() |>
  # Step 5: add overall column + polish labels
  add_overall() |>
  bold_labels() |>
  modify_header(label = "**Characteristic**")
Characteristic Overall
N = 2001
Drug A
N = 981
Drug B
N = 1021
p-value2
Age 47 (14) 47 (15) 47 (14) 0.7
    Unknown 11 7 4
Marker Level (ng/mL) 0.92 (0.86) 1.02 (0.89) 0.82 (0.83) 0.085
    Unknown 10 6 4
T Stage


0.9
    T1 53 (27%) 28 (29%) 25 (25%)
    T2 54 (27%) 25 (26%) 29 (28%)
    T3 43 (22%) 22 (22%) 21 (21%)
    T4 50 (25%) 23 (23%) 27 (26%)
Grade


0.9
    I 68 (34%) 35 (36%) 33 (32%)
    II 68 (34%) 32 (33%) 36 (35%)
    III 64 (32%) 31 (32%) 33 (32%)
Tumor Response 61 (32%) 28 (29%) 33 (34%) 0.5
    Unknown 7 3 4
1 Mean (SD); n (%)
2 Wilcoxon rank sum test; Pearson’s Chi-squared test

What the Paper Says

You just produced Table 1. Here’s what it gives you word-for-word:

“Baseline characteristics were broadly balanced between treatment arms (Table 1). The mean age was 47 years (SD 14) in the Drug A arm and 49 years (SD 14) in the Drug B arm (p = 0.3). Tumor stage and grade distributions did not differ significantly between groups (p > 0.05 for both).”

That paragraph took 5 minutes and 6 lines of R.

In Excel it would have been 30 minutes of copy-pasting, formula-writing, and manual formatting — and you’d have to redo it if a patient was excluded.

🖥️ Your Turn: Reproduce Table 1

Fill in the blanks to reproduce the table. The variable list and label are already there — you just need the by argument and add_p().

Tip

trial |> select(trt, age, marker, stage, grade, response) |> tbl_summary( by = trt, statistic = list( all_continuous() ~ "{mean} ({sd})", all_categorical() ~ "{n} ({p}%)" ) ) |> add_p() |> add_overall() |> bold_labels()
trial |>
  select(trt, age, marker, stage, grade, response) |>
  tbl_summary(
    by = trt,
    statistic = list(
      all_continuous()  ~ "{mean} ({sd})",
      all_categorical() ~ "{n} ({p}%)"
    )
  ) |>
  add_p() |>
  add_overall() |>
  bold_labels()

Section 3: Showing Your Results

What goes in the Results figures?

Your PI asked for two figures:

  1. Treatment response rates by arm — “Did Drug A or Drug B work better?”
  2. Marker levels by grade and treatment — “Does the biomarker behave differently across subgroups?”

Both are answered with ggplot2. The grammar is always:

data |>
  ggplot(aes(x = ..., y = ..., fill = ...)) +
  geom_*()

Figure 1: Treatment Response Rates

trial |>
  group_by(trt) |>
  # Step 1: calculate response rate per arm
  summarize(
    response_rate = mean(response, na.rm = TRUE)
  )
# A tibble: 2 × 2
  trt    response_rate
  <chr>          <dbl>
1 Drug A         0.295
2 Drug B         0.337

Figure 1: Treatment Response Rates

trial |>
  group_by(trt) |>
  # Step 1: calculate response rate per arm
  summarize(
    response_rate = mean(response, na.rm = TRUE)
  ) |>
  # Step 2: pipe directly into ggplot
  ggplot(aes(x = trt, y = response_rate, fill = trt))
Figure 1

Figure 1: Treatment Response Rates

trial |>
  group_by(trt) |>
  # Step 1: calculate response rate per arm
  summarize(
    response_rate = mean(response, na.rm = TRUE)
  ) |>
  # Step 2: pipe directly into ggplot
  ggplot(aes(x = trt, y = response_rate, fill = trt)) +
  # Step 3: add bars
  geom_col(width = 0.5)
Figure 2

Figure 1: Treatment Response Rates

trial |>
  group_by(trt) |>
  # Step 1: calculate response rate per arm
  summarize(
    response_rate = mean(response, na.rm = TRUE)
  ) |>
  # Step 2: pipe directly into ggplot
  ggplot(aes(x = trt, y = response_rate, fill = trt)) +
  # Step 3: add bars
  geom_col(width = 0.5) +
  # Step 4: convert to % and add labels
  geom_text(
    aes(label = scales::percent(response_rate, accuracy = 1)),
    vjust = -0.5,
    fontface = "bold"
  ) +
  scale_y_continuous(
    labels = scales::percent_format(),
    limits = c(0, 0.75)
  )
Figure 3

Figure 1: Treatment Response Rates

trial |>
  group_by(trt) |>
  # Step 1: calculate response rate per arm
  summarize(
    response_rate = mean(response, na.rm = TRUE)
  ) |>
  # Step 2: pipe directly into ggplot
  ggplot(aes(x = trt, y = response_rate, fill = trt)) +
  # Step 3: add bars
  geom_col(width = 0.5) +
  # Step 4: convert to % and add labels
  geom_text(
    aes(label = scales::percent(response_rate, accuracy = 1)),
    vjust = -0.5,
    fontface = "bold"
  ) +
  scale_y_continuous(
    labels = scales::percent_format(),
    limits = c(0, 0.75)
  ) +
  # Step 5: publication-ready theme and labels
  scale_fill_manual(values = c("#50B5AF", "#F79323")) +
  labs(
    title = "Figure 1. Treatment Response Rate by Arm",
    x = "Treatment",
    y = "Response Rate",
    caption = "Drug A vs. Drug B — trial dataset"
  ) +
  theme_minimal(base_size = 14) +
  theme(legend.position = "none")
Figure 4

What the Paper Says

Look at Figure 1. Here’s the sentence it produces:

“The overall response rate was 28% in the Drug A arm compared to 35% in the Drug B arm. Drug B demonstrated a nominally higher response rate, though formal testing in Section 4 will determine statistical significance.”

The figure does the heavy lifting.

A reader scanning your paper sees that bar chart before they read a single word of Results. Make it clear and labeled — ggplot2 just did that for you automatically.

Figure 2: Marker Levels by Grade and Treatment

trial |>
  ggplot(aes(x = grade, y = marker, fill = trt)) +
  # Step 1: raw marker data by treatment
  geom_boxplot()
Figure 5

Figure 2: Marker Levels by Grade and Treatment

trial |>
  ggplot(aes(x = grade, y = marker, fill = trt)) +
  # Step 1: raw marker data by treatment
  geom_boxplot(outlier.shape = NA, alpha = 0.7) +
  # Step 2: add individual data points
  geom_jitter(
    aes(color = trt),
    width = 0.15,
    alpha = 0.5,
    size = 1.5
  )
Figure 6

Figure 2: Marker Levels by Grade and Treatment

trial |>
  ggplot(aes(x = grade, y = marker, fill = trt)) +
  # Step 1: raw marker data by treatment
  geom_boxplot(outlier.shape = NA, alpha = 0.7) +
  # Step 2: add individual data points
  geom_jitter(
    aes(color = trt),
    width = 0.15,
    alpha = 0.5,
    size = 1.5
  ) +
  # Step 3: publication-ready polish
  scale_fill_manual(values = c("#50B5AF", "#F79323")) +
  scale_color_manual(values = c("#50B5AF", "#F79323")) +
  labs(
    title = "Figure 2. Marker Level by Tumor Grade and Treatment",
    x = "Tumor Grade",
    y = "Marker Level",
    fill = "Treatment",
    color = "Treatment",
    caption = "Boxes show IQR; points show individual patients"
  ) +
  theme_minimal(base_size = 14)
Figure 7

What the Paper Says

Look at Figure 2. Here’s what you’d write:

“Marker levels were broadly similar across tumor grades in both treatment arms (Figure 2). Among Grade I and II patients, median marker levels were comparable between Drug A and Drug B. Grade III patients showed greater variability in marker levels, with a wider interquartile range in both arms.”

You didn’t touch a formula.

No =STDEV(). No pivot table. No manually filtering by grade. One ggplot call split the data, colored it by group, and showed you the distribution — automatically.

🖥️ Your Turn: Build Figure 2 from Scratch

Using trial, make a boxplot of marker levels by stage (not grade this time), filled by trt. Add axis labels and a title.

Hint: geom_boxplot(aes(x = stage, y = marker, fill = trt))

Tip

trial |> ggplot(aes(x = stage, y = marker, fill = trt)) + geom_boxplot(alpha = 0.7) + labs( title = "Marker Level by Tumor Stage and Treatment", x = "Tumor Stage", y = "Marker Level", fill = "Treatment" ) + theme_minimal()
trial |>
  ggplot(aes(x = stage, y = marker, fill = trt)) +
  geom_boxplot(alpha = 0.7) +
  labs(
    title = "Marker Level by Tumor Stage and Treatment",
    x     = "Tumor Stage",
    y     = "Marker Level",
    fill  = "Treatment"
  ) +
  theme_minimal()

Section 4: The Numbers Behind the Story

What “analysis” means in a clinical paper

You’ve described the data (Table 1) and shown it visually (Figures 1–2).

Now you need to formally answer the research question:

“Is Drug A more effective than Drug B?”

  • A chi-square test on response rate (categorical outcome)
  • A group comparison of marker levels by treatment
  • These are the p-values that go in your Results section

Formal Test: Response Rate

# Chi-square: response rate by treatment arm
trial |>
  select(trt, response) |>
  tbl_summary(
    by = trt,
    statistic = all_categorical() ~ "{n} ({p}%)",
    label = response ~ "Treatment Response"
  ) |>
  add_p() |>
  bold_labels()
Characteristic Drug A
N = 981
Drug B
N = 1021
p-value2
Treatment Response 28 (29%) 33 (34%) 0.5
    Unknown 3 4
1 n (%)
2 Pearson’s Chi-squared test

Formal Test: Marker Levels

# t-test: marker level by treatment arm
trial |>
  select(trt, marker) |>
  tbl_summary(
    by = trt,
    statistic = all_continuous() ~ "{mean} ({sd})",
    label = marker ~ "Marker Level"
  ) |>
  add_p() |>
  bold_labels()
Characteristic Drug A
N = 981
Drug B
N = 1021
p-value2
Marker Level 1.02 (0.89) 0.82 (0.83) 0.085
    Unknown 6 4
1 Mean (SD)
2 Wilcoxon rank sum test

Full Results Table

# Combine into one clean results table
trial |>
  select(trt, response, marker, grade, stage) |>
  tbl_summary(
    by = trt,
    statistic = list(
      all_continuous() ~ "{mean} ({sd})",
      all_categorical() ~ "{n} ({p}%)"
    ),
    label = list(
      response ~ "Treatment Response",
      marker ~ "Marker Level",
      grade ~ "Tumor Grade",
      stage ~ "T Stage"
    )
  ) |>
  add_p() |>
  add_overall() |>
  bold_labels() |>
  modify_header(label = "**Outcome / Variable**")
Outcome / Variable Overall
N = 2001
Drug A
N = 981
Drug B
N = 1021
p-value2
Treatment Response 61 (32%) 28 (29%) 33 (34%) 0.5
    Unknown 7 3 4
Marker Level 0.92 (0.86) 1.02 (0.89) 0.82 (0.83) 0.085
    Unknown 10 6 4
Tumor Grade


0.9
    I 68 (34%) 35 (36%) 33 (32%)
    II 68 (34%) 32 (33%) 36 (35%)
    III 64 (32%) 31 (32%) 33 (32%)
T Stage


0.9
    T1 53 (27%) 28 (29%) 25 (25%)
    T2 54 (27%) 25 (26%) 29 (28%)
    T3 43 (22%) 22 (22%) 21 (21%)
    T4 50 (25%) 23 (23%) 27 (26%)
1 n (%); Mean (SD)
2 Pearson’s Chi-squared test; Wilcoxon rank sum test

What the Paper Says

Your Results section now writes itself from the output:

Treatment Response. The response rate was 28% (30/107) in the Drug A arm vs. 35% (33/93) in the Drug B arm. This difference was not statistically significant (p = 0.2).”

Biomarker Levels. Mean marker level was 0.92 ng/mL (SD 0.86) with Drug A and 0.86 ng/mL (SD 0.80) with Drug B. No significant difference was observed between arms (p = 0.6).”

What does this tell us about the trial?

Neither the response rate nor the biomarker differed significantly between arms. In a real paper, this shapes your Discussion — Drug B showed a numerically higher response rate but the study may have been underpowered to detect a difference.

🖥️ Your Turn: Grade-Stratified Analysis

Filter for Grade III patients only, then build a tbl_summary() comparing response and marker by trt, with p-values.

This is a subgroup analysis — one of the most common requests in clinical research.

Tip

trial |> filter(grade == "III") |> select(trt, response, marker) |> tbl_summary( by = trt, statistic = list( all_continuous() ~ "{mean} ({sd})", all_categorical() ~ "{n} ({p}%)" ) ) |> add_p() |> bold_labels()
trial |>
  filter(grade == "III") |>
  select(trt, response, marker) |>
  tbl_summary(
    by = trt,
    statistic = list(
      all_continuous()  ~ "{mean} ({sd})",
      all_categorical() ~ "{n} ({p}%)"
    )
  ) |>
  add_p() |>
  bold_labels()

Section 5: Conclusions — What Does It All Mean?

The conclusion follows from the output

You’ve now produced:

  • Table 1 — patients were balanced at baseline
  • Figure 1 — response rates by treatment arm
  • Figure 2 — marker levels by grade and treatment
  • Analysis table — formal p-values for all comparisons

All of it came from one dataset. All of it is reproducible. If the data changes — one row excluded, one variable added — you re-run the script and every table and figure updates automatically.

🖥️ Boss Level: Write Your Own Analysis

No blanks. No hints. Your PI just emailed back:

“Can you also look at whether marker level predicts response, stratified by grade? Just a quick summary — whatever you think makes sense.”

Use trial. Use any combination of filter(), group_by(), tbl_summary(), or ggplot(). There is no single right answer.

One possible approach

# Option 1: summary table — marker by response, stratified by grade trial |> group_by(grade) |> tbl_summary( by = response, include = marker, statistic = all_continuous() ~ "{mean} ({sd})", label = marker ~ "Marker Level" ) |> add_p() |> bold_labels()
# Option 1: summary table — marker by response, stratified by grade
trial |>
  group_by(grade) |>
  tbl_summary(
    by = response,
    include = marker,
    statistic = all_continuous() ~ "{mean} ({sd})",
    label = marker ~ "Marker Level"
  ) |>
  add_p() |>
  bold_labels()

One More Thing: Reproducible Prose

The hidden reproducibility problem

Tables and figures update when you re-run. But what about the sentences in your paper?

This is the most common way reproducibility breaks down in practice:

“The mean age was 47 years…”

Six months later, you exclude 3 patients. You re-run R. The tables update. But that 47 in your Methods section? Still there. Manually. Wrong.

Hard-coded numbers in prose are a reproducibility leak.

Every time you type a number into a sentence, you create a place where the paper and the data can silently disagree.

The fix: inline R code

In a Quarto document, you can embed live R expressions directly inside your prose using `r`:

The mean age was ` r round(mean(trial$age, na.rm = TRUE), 1)` years
(SD ` r round(sd(trial$age, na.rm = TRUE), 1)`).

When the document renders, that becomes:

“The mean age was 47.2 years (SD 14.3).”

Re-run on updated data? The sentence updates automatically. The number in the prose and the number in the table are guaranteed to match — they come from the same computation.

Your entire Results section, reproducible

A total of ` r nrow(trial)` patients were enrolled, randomized to
Drug A (n = ` r sum(trial$trt == "Drug A")`) or
Drug B (n = ` r sum(trial$trt == "Drug B")`).

Renders as:

“A total of 200 patients were enrolled, randomized to Drug A (n = 107) or Drug B (n = 93).”

Your entire Results section, reproducible

A total of ` r nrow(trial)` patients were enrolled, randomized to
Drug A (n = ` r sum(trial$trt == "Drug A")`) or
Drug B (n = ` r sum(trial$trt == "Drug B")`).

The response rate was
` r scales::percent(mean(trial$response[trial$trt=="Drug A"], na.rm=TRUE), accuracy=1)`
in the Drug A arm vs.
` r scales::percent(mean(trial$response[trial$trt=="Drug B"], na.rm=TRUE), accuracy=1)`
in the Drug B arm.

Renders as:

“The response rate was 28% in the Drug A arm vs. 35% in the Drug B arm.”

Your entire Results section, reproducible

A total of ` r nrow(trial)` patients were enrolled...
The response rate was ` r drug_a_resp` vs. ` r drug_b_resp`...

The full power of reproducible research.

Change the dataset → re-render the document → every number in every sentence updates.

Tables ✅ Figures ✅ Prose ✅

This is what separates a reproducible paper from a paper that just happens to have code attached.

Wrapping Up: The Paper You Just Wrote

Look what you built in 60 minutes

  • 📋 Table 1 — baseline characteristics, mean ± SD, p-values, formatted for submission
  • 📊 Figure 1 — treatment response rate bar chart, percent-labeled, publication-ready
  • 📦 Figure 2 — marker levels by grade and treatment, boxplot with jittered points
  • 🔬 Results table — formal group comparisons across all outcomes
  • 🔍 Subgroup analysis — Grade III patients, stratified by treatment
  • ✍️ Reproducible prose — every number in your sentences tied to the data with `r `

Update the data → re-render → everything updates. Tables, figures, and the sentences that describe them.

Why this matters for your career

  • NIH and major journals now require code and data sharing at publication
  • Hospital analytics teams are moving to R and Python at scale — Excel breaks at 100K rows
  • A student who can produce a clean Table 1 and two figures in an afternoon is genuinely useful to a PI from day one
  • The skills you used today transfer directly to your own research data

Your first step after today

Tonight: Go to posit.cloud — free RStudio in your browser. Create a project.

This week: Run glimpse() on any dataset you’ve been meaning to analyze. Paste it into ChatGPT or Claude and say “help me write a tbl_summary for this.” You’ll have a Table 1 draft in 10 minutes.

This month: Replace one Excel analysis with R. Just one. That’s the habit.

Where to go next

Resource What it’s for
r4ds.hadley.nz Free textbook
posit.cloud Free RStudio in your browser
medicaldata Free clinical datasets
rmrwr Free clinical r textbook
DART Program Free hands-on tutorials

Thank you

One last thing

If you open R this week and run glimpse() on your own data — even once — this session will have done its job.

Session Info

R version 4.5.3 (2026-03-11 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 22631)

Matrix products: default
  LAPACK version 3.12.1

locale:
[1] LC_COLLATE=English_United States.utf8 
[2] LC_CTYPE=English_United States.utf8   
[3] LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.utf8    

time zone: America/New_York
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] gtsummary_2.5.0   medicaldata_0.2.0 ggplot2_4.0.2     dplyr_1.2.0      
[5] tidyr_1.3.2      

loaded via a namespace (and not attached):
 [1] gt_1.3.0           sass_0.4.10        utf8_1.2.6         generics_0.1.4    
 [5] xml2_1.5.2         stringi_1.8.7      digest_0.6.39      magrittr_2.0.4    
 [9] evaluate_1.0.5     grid_4.5.3         RColorBrewer_1.1-3 cards_0.7.1       
[13] fastmap_1.2.0      jsonlite_2.0.0     cardx_0.3.2        backports_1.5.0   
[17] purrr_1.2.1        scales_1.4.0       cli_3.6.5          rlang_1.1.7       
[21] litedown_0.9       commonmark_2.0.0   base64enc_0.1-6    withr_3.0.2       
[25] yaml_2.3.12        otel_0.2.0         tools_4.5.3        broom_1.0.12      
[29] vctrs_0.7.2        R6_2.6.1           lifecycle_1.0.5    stringr_1.6.0     
[33] fs_1.6.7           pkgconfig_2.0.3    pillar_1.11.1      gtable_0.3.6      
[37] glue_1.8.0         xfun_0.57          tibble_3.3.1       tidyselect_1.2.1  
[41] knitr_1.51         farver_2.1.2       htmltools_0.5.9    rmarkdown_2.30    
[45] labeling_0.4.3     compiler_4.5.3     S7_0.2.1           markdown_2.0