## Activate the Core Packages
library(tidyverse) ## Brings in a core of useful functions
library(gt) ## Tables
## Specialized Packages
library(webshot2) ## Output PNG or PDF files from gt
library(gtExtras) ## gt table formatting and highlighting
5 Presentation Tables
Producing good presentation tables is a basic requirement in virtually all R applications.
A good table is informative. This requires careful attention to table design and details. The small things are important!
There are quite a few packages that print tables. Even base R lets you make a simple table. Choosing a competent, full-function table package is important as you’ll want to master many of the key aspects of table-making, even for simple projects.
The package gt is the choice shown here. “gt” stands for “grammar of tables.” While this package is not part of the tidyverse, its style reflects the philosophy of ggplot2.
The main documentation for gt is given on this website: https://gt.rstudio.com/index.html
5.1 Basic Tables
Let’s get some basic data which describes properties of the Hawaiian Islands. We’ll use some statistics from Wikipedia (“Hawaiian Islands — Wikipedia, the Free Encyclopedia” 2023). Besides the data from Wikipedia, we’ll calculate the population density for each island.
## Read the data
data <- read_csv(col_names=TRUE, show_col_types=FALSE, file=
"Island, Nickname, Area, Population
Hawaiʻi, The Big Island, 4028.0, 200629
Maui, The Valley Isle, 727.2, 164221
Oʻahu, The Gathering Place, 596.7, 1016508
Kauaʻi, The Garden Isle, 552.3, 73298
Molokaʻi, The Friendly Isle, 260.0, 7345
Lānaʻi, The Pineapple Isle, 140.5, 3367
Niʻihau, The Forbidden Isle, 69.5, 84
Kahoʻolawe, The Target Isle, 44.6, 0")
## Calculate the population density
data$Density <- data$Population/data$Area
## Confirm the data
data
# A tibble: 8 × 5
Island Nickname Area Population Density
<chr> <chr> <dbl> <dbl> <dbl>
1 Hawaiʻi The Big Island 4028 200629 49.8
2 Maui The Valley Isle 727. 164221 226.
3 Oʻahu The Gathering Place 597. 1016508 1704.
4 Kauaʻi The Garden Isle 552. 73298 133.
5 Molokaʻi The Friendly Isle 260 7345 28.2
6 Lānaʻi The Pineapple Isle 140. 3367 24.0
7 Niʻihau The Forbidden Isle 69.5 84 1.21
8 Kahoʻolawe The Target Isle 44.6 0 0
This simple table confirms the data entry. But this output doesn’t work very well for a formal report. Lots of things need fixing.
5.2 Formatting in gt
Our package choice for table making, gt, starts with a simple specification. Complexity is added with statements that make specific enhancements.
5.2.1 The Default Table
It is a straightforward task to show the Hawaiian Islands data in a gt table.
## Use the data from the previous chunk
## Make a default gt table
gt(data)
| Island | Nickname | Area | Population | Density |
|---|---|---|---|---|
| Hawaiʻi | The Big Island | 4028.0 | 200629 | 49.808590 |
| Maui | The Valley Isle | 727.2 | 164221 | 225.826458 |
| Oʻahu | The Gathering Place | 596.7 | 1016508 | 1703.549522 |
| Kauaʻi | The Garden Isle | 552.3 | 73298 | 132.714105 |
| Molokaʻi | The Friendly Isle | 260.0 | 7345 | 28.250000 |
| Lānaʻi | The Pineapple Isle | 140.5 | 3367 | 23.964413 |
| Niʻihau | The Forbidden Isle | 69.5 | 84 | 1.208633 |
| Kahoʻolawe | The Target Isle | 44.6 | 0 | 0.000000 |
5.2.2 Number Formatting
Everything fits in the table now. But there are too many digits in the Density column. The large Area and Population values will be easier to read if commas are added. Also, these two columns shouldn’t show any decimal places.
## Use the data from the previous chunk
## Make a gt table with formatted numbers
gt(data) |>
fmt_number(
columns = Density, decimals = 1) |>
fmt_number(
columns = c(Area, Population),
use_seps = TRUE) |>
fmt_number(
columns = c(Area,Population),
decimals = 0)
| Island | Nickname | Area | Population | Density |
|---|---|---|---|---|
| Hawaiʻi | The Big Island | 4,028 | 200,629 | 49.8 |
| Maui | The Valley Isle | 727 | 164,221 | 225.8 |
| Oʻahu | The Gathering Place | 597 | 1,016,508 | 1,703.5 |
| Kauaʻi | The Garden Isle | 552 | 73,298 | 132.7 |
| Molokaʻi | The Friendly Isle | 260 | 7,345 | 28.2 |
| Lānaʻi | The Pineapple Isle | 140 | 3,367 | 24.0 |
| Niʻihau | The Forbidden Isle | 70 | 84 | 1.2 |
| Kahoʻolawe | The Target Isle | 45 | 0 | 0.0 |
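The two calls on Area and Population above can probably be collapsed into one. The sketch below assumes a recent gt release that provides fmt_integer(), which rounds to whole numbers and adds digit separators by default. The small islands data frame is a hypothetical stand-in for the full data.

```r
library(gt)

## Hypothetical three-row stand-in for the island data
islands <- data.frame(
  Island     = c("Maui", "Kauai", "Niihau"),
  Area       = c(727.2, 552.3, 69.5),
  Population = c(164221, 73298, 84)
)
islands$Density <- islands$Population / islands$Area

## One call per format: whole numbers with separators, then one decimal
compact <- gt(islands) |>
  fmt_integer(columns = c(Area, Population)) |>
  fmt_number(columns = Density, decimals = 1)

compact
```

If fmt_integer() is not available in your installed version, a single fmt_number() call with decimals = 0 should give the same result.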
5.2.3 Source, Units and Annotations
A good table has general descriptions and annotations about specific data.
Several statements add needed information.
Data Source: It’s a good idea to always include the source of the data as part of the table.
Measurement Units: Quantitative data need to have units specified.
Superscript: The Unicode value (“\U00B2”) is used to generate the squared symbol.
Note that here, we assign the table to the hawaii variable. That’s useful when we want to save a copy of the table.
## Use the data from the previous chunk
## Make a complex gt table
hawaii <- gt(data) |>
## Column Formatting
fmt_number(columns = Density,
decimals = 1) |>
fmt_number(columns = c(Area, Population),
use_seps = TRUE) |>
fmt_number(columns = c(Area,Population),
decimals = 0) |>
## Source Information
tab_source_note(
source_note = "Source: Wikipedia (Hawaiian_Islands") |>
## Footnotes
tab_footnote(
footnote = "miles\U00B2",
locations = cells_column_labels(columns=Area)) |>
tab_footnote(
footnote = "2020 Population",
locations = cells_column_labels(columns=Population)) |>
tab_footnote(
footnote = "people/mile\U00B2",
locations = cells_column_labels(columns=Density)) |>
tab_footnote(
footnote = "Limited or no access",
locations = cells_body(columns=Nickname,
rows=c(7,8)))
## Show the table
hawaii
| Island | Nickname | Area¹ | Population² | Density³ |
|---|---|---|---|---|
| Hawaiʻi | The Big Island | 4,028 | 200,629 | 49.8 |
| Maui | The Valley Isle | 727 | 164,221 | 225.8 |
| Oʻahu | The Gathering Place | 597 | 1,016,508 | 1,703.5 |
| Kauaʻi | The Garden Isle | 552 | 73,298 | 132.7 |
| Molokaʻi | The Friendly Isle | 260 | 7,345 | 28.2 |
| Lānaʻi | The Pineapple Isle | 140 | 3,367 | 24.0 |
| Niʻihau | The Forbidden Isle⁴ | 70 | 84 | 1.2 |
| Kahoʻolawe | The Target Isle⁴ | 45 | 0 | 0.0 |
| Source: Wikipedia (Hawaiian_Islands | ||||
| ¹ miles² | ||||
| ² 2020 Population | ||||
| ³ people/mile² | ||||
| ⁴ Limited or no access | ||||
This gt code looks complex. Indeed, detailed instructions are needed to control the formatting and location of each annotation precisely. This code chunk covers the types of information you’ll generally need to add the required metadata.
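Row positions such as rows=c(7,8) stop pointing at the right islands if the data are re-sorted. As a hedged alternative, the rows argument of cells_body() also accepts a logical expression on the table's columns. The cutoff of 100 people below is an illustrative assumption, not a definition taken from the source, and the islands data frame is a hypothetical stand-in.

```r
library(gt)

## Hypothetical stand-in for the island data
islands <- data.frame(
  Island     = c("Lanai", "Niihau", "Kahoolawe"),
  Nickname   = c("The Pineapple Isle", "The Forbidden Isle", "The Target Isle"),
  Population = c(3367, 84, 0)
)

## Pick the footnoted rows by condition instead of by position
## (the cutoff of 100 is an assumption chosen for this example)
noted <- gt(islands) |>
  tab_footnote(
    footnote  = "Limited or no access",
    locations = cells_body(columns = Nickname,
                           rows = Population < 100))

noted
```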
5.2.4 Data Column Formatting
You can change the appearance of the data in a column. Scientific names, for example, are always shown in italics.
data <- read_csv(col_names=TRUE, show_col_types=FALSE, file=
"Scientific, Common
Aleurites moluccana, Kukui
Artocarpus altilis, Breadfruit
Cocos nucifera, Coconut Palm
Cordia subcordata, Kou
Cordyline fruticosa, Ti
Dioscorea bulbifera, Air Yam
Hibiscus tiliaceus, Hau
Ipomoea cairica, Mile A Minute Vine
Morinda citrifolia, Noni
Pandanus tectorius, Hala
Saccharum officinarum, Sugarcane
Syzygium malaccense, Mountain Apple
Thespesia populnea, Milo
Zingiber zerumbet, Shampoo Ginger")
## Make a gt table
gt(data) |>
## Source Information
tab_source_note(
source_note = "Source: wildlifeofhawaii.com") |>
## Italics for the Scientific names
tab_style(
style =
cell_text(style = "italic"),
locations =
cells_body(columns = Scientific))
| Scientific | Common |
|---|---|
| Aleurites moluccana | Kukui |
| Artocarpus altilis | Breadfruit |
| Cocos nucifera | Coconut Palm |
| Cordia subcordata | Kou |
| Cordyline fruticosa | Ti |
| Dioscorea bulbifera | Air Yam |
| Hibiscus tiliaceus | Hau |
| Ipomoea cairica | Mile A Minute Vine |
| Morinda citrifolia | Noni |
| Pandanus tectorius | Hala |
| Saccharum officinarum | Sugarcane |
| Syzygium malaccense | Mountain Apple |
| Thespesia populnea | Milo |
| Zingiber zerumbet | Shampoo Ginger |
| Source: wildlifeofhawaii.com | |
5.2.5 Column Head Enhancement
The column contents are identified by the column head text. There are a number of things that help clarify and emphasize table content.
5.2.5.1 Column spanners
A spanner is a heading that spans the width of several column headers.
data <- read_csv(col_names=TRUE, show_col_types=FALSE, file=
"Mountain, Location, Feet, Meters
Everest, Nepal/Tibet, 29029, 8848
Denali, Alaska, 20310, 6190
Whitney, California, 14505, 4421
Mauna Kea, Hawai`i, 13786, 4205
Mauna Loa, Hawai`i, 13680, 4170")
## Build the table
gt(data) |>
tab_spanner(
label = "Elevation",
columns = c(Feet, Meters)) |>
tab_footnote(
footnote = "Tallest if measured from its base",
locations = cells_body(columns=Mountain, rows=4)) |>
tab_source_note(source_note = "Source: Internet search")
| Mountain | Location | Elevation: Feet | Elevation: Meters |
|---|---|---|---|
| Everest | Nepal/Tibet | 29029 | 8848 |
| Denali | Alaska | 20310 | 6190 |
| Whitney | California | 14505 | 4421 |
| Mauna Kea¹ | Hawai`i | 13786 | 4205 |
| Mauna Loa | Hawai`i | 13680 | 4170 |
| Source: Internet search | |||
| ¹ Tallest if measured from its base | |||
5.2.5.2 Column head formatting
By default, column heads have the same font weight as the table cell entries. This means your eye isn’t drawn to the information that identifies the table contents. Making the head text bold corrects this deficiency.
data <- read_csv(col_names = TRUE, show_col_types = FALSE, file =
"Name, Born, Died, Age
Kamehameha I, 1758, 1819, 61
Kamehameha II, 1797, 1824, 27
Kamehameha III, 1813, 1854, 41
Kamehameha IV, 1834, 1863, 29
Kamehameha V, 1830, 1872, 42
Lunalilo, 1835, 1874, 39
Kalākaua, 1836, 1891, 54
Lili`uokalani, 1838, 1917, 79")
## Build the table
gt(data) |>
tab_style(style = cell_text(weight = "bold"),
locations = cells_column_labels()) |>
tab_footnote(
footnote = "Exact date unknown",
locations = cells_body(columns=Born, rows=1)) |>
tab_source_note(source_note = "Source: Wikipedia")
| Name | Born | Died | Age |
|---|---|---|---|
| Kamehameha I | ¹ 1758 | 1819 | 61 |
| Kamehameha II | 1797 | 1824 | 27 |
| Kamehameha III | 1813 | 1854 | 41 |
| Kamehameha IV | 1834 | 1863 | 29 |
| Kamehameha V | 1830 | 1872 | 42 |
| Lunalilo | 1835 | 1874 | 39 |
| Kalākaua | 1836 | 1891 | 54 |
| Lili`uokalani | 1838 | 1917 | 79 |
| Source: Wikipedia | |||
| ¹ Exact date unknown | |||
5.2.6 Grouping with a Column Variable
Sometimes, the structure of a table can be improved by grouping rows that share a characteristic. The following example shows the same data twice: first ungrouped, then grouped by author.
The grouped table also colors the rows that hold the group names.
data <- read_csv(col_names=TRUE, show_col_types=FALSE, file= "Author, Title, Year
Jones, Riding the Wind, 2012
Jones, Viewing Space, 2009
Smith, People and Places, 2010
Smith, Been Seen There, 2012
Smith, Focus and Defocus, 2007
Jones, I am Out of Order, 2010")
## Ungrouped table
gt(data) |>
tab_caption(caption = "Ungrouped data") |>
tab_source_note(source_note = "Source: Example data")
| Author | Title | Year |
|---|---|---|
| Jones | Riding the Wind | 2012 |
| Jones | Viewing Space | 2009 |
| Smith | People and Places | 2010 |
| Smith | Been Seen There | 2012 |
| Smith | Focus and Defocus | 2007 |
| Jones | I am Out of Order | 2010 |
| Source: Example data | ||
## Grouped table
gt(data,
groupname_col = "Author") |>
tab_options(row_group.background.color = "lightblue") |>
tab_style(style = cell_text(weight = "bold"),
locations = cells_column_labels()) |>
tab_caption(caption = "Grouped data") |>
tab_source_note(source_note = "Source: Example data")
| Title | Year |
|---|---|
| Jones | |
| Riding the Wind | 2012 |
| Viewing Space | 2009 |
| I am Out of Order | 2010 |
| Smith | |
| People and Places | 2010 |
| Been Seen There | 2012 |
| Focus and Defocus | 2007 |
| Source: Example data | |
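Grouping can also be set up before the data reach gt. As a minimal sketch, assuming the dplyr package is loaded, a data frame grouped with group_by() is turned into row groups by gt() without the groupname_col argument. The books data frame is a hypothetical subset of the example data.

```r
library(dplyr)
library(gt)

## Hypothetical subset of the example data
books <- data.frame(
  Author = c("Jones", "Smith", "Jones"),
  Title  = c("Riding the Wind", "People and Places", "Viewing Space"),
  Year   = c(2012, 2010, 2009)
)

## group_by() marks the grouping; gt() uses it for the row groups
grouped <- books |>
  group_by(Author) |>
  gt()

grouped
```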
5.3 Outputting the Table
There are times when you need to generate a table as either a PNG graphic file or a PDF file.
Here are a few points worth noting:
Webshot2: This package is required to get either a PNG or PDF output.
Whitespace: The expand parameter puts white space on the sides of the graphic. The default value is 5.
## Output the table created in a previous chunk.
## Make it a PNG file.
hawaii |>
gtsave("Hawaiian_Islands.png", expand = 20)
## Make it a PDF document
hawaii |>
gtsave("Hawaiian_Islands.pdf")
5.4 Table Captions
Tables need a caption. This is the row at the top of the table. It generally includes a table number, such as “Table 1. The Hawaiian Islands.”
5.4.1 Quarto Captions
The chunk option tbl-cap: is the place to give the text for the table caption. Don’t put in the “Table X.” part; Quarto does automatic table numbering.
You use the chunk option label: to make links to the table number in the text. For example, there are eight Hawaiian Islands (Table 5.1).
## Uses a table generated in a previous chunk
hawaii
| Island | Nickname | Area¹ | Population² | Density³ |
|---|---|---|---|---|
| Hawaiʻi | The Big Island | 4,028 | 200,629 | 49.8 |
| Maui | The Valley Isle | 727 | 164,221 | 225.8 |
| Oʻahu | The Gathering Place | 597 | 1,016,508 | 1,703.5 |
| Kauaʻi | The Garden Isle | 552 | 73,298 | 132.7 |
| Molokaʻi | The Friendly Isle | 260 | 7,345 | 28.2 |
| Lānaʻi | The Pineapple Isle | 140 | 3,367 | 24.0 |
| Niʻihau | The Forbidden Isle⁴ | 70 | 84 | 1.2 |
| Kahoʻolawe | The Target Isle⁴ | 45 | 0 | 0.0 |
| Source: Wikipedia (Hawaiian_Islands | ||||
| ¹ miles² | ||||
| ² 2020 Population | ||||
| ³ people/mile² | ||||
| ⁴ Limited or no access | ||||
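The chunk options described in this section might be written as in the following sketch. The label tbl-hawaii and the caption text are hypothetical. Quarto expects a table label to begin with tbl- before it will number the table, and the text can then refer to it with @tbl-hawaii. The one-row table is only a stand-in so the chunk runs on its own.

```r
#| label: tbl-hawaii
#| tbl-cap: "The Hawaiian Islands"

library(gt)

## Hypothetical one-row stand-in; in this chapter it would be
## the hawaii table generated in a previous chunk
hawaii <- gt(data.frame(Island = "Maui", Area = 727.2))
hawaii
```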
5.4.2 Caption with gt
If you are generating a table for use outside Quarto, such as with a PNG or PDF file, you may want to add a table caption using gt. The tab_caption function lets you do this.
Note that you’ll need to specify your own table numbering.
## Uses a table generated in a previous chunk
hawaii <- hawaii |>
tab_caption(caption = "Table 5. The Hawaiian Islands")
hawaii
| Island | Nickname | Area¹ | Population² | Density³ |
|---|---|---|---|---|
| Hawaiʻi | The Big Island | 4,028 | 200,629 | 49.8 |
| Maui | The Valley Isle | 727 | 164,221 | 225.8 |
| Oʻahu | The Gathering Place | 597 | 1,016,508 | 1,703.5 |
| Kauaʻi | The Garden Isle | 552 | 73,298 | 132.7 |
| Molokaʻi | The Friendly Isle | 260 | 7,345 | 28.2 |
| Lānaʻi | The Pineapple Isle | 140 | 3,367 | 24.0 |
| Niʻihau | The Forbidden Isle⁴ | 70 | 84 | 1.2 |
| Kahoʻolawe | The Target Isle⁴ | 45 | 0 | 0.0 |
| Source: Wikipedia (Hawaiian_Islands | ||||
| ¹ miles² | ||||
| ² 2020 Population | ||||
| ³ people/mile² | ||||
| ⁴ Limited or no access | ||||
5.5 Special Situations
There are a few table tasks that are less commonly encountered. Here are some solutions.
5.5.2 Sorting a Table
Sometimes you need to sort a table so the rows are in a different order than the original data. The dplyr package provides the arrange function to accomplish this task.
## Data from the previous chunk
## Do the sort using dplyr
data <- data |>
arrange(Scientific)
## Make a nice table
gt(data) |>
fmt_markdown() |>
tab_source_note(
source_note = "Sources: Google, ChatGPT and Kew Plants of the World")
| Common | Scientific |
|---|---|
| Bluestar | Amsonia tabernaemontana Walter |
| Butterfly weed | Asclepias tuberosa L. |
| Butterfly Bush | Buddleja davidii Franch. |
| Coneflower | Echinacea purpurea (L.) Moench |
| Lantana | Lantana camara L. |
| Lavender | Lavandula angustifolia Mill. |
| Phlox | Phlox paniculata L. |
| Black-eyed Susan | Rudbeckia hirta L. |
| Sources: Google, ChatGPT and Kew Plants of the World | |
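arrange() sorts in ascending order by default. A small sketch of a descending sort, using dplyr's desc() helper on a hypothetical subset of the island data:

```r
library(dplyr)

## Hypothetical subset of the island data
islands <- data.frame(
  Island     = c("Maui", "Oahu", "Kauai"),
  Population = c(164221, 1016508, 73298)
)

## Largest population first; desc() reverses the sort order
by_population <- islands |>
  arrange(desc(Population))

by_population
```

The sorted data frame can then be passed to gt() in the same way as the example above.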
5.5.3 Color Stripes: Enhance with gtExtras
The gtExtras package (Mock and Sjoberg 2022) makes it quite straightforward to apply useful overall table formatting and to highlight cells.
Information and examples of gtExtras application are on the website: https://themockup.blog/posts/2022-06-13-gtextras-cran/index.html.
The following example shows how rows are emphasized by using color stripes.
## Read the data from a CSV file
data <- read_csv(col_names = TRUE,
file="data/hawaii_ethnicity_2019.csv")
## Assign annotation information
fn1 <- "The sum of the individual categories may sum to more than 100% because people who reported more than one race were tallied in each ethnicity category."
fn2 <- "Date: July 1, 2019"
source <- "Source: census.hawaii.gov"
## Create the table
gt(data) |>
## Put percent signs on all the values
fmt_percent(
scale_values = FALSE,
decimals = 1) |>
## Add standard annotations
tab_source_note(source_note = source) |>
tab_footnote(
footnote = fn1,
locations = cells_column_labels(columns=Ethnicity)) |>
tab_footnote(
footnote = fn2,
locations = cells_column_labels(columns=Ethnicity)) |>
## Shade alternate rows
gt_theme_guardian()
| Ethnicity¹,² | United States | Hawaii | Hawaii County | Honolulu County | Kauai County | Maui County |
|---|---|---|---|---|---|---|
| Asian | 7.0% | 57.3% | 45.2% | 61.8% | 51.5% | 48.2% |
| White | 78.8% | 43.5% | 57.2% | 38.6% | 52.1% | 52.1% |
| Native Hawaiian and Other Pacific Islander | 0.5% | 27.0% | 35.3% | 25.1% | 26.5% | 28.2% |
| Black or African American | 14.7% | 3.6% | 2.5% | 4.3% | 1.8% | 1.8% |
| American Indian and Alaska Native | 2.1% | 2.7% | 4.8% | 2.2% | 2.8% | 2.7% |
| Hispanic any race | 18.5% | 10.7% | 12.9% | 10.0% | 11.4% | 11.6% |
| Source: census.hawaii.gov | ||||||
| ¹ The sum of the individual categories may sum to more than 100% because people who reported more than one race were tallied in each ethnicity category. | ||||||
| ² Date: July 1, 2019 | ||||||
The gtExtras package has a variety of overall formats as well as ways to use color to highlight rows, emphasize numeric differences, and add specialized formats to values (e.g., percent signs and °F).
The addition of a bar graph to a table is another useful gtExtras function. This is shown in an example later.
5.5.4 Wrapped & Multipage Tables
Sometimes, a table is too big to fit on a single page. There are two basic strategies:
Wrap the data into parallel columns
Divide the data into separate tables that share a common header.
5.5.4.1 Wrap into two columns
This example shows that the two columns in the table do not need to have the same number of entries. This is the case when the total number of table rows is odd.
A vertical line is used to separate the two sides of the table.
## Get some data
data <- read_csv(col_names=TRUE,show_col_types=FALSE,file=
"name, id_code
R. Jones, 243
P. Smith, 126
J. Brown, 045
M. Fryer, 084
K. Laner, 312
Q. Ables, 264
H. Ramus, 165
H. Tokai, 311
J. Manus, 205")
no_entries <- nrow(data)
left_divide <- ceiling(no_entries/2) ## left side takes the extra row when the count is odd
right_divide <- left_divide + 1
## Divides the entries into two sets
left_data <- data[1:left_divide,]
right_data <- data[right_divide:no_entries,]
colnames(right_data)[1] <- "name2"
colnames(right_data)[2] <- "id_code2"
## Pad out the lengths of each set so they are equal
n <- max(length(left_data$name), length(right_data$name2))
both_data <- data.frame(left_data$name[1:n],
left_data$id_code[1:n],
right_data$name2[1:n],
right_data$id_code2[1:n])
## Make sure column names are correct
colnames(both_data) <- c("name","id_code","name2","id_code2")
## Build the table
wrapped_table <- gt(both_data) %>%
cols_label(name2 = "name") %>%
cols_label(id_code2 = "id_code") %>%
cols_align(align = "left", columns = name) %>%
sub_missing(columns = everything(),rows = everything(),
missing_text = " ") %>%
gt_add_divider(columns = "id_code",
style = "solid",
color = "gray88") %>%
tab_source_note(source_note = "Source: Not real data.")
## Output the table
wrapped_table
| name | id_code | name | id_code |
|---|---|---|---|
| R. Jones | 243 | Q. Ables | 264 |
| P. Smith | 126 | H. Ramus | 165 |
| J. Brown | 045 | H. Tokai | 311 |
| M. Fryer | 084 | J. Manus | 205 |
| K. Laner | 312 | ||
| Source: Not real data. | |||
The code should be wrapped into a function if this table structure is used very often.
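One possible shape for such a function is sketched below. The name wrap_two_columns and its suffix argument are hypothetical, and the function only prepares the data; it uses base R so that it behaves the same for data frames and tibbles.

```r
## Hypothetical helper: fold a data frame into two side-by-side halves
wrap_two_columns <- function(df, suffix = "2") {
  df <- as.data.frame(df)         ## tibbles do not allow out-of-range rows
  n_rows <- nrow(df)
  n_left <- ceiling(n_rows / 2)   ## left half takes the extra row if odd

  left  <- df[seq_len(n_left), , drop = FALSE]
  right <- df[seq_len(n_rows - n_left) + n_left, , drop = FALSE]

  ## Indexing past the last row pads the right half with NA rows
  right <- right[seq_len(n_left), , drop = FALSE]
  names(right) <- paste0(names(df), suffix)

  both <- cbind(left, right)
  rownames(both) <- NULL
  both
}

## Example with an odd number of rows
people <- data.frame(
  name    = c("R. Jones", "P. Smith", "J. Brown", "M. Fryer", "K. Laner"),
  id_code = c("243", "126", "045", "084", "312")
)

wrapped <- wrap_two_columns(people)
wrapped
```

The result can be passed to gt() and finished with cols_label(), sub_missing() and gt_add_divider() as in the chunk above.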
5.5.4.2 Divide into multiple tables
Using multiple tables is important when you are using big data sets. Dividing the tables gives you header information on each table.
Create a limited table by choosing the rows to be used to create the table. Note the structure of the statement, where data is the table content and 1:5 (followed by a comma!) gives the rows in data to be used in the table:
gt(data[1:5,])
You need one of these gt blocks for each of the tables.
A tab_header is used to create a title (and, optionally, a subtitle) instead of the tbl-cap statement. This allows all the tables to be created in a single code chunk. However, you can use multiple chunks and tbl-cap statements as an alternative.
## Use the data from a previous chunk
## Divide by specifying row numbers for each table
## NOTE: a comma comes after the row numbers
gt(data[1:5,]) |>
tab_header(title = "Divided table",
subtitle = "(1 of 2)") |>
tab_source_note(source_note = "Source: Not real data.")
| Divided table | |
| (1 of 2) | |
| name | id_code |
|---|---|
| R. Jones | 243 |
| P. Smith | 126 |
| J. Brown | 045 |
| M. Fryer | 084 |
| K. Laner | 312 |
| Source: Not real data. | |
gt(data[6:nrow(data),]) |>
tab_header(title = "Divided table",
subtitle = "(2 of 2)") |>
tab_source_note(source_note = "Source: Not real data.")
| Divided table | |
| (2 of 2) | |
| name | id_code |
|---|---|
| Q. Ables | 264 |
| H. Ramus | 165 |
| H. Tokai | 311 |
| J. Manus | 205 |
| Source: Not real data. | |
In practice, it’s likely that each table will be printed as a PDF file.
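Writing one gt block per table gets tedious as the data grow. The sketch below shows one way the row ranges and subtitles could be generated automatically. The value of rows_per_table and the people data frame are assumptions made for illustration.

```r
library(gt)

## Hypothetical data, similar to the example above
people <- data.frame(
  name    = c("R. Jones", "P. Smith", "J. Brown", "M. Fryer",
              "K. Laner", "Q. Ables", "H. Ramus"),
  id_code = c("243", "126", "045", "084", "312", "264", "165")
)

rows_per_table <- 3   ## assumed page capacity, chosen for illustration

## Assign each row to a piece number: 1, 1, 1, 2, 2, 2, 3
piece_id <- ceiling(seq_len(nrow(people)) / rows_per_table)
pieces   <- split(people, piece_id)

## Build one gt table per piece, numbering the subtitles
tables <- lapply(seq_along(pieces), function(i) {
  gt(pieces[[i]]) |>
    tab_header(title = "Divided table",
               subtitle = paste0("(", i, " of ", length(pieces), ")")) |>
    tab_source_note(source_note = "Source: Not real data.")
})

length(tables)
```

Each element of tables is an ordinary gt table, so it can be displayed on its own or passed to gtsave().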
References