5 Presentation Tables

Producing good presentation tables is a basic requirement in virtually all R applications.

A good table is informative. This requires careful attention to table design and details. The small things are important!

There are quite a few packages that print tables. Even base R lets you make a simple table. Choosing a capable, full-featured table package is important because you’ll want to master many of the key aspects of table-making, even for simple projects.

The package gt is the choice shown here. “gt” stands for “grammar of tables.” While this package is not part of the tidyverse, its style reflects the philosophy of ggplot2.

The main documentation for gt is given on this website: https://gt.rstudio.com/index.html

## Activate the Core Packages
library(tidyverse) ## Brings in a core of useful functions
library(gt)        ## Tables

## Specialized Packages
library(webshot2)  ## Output PNG or PDF files from gt
library(gtExtras)  ## gt table formatting and highlighting

5.1 Basic Tables

Let’s get some basic data which describes properties of the Hawaiian Islands. We’ll use some statistics from Wikipedia (“Hawaiian Islands — Wikipedia, the Free Encyclopedia” 2023). Besides the data from Wikipedia, we’ll calculate the population density for each island.

## Read the data
data <- read_csv(col_names=TRUE, show_col_types=FALSE, file=  
   "Island,     Nickname,           Area,      Population
    Hawaiʻi,    The Big Island,     4028.0,     200629
    Maui,       The Valley Isle,     727.2,     164221
    Oʻahu,      The Gathering Place, 596.7,    1016508
    Kauaʻi,     The Garden Isle,     552.3,      73298
    Molokaʻi,   The Friendly Isle,   260.0,       7345
    Lānaʻi,     The Pineapple Isle,  140.5,       3367
    Niʻihau,    The Forbidden Isle,   69.5,         84
    Kahoʻolawe, The Target Isle,      44.6,          0")

## Calculate the population density
data$Density <- data$Population/data$Area

## Confirm the data
data

# A tibble: 8 × 5
  Island     Nickname              Area Population Density
  <chr>      <chr>                <dbl>      <dbl>   <dbl>
1 Hawaiʻi    The Big Island      4028       200629   49.8 
2 Maui       The Valley Isle      727.      164221  226.  
3 Oʻahu      The Gathering Place  597.     1016508 1704.  
4 Kauaʻi     The Garden Isle      552.       73298  133.  
5 Molokaʻi   The Friendly Isle    260         7345   28.2 
6 Lānaʻi     The Pineapple Isle   140.        3367   24.0 
7 Niʻihau    The Forbidden Isle    69.5         84    1.21
8 Kahoʻolawe The Target Isle       44.6          0    0
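
The density step above uses base R column assignment. As a minimal sketch, the same calculation can also be written with dplyr's mutate(), which library(tidyverse) attaches. The islands tibble below is a three-row stand-in (an assumption for illustration) with values copied from the data above.

```r
library(dplyr)

## Three-row stand-in for the island data (values copied from above)
islands <- tibble(
  Island     = c("Maui", "Molokaʻi", "Kahoʻolawe"),
  Area       = c(727.2, 260.0, 44.6),
  Population = c(164221, 7345, 0)
)

## Add the Density column (people per square mile)
islands <- islands |>
  mutate(Density = Population / Area)

islands
```

Either form gives the same column; mutate() is convenient when the calculation is one step in a longer pipeline.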

This simple table confirms the data entry. But this output doesn’t work very well for a formal report. Lots of things need fixing.

5.2 Formatting in gt

Our package choice for table making, gt, starts with a simple specification. Complexity is added with statements that make specific enhancements.

5.2.1 The Default Table

It is a straightforward task to show the Hawaiian Islands data in a gt table.

## Use the data from the previous chunk

## Make a default gt table
gt(data)

Island	Nickname	Area	Population	Density
Hawaiʻi	The Big Island	4028.0	200629	49.808590
Maui	The Valley Isle	727.2	164221	225.826458
Oʻahu	The Gathering Place	596.7	1016508	1703.549522
Kauaʻi	The Garden Isle	552.3	73298	132.714105
Molokaʻi	The Friendly Isle	260.0	7345	28.250000
Lānaʻi	The Pineapple Isle	140.5	3367	23.964413
Niʻihau	The Forbidden Isle	69.5	84	1.208633
Kahoʻolawe	The Target Isle	44.6	0	0.000000

5.2.2 Number Formatting

Everything fits in the table now. But there are too many digits in the Density column. It will be easier to read the large Area and Population values if commas are added. Also, these two columns shouldn’t have any decimal values.

## Use the data from the previous chunk

## Make a gt table with formatted numbers
gt(data) |> 
  fmt_number(
    columns = Density, decimals = 1) |> 
  fmt_number(
    columns = c(Area, Population), 
    decimals = 0,
    use_seps = TRUE)

Island	Nickname	Area	Population	Density
Hawaiʻi	The Big Island	4,028	200,629	49.8
Maui	The Valley Isle	727	164,221	225.8
Oʻahu	The Gathering Place	597	1,016,508	1,703.5
Kauaʻi	The Garden Isle	552	73,298	132.7
Molokaʻi	The Friendly Isle	260	7,345	28.2
Lānaʻi	The Pineapple Isle	140	3,367	24.0
Niʻihau	The Forbidden Isle	70	84	1.2
Kahoʻolawe	The Target Isle	45	0	0.0
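
For columns that should show whole numbers, gt also provides fmt_integer(), which rounds to zero decimals and adds digit separators by default. The sketch below is one possible alternative to the formatting above, not a replacement for it; the two-row islands data frame is a stand-in with values copied from the table.

```r
library(gt)

## Stand-in data: two rows copied from the island table
islands <- data.frame(
  Island     = c("Hawaiʻi", "Oʻahu"),
  Area       = c(4028.0, 596.7),
  Population = c(200629, 1016508),
  Density    = c(49.808590, 1703.549522)
)

## Whole numbers for Area and Population, one decimal for Density
tbl <- gt(islands) |>
  fmt_integer(columns = c(Area, Population)) |>
  fmt_number(columns = Density, decimals = 1)

tbl
```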

5.2.3 Source, Units and Annotations

A good table has general descriptions and annotations about specific data.

Several statements add needed information.

Data Source: It’s a good idea to always include the source of the data as part of the table.
Measurement Units: Quantitative data need to have units specified.
Superscript: The Unicode value (“\U00B2”) is used to generate the squared symbol.

Note that here, we assign the table to the hawaii variable. That’s useful when we want to save a copy of the table.

## Use the data from the previous chunk

## Make a complex gt table
hawaii <- gt(data) |> 
  ## Column Formatting
  fmt_number(columns = Density, 
             decimals = 1) |> 
  fmt_number(columns = c(Area, Population), 
             decimals = 0,
             use_seps = TRUE) |> 
  ## Source Information
  tab_source_note(
    source_note = "Source: Wikipedia (Hawaiian_Islands)") |> 
  ## Footnotes
  tab_footnote(
    footnote = "miles\U00B2",
    locations = cells_column_labels(columns=Area)) |> 
  tab_footnote(
    footnote = "2020 Population",
    locations = cells_column_labels(columns=Population)) |> 
  tab_footnote(
    footnote = "people/mile\U00B2",
    locations = cells_column_labels(columns=Density)) |> 
  tab_footnote(
    footnote = "Limited or no access",
    locations = cells_body(columns=Nickname,
                           rows=c(7,8)))

## Show the table
hawaii

Island	Nickname	Area¹	Population²	Density³
Hawaiʻi	The Big Island	4,028	200,629	49.8
Maui	The Valley Isle	727	164,221	225.8
Oʻahu	The Gathering Place	597	1,016,508	1,703.5
Kauaʻi	The Garden Isle	552	73,298	132.7
Molokaʻi	The Friendly Isle	260	7,345	28.2
Lānaʻi	The Pineapple Isle	140	3,367	24.0
Niʻihau	The Forbidden Isle⁴	70	84	1.2
Kahoʻolawe	The Target Isle⁴	45	0	0.0
Source: Wikipedia (Hawaiian_Islands)
¹ miles²
² 2020 Population
³ people/mile²
⁴ Limited or no access

This gt code looks complex. Indeed, detailed instructions are needed to control the formatting and location of each annotation precisely. This code chunk covers the types of information you’ll generally need to add the required metadata.
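
Footnotes are one way to state units. Another option, sketched here as an assumption rather than the approach used above, is to put the units directly in the column labels with cols_label(). The islands data frame is a two-row stand-in with values copied from the table.

```r
library(gt)

## Stand-in data: two rows copied from the island table
islands <- data.frame(
  Island  = c("Maui", "Lānaʻi"),
  Area    = c(727.2, 140.5),
  Density = c(225.826458, 23.964413)
)

## Units appear in the column heads instead of in footnotes
labeled <- gt(islands) |>
  fmt_number(columns = Area, decimals = 0) |>
  fmt_number(columns = Density, decimals = 1) |>
  cols_label(Area    = "Area (miles\U00B2)",
             Density = "Density (people/mile\U00B2)")

labeled
```

Labels with units make each column self-explanatory, while footnotes keep the column heads short; which reads better depends on the table.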

5.2.4 Data Column Formatting

You can change the appearance of the data in a column. Scientific names, for example, are always shown in italics.

data <- read_csv(col_names=TRUE, show_col_types=FALSE, file= 
    "Scientific,            Common
     Aleurites moluccana,   Kukui
     Artocarpus altilis,    Breadfruit
     Cocos nucifera,        Coconut Palm
     Cordia subcordata,     Kou
     Cordyline fruticosa,   Ti
     Dioscorea bulbifera,   Air Yam
     Hibiscus tiliaceus,    Hau
     Ipomoea cairica,       Mile A Minute Vine
     Morinda citrifolia,    Noni
     Pandanus tectorius,    Hala
     Saccharum officinarum, Sugarcane
     Syzygium malaccense,   Mountain Apple
     Thespesia populnea,    Milo
     Zingiber zerumbet,     Shampoo Ginger")

## Make a gt table
gt(data) |> 
  ## Source Information
  tab_source_note(
    source_note = "Source: wildlifeofhawaii.com") |>
  ## Italics for the Scientific names
  tab_style(
    style = 
      cell_text(style = "italic"),
    locations = 
      cells_body(columns = Scientific))

Scientific	Common
Aleurites moluccana	Kukui
Artocarpus altilis	Breadfruit
Cocos nucifera	Coconut Palm
Cordia subcordata	Kou
Cordyline fruticosa	Ti
Dioscorea bulbifera	Air Yam
Hibiscus tiliaceus	Hau
Ipomoea cairica	Mile A Minute Vine
Morinda citrifolia	Noni
Pandanus tectorius	Hala
Saccharum officinarum	Sugarcane
Syzygium malaccense	Mountain Apple
Thespesia populnea	Milo
Zingiber zerumbet	Shampoo Ginger
Source: wildlifeofhawaii.com

5.2.5 Column Head Enhancement

The column contents are identified by the column head text. There are a number of things that help clarify and emphasize table content.

5.2.5.1 Column spanners

A spanner is a heading that spans the width of several column headers.

data <- read_csv(col_names=TRUE, show_col_types=FALSE, file=
        "Mountain,  Location,    Feet,  Meters
         Everest,   Nepal/Tibet, 29029, 8848
         Denali,    Alaska,      20310, 6190
         Whitney,   California,  14505, 4421
         Mauna Kea, Hawai`i,     13786, 4205
         Mauna Loa, Hawai`i,     13680, 4170")

## Build the table
gt(data) |>
   tab_spanner(
    label = "Elevation",
    columns = c(Feet, Meters)) |>
  tab_footnote(
    footnote  = "Tallest if measured from its base",
    locations = cells_body(columns=Mountain, rows=4)) |> 
  tab_source_note(source_note = "Source: Internet search")

Column spanner example
Mountain	Location	Elevation
Mountain	Location	Feet	Meters
Everest	Nepal/Tibet	29029	8848
Denali	Alaska	20310	6190
Whitney	California	14505	4421
Mauna Kea¹	Hawai`i	13786	4205
Mauna Loa	Hawai`i	13680	4170
Source: Internet search
¹ Tallest if measured from its base

5.2.5.2 Column head formatting

By default, column heads use the same font weight as the table cell entries. This means your eye isn’t drawn to the information that identifies the table contents. Making the head text bold corrects this deficiency.

data <- read_csv(col_names = TRUE, show_col_types = FALSE, file = 
        "Name,          Born, Died, Age
        Kamehameha I,   1758, 1819, 61
        Kamehameha II,  1797, 1824, 27
        Kamehameha III, 1813, 1854, 41
        Kamehameha IV,  1834, 1863, 29
        Kamehameha V,   1830, 1872, 42
        Lunalilo,       1835, 1874, 39
        Kalākaua,       1836, 1891, 54
        Lili`uokalani,  1838, 1917, 79")

## Build the table
gt(data) |>
   tab_style(style = cell_text(weight = "bold"),
             locations = cells_column_labels()) |>
   tab_footnote(
       footnote  = "Exact date unknown",
       locations = cells_body(columns=Born, rows=1)) |> 
  tab_source_note(source_note = "Source: Wikipedia")

Hawaiian Monarchs
Name	Born	Died	Age
Kamehameha I	¹ 1758	1819	61
Kamehameha II	1797	1824	27
Kamehameha III	1813	1854	41
Kamehameha IV	1834	1863	29
Kamehameha V	1830	1872	42
Lunalilo	1835	1874	39
Kalākaua	1836	1891	54
Lili`uokalani	1838	1917	79
Source: Wikipedia
¹ Exact date unknown

5.2.6 Grouping with a Column Variable

Sometimes, the structure of a table can be improved by grouping rows that share a characteristic. The following example presents the same data twice: first ungrouped, then grouped by author.

The grouped table also colors the group-label rows.

data <- read_csv(col_names=TRUE, show_col_types=FALSE, file=
      "Author, Title, Year
      Jones,  Riding the Wind,    2012
      Jones,  Viewing Space,      2009
      Smith,  People and Places,  2010
      Smith,  Been Seen There,    2012
      Smith,  Focus and Defocus,  2007
      Jones,  I am Out of Order,  2010")

## Ungrouped table
gt(data) |>
   tab_caption(caption = "Ungrouped data") |>
   tab_source_note(source_note = "Source: Example data")

Ungrouped data
Author	Title	Year
Jones	Riding the Wind	2012
Jones	Viewing Space	2009
Smith	People and Places	2010
Smith	Been Seen There	2012
Smith	Focus and Defocus	2007
Jones	I am Out of Order	2010
Source: Example data

## Grouped table
gt(data,
   groupname_col = "Author") |>
   tab_options(row_group.background.color = "lightblue") |>
   tab_style(style = cell_text(weight = "bold"),
             locations = cells_column_labels()) |>
   tab_caption(caption = "Grouped data") |>
   tab_source_note(source_note = "Source: Example data")

Grouped data
Title	Year
Jones
Riding the Wind	2012
Viewing Space	2009
I am Out of Order	2010
Smith
People and Places	2010
Been Seen There	2012
Focus and Defocus	2007
Source: Example data

5.3 Outputting the Table

There are times when you need to generate a table as either a PNG graphic file or a PDF file.

Here are a few points worth noting:

Webshot2: This package is required to get either a PNG or PDF output.
Whitespace: The expand parameter puts white space on the sides of the graphic. The default value is 5.

## Output the table created in a previous chunk.

## Make it a PNG file.
hawaii |> 
  gtsave("Hawaiian_Islands.png", expand = 20)  

## Make it a PDF document
hawaii |> 
  gtsave("Hawaiian_Islands.pdf")
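
gtsave() picks the output format from the file extension, so the same call can also write an HTML file. To my understanding, HTML output does not need the screenshot step that PNG and PDF require. The sketch below uses a small stand-in table and writes to a temporary directory; both are assumptions for illustration.

```r
library(gt)

## A tiny stand-in table; any gt object, such as hawaii, works the same way
small_table <- gt(data.frame(Island = c("Maui", "Kauaʻi"),
                             Area   = c(727.2, 552.3)))

## The .html extension selects HTML output
out_dir <- tempdir()
gtsave(small_table, filename = "small_table.html", path = out_dir)

## Confirm the file was written
file.exists(file.path(out_dir, "small_table.html"))
```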

5.4 Table Captions

Tables need a caption. The caption sits above the table and generally includes a table number, such as “Table 1. The Hawaiian Islands”.

5.4.1 Quarto Captions

The Quarto chunk option tbl-cap: is the place to give the text for the table caption. Don’t put in the “Table X.” part; Quarto does automatic table numbering.

You use the chunk option label: to make links to the table number in the text. The label must begin with tbl- for Quarto to treat it as a table cross-reference. For example, there are eight Hawaiian Islands (Table 5.1).

## Uses a table generated in a previous chunk

hawaii

Table 5.1: The Hawaiian Islands
Island	Nickname	Area¹	Population²	Density³
Hawaiʻi	The Big Island	4,028	200,629	49.8
Maui	The Valley Isle	727	164,221	225.8
Oʻahu	The Gathering Place	597	1,016,508	1,703.5
Kauaʻi	The Garden Isle	552	73,298	132.7
Molokaʻi	The Friendly Isle	260	7,345	28.2
Lānaʻi	The Pineapple Isle	140	3,367	24.0
Niʻihau	The Forbidden Isle⁴	70	84	1.2
Kahoʻolawe	The Target Isle⁴	45	0	0.0
Source: Wikipedia (Hawaiian_Islands)
¹ miles²
² 2020 Population
³ people/mile²
⁴ Limited or no access
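
The rendered chunk above hides its options. As a sketch of what such a chunk might contain, the options go at the top of the chunk as #| comments. The label tbl-islands and the small stand-in table are hypothetical; in the text, the table would then be referenced as @tbl-islands.

```r
#| label: tbl-islands
#| tbl-cap: "The Hawaiian Islands"

## Hypothetical chunk: the label starts with "tbl-" so Quarto can number it
library(gt)

## Stand-in table; in the chapter this would be the hawaii object
islands_table <- gt(data.frame(Island     = c("Maui", "Oʻahu"),
                               Population = c(164221, 1016508)))

islands_table
```

Outside Quarto the #| lines are ordinary R comments, so the chunk still runs as plain R.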

5.4.2 Caption with gt

If you are generating a table for use outside Quarto, such as with a PNG or PDF file, you may want to add a table caption using gt. The tab_caption() function lets you do this.

Note that you’ll need to specify your own table numbering.

## Uses a table generated in a previous chunk

hawaii <- hawaii |> 
  tab_caption(caption = "Table 5. The Hawaiian Islands")

hawaii

Table 5. The Hawaiian Islands
Island	Nickname	Area¹	Population²	Density³
Hawaiʻi	The Big Island	4,028	200,629	49.8
Maui	The Valley Isle	727	164,221	225.8
Oʻahu	The Gathering Place	597	1,016,508	1,703.5
Kauaʻi	The Garden Isle	552	73,298	132.7
Molokaʻi	The Friendly Isle	260	7,345	28.2
Lānaʻi	The Pineapple Isle	140	3,367	24.0
Niʻihau	The Forbidden Isle⁴	70	84	1.2
Kahoʻolawe	The Target Isle⁴	45	0	0.0
Source: Wikipedia (Hawaiian_Islands)
¹ miles²
² 2020 Population
³ people/mile²
⁴ Limited or no access

5.5 Special Situations

There are a few table tasks that are less commonly encountered. Here are some solutions.

5.5.1 Scientific Names with Authors

A proper scientific name has a binomial followed by the author of the name. The difficulty is making the binomial in italics and the author in roman type.

The solution is to put markdown formatting characters (i.e., “*” for italics) around the text to be italicized, then add a fmt_markdown() step to the gt pipeline.

Data for the following example show a list of butterfly plants. The list is from a Google search. Species names came from a ChatGPT request. The authors were obtained from Kew’s Plants of the World.

## Read the data
data <- read_csv(col_names = TRUE, file = 
   "Common,           Species,                 Author
    Butterfly Bush,   Buddleja davidii,        Franch.
    Butterfly weed,   Asclepias tuberosa,      L.
    Coneflower,       Echinacea purpurea,      (L.) Moench
    Lantana,          Lantana camara,          L.
    Bluestar,         Amsonia tabernaemontana, Walter
    Phlox,            Phlox paniculata,        L.
    Black-eyed Susan, Rudbeckia hirta,         L.
    Lavender,         Lavandula angustifolia,  Mill.")

## Put the Scientific name in italics & add the Author
data$Scientific <- paste0("*",
                       data$Species,
                       "* ", 
                       data$Author)

## Remove the unneeded columns
data <- data |> 
  select(Common, Scientific)

## Make a nice table
gt(data) |> 
  fmt_markdown() |> 
  tab_source_note(
    source_note = "Sources: Google, ChatGPT and Kew Plants of the World")

Common	Scientific
Butterfly Bush	Buddleja davidii Franch.
Butterfly weed	Asclepias tuberosa L.
Coneflower	Echinacea purpurea (L.) Moench
Lantana	Lantana camara L.
Bluestar	Amsonia tabernaemontana Walter
Phlox	Phlox paniculata L.
Black-eyed Susan	Rudbeckia hirta L.
Lavender	Lavandula angustifolia Mill.
Sources: Google, ChatGPT and Kew Plants of the World

5.5.2 Sorting a Table

Sometimes you need to sort a table so the rows are in a different order than the original data. The dplyr package provides the arrange function to accomplish this task.

## Data from the previous chunk

## Do the sort using dplyr
data <- data |> 
  arrange(Scientific)

## Make a nice table
gt(data) |> 
  fmt_markdown() |> 
  tab_source_note(
    source_note = "Sources: Google, ChatGPT and Kew Plants of the World")

Common	Scientific
Bluestar	Amsonia tabernaemontana Walter
Butterfly weed	Asclepias tuberosa L.
Butterfly Bush	Buddleja davidii Franch.
Coneflower	Echinacea purpurea (L.) Moench
Lantana	Lantana camara L.
Lavender	Lavandula angustifolia Mill.
Phlox	Phlox paniculata L.
Black-eyed Susan	Rudbeckia hirta L.
Sources: Google, ChatGPT and Kew Plants of the World
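
arrange() sorts in ascending order unless told otherwise. As a small sketch, wrapping a column in desc() reverses the order. The plants tibble is a three-row stand-in with names taken from the data above.

```r
library(dplyr)

## Three-row stand-in for the butterfly plant data
plants <- tibble(
  Common  = c("Lantana", "Bluestar", "Phlox"),
  Species = c("Lantana camara", "Amsonia tabernaemontana", "Phlox paniculata")
)

## Reverse alphabetical order by species name
plants_desc <- plants |>
  arrange(desc(Species))

plants_desc$Common
```

Listing more than one column in arrange() sorts by the first and breaks ties with the next.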

5.5.3 Color Stripes: Enhance with gtExtras

The gtExtras package (Mock and Sjoberg 2022) makes it quite straightforward to apply useful overall table formatting and highlight cells.

Information and examples of gtExtras application are on the website: https://themockup.blog/posts/2022-06-13-gtextras-cran/index.html.

The following example shows how rows are emphasized by using color stripes.

## Read the data from a CSV file
data <- read_csv(col_names = TRUE,
                 file="data/hawaii_ethnicity_2019.csv")

## Assign annotation information
fn1 <- "The sum of the individual categories may sum to more than 100% because people who reported more than one race were tallied in each ethnicity category."

fn2 <- "Date: July 1, 2019"

source <- "Source: census.hawaii.gov"

## Create the table
gt(data) |> 
  ## Put percent signs on all the values
  fmt_percent(
    scale_values = FALSE,
    decimals = 1) |> 
  ## Add standard annotations
  tab_source_note(source_note = source) |> 
  tab_footnote(
    footnote  = fn1,
    locations = cells_column_labels(columns=Ethnicity)) |> 
  tab_footnote(
    footnote  = fn2,
    locations = cells_column_labels(columns=Ethnicity)) |> 
  ## Shade alternate rows
   gt_theme_guardian()

Ethnicity¹,²	United States	Hawaii	Hawaii County	Honolulu County	Kauai County	Maui County
Asian	7.0%	57.3%	45.2%	61.8%	51.5%	48.2%
White	78.8%	43.5%	57.2%	38.6%	52.1%	52.1%
Native Hawaiian and Other Pacific Islander	0.5%	27.0%	35.3%	25.1%	26.5%	28.2%
Black or African American	14.7%	3.6%	2.5%	4.3%	1.8%	1.8%
American Indian and Alaska Native	2.1%	2.7%	4.8%	2.2%	2.8%	2.7%
Hispanic any race	18.5%	10.7%	12.9%	10.0%	11.4%	11.6%
Source: census.hawaii.gov
¹ The sum of the individual categories may sum to more than 100% because people who reported more than one race were tallied in each ethnicity category.
² Date: July 1, 2019

The gtExtras package has a variety of overall formats as well as ways to use color to highlight rows, emphasize numeric differences, and add specialized formats to values (e.g., percent signs and °F).

The addition of a bar graph to a table is another useful gtExtras function. This is shown in an example later.
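
If only the stripes are wanted, without the rest of a theme, gt itself offers opt_row_striping(). The sketch below assumes that plain striping is enough; the ethnicity data frame is a three-row stand-in with values copied from the Hawaii column above.

```r
library(gt)

## Stand-in data: three rows copied from the Hawaii column
ethnicity <- data.frame(
  Ethnicity = c("Asian", "White", "Black or African American"),
  Hawaii    = c(57.3, 43.5, 3.6)
)

## Percent signs on the values, then stripes on alternate rows
striped <- gt(ethnicity) |>
  fmt_percent(columns = Hawaii, scale_values = FALSE, decimals = 1) |>
  opt_row_striping()

striped
```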

5.5.4 Wrapped & Multipage Tables

Sometimes, a table is too big to fit on a single page. There are two basic strategies:

Wrap the data into parallel columns
Divide the data into separate tables that share a common header.

5.5.4.1 Wrap into two columns

This example shows that the two sides of the table do not need to have the same number of entries, which happens when the total number of table rows is odd.

A vertical line is used to separate the two sides of the table.

## Get some data
data <- read_csv(col_names=TRUE,show_col_types=FALSE,file=
      "name, id_code
      R. Jones, 243
      P. Smith, 126
      J. Brown, 045
      M. Fryer, 084
      K. Laner, 312
      Q. Ables, 264
      H. Ramus, 165
      H. Tokai, 311
      J. Manus, 205")

no_entries   <- nrow(data)
left_divide  <- ceiling(no_entries/2) ## left side gets the extra row when odd
right_divide <- left_divide + 1

## Divides the entries into two sets
left_data  <- data[1:left_divide,]
right_data <- data[right_divide:no_entries,]
colnames(right_data)[1] <- "name2"
colnames(right_data)[2] <- "id_code2"

## Pad out the lengths of each set so they are equal
n <- max(length(left_data$name), length(right_data$name2))
both_data <- data.frame(left_data$name[1:n],
                        left_data$id_code[1:n],
                        right_data$name2[1:n],
                        right_data$id_code2[1:n])

## Make sure column names are correct
colnames(both_data) <- c("name","id_code","name2","id_code2")

## Build the table
wrapped_table <- gt(both_data) |> 
  cols_label(name2 = "name",
             id_code2 = "id_code") |> 
  cols_align(align = "left", columns = name) |> 
  sub_missing(columns = everything(), rows = everything(),
              missing_text = " ") |> 
  gt_add_divider(columns = "id_code", 
                 style = "solid",
                 color = "gray88") |> 
  tab_source_note(source_note = "Source: Not real data.")

## Output the table
wrapped_table

Table data wrapped into two parallel columns
name	id_code	name	id_code
R. Jones	243	Q. Ables	264
P. Smith	126	H. Ramus	165
J. Brown	045	H. Tokai	311
M. Fryer	084	J. Manus	205
K. Laner	312
Source: Not real data.

The code should be wrapped into a function if this table structure is used very often.
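
One way such a function might look is sketched below. The name wrap_two_columns() and its suffix argument are hypothetical, and the helper only prepares the side-by-side data frame; the gt formatting steps would still follow as shown above.

```r
## Hypothetical helper: split a data frame into left and right halves
## and bind them side by side, padding the shorter half with NA
wrap_two_columns <- function(df, suffix = "2") {
  df     <- as.data.frame(df)            # plain data frame, so padding rows become NA
  n_left <- ceiling(nrow(df) / 2)        # left side gets the extra row when odd
  left   <- df[seq_len(n_left), , drop = FALSE]
  right  <- df[n_left + seq_len(n_left), , drop = FALSE]
  names(right) <- paste0(names(df), suffix)
  rownames(left)  <- NULL
  rownames(right) <- NULL
  cbind(left, right)
}

## Usage with a three-row stand-in data set
staff <- data.frame(name    = c("R. Jones", "P. Smith", "J. Brown"),
                    id_code = c("243", "126", "045"))
wrapped <- wrap_two_columns(staff)
wrapped
```

The result can be passed to gt() and finished with cols_label() and sub_missing() in the same way as the longer version.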

5.5.4.2 Divide into multiple tables

Splitting the data into multiple tables matters most with big data sets, because each table then carries its own header information.

Create a limited table by choosing the rows to be used to create the table. Note the structure of the statement, where data is the table content and 1:5 (followed by a comma!) gives the rows in data to be used in the table:

gt(data[1:5,])

You need one of these gt blocks for each of the tables.

tab_header() is used to create a title (and, optionally, a subtitle) instead of the tbl-cap option. This allows all the tables to be created in a single code chunk. However, you can use multiple chunks and tbl-cap options as an alternative.

## Use the data from a previous chunk

## Divide by specifying row numbers for each table
## NOTE: a comma comes after the row numbers
gt(data[1:5,]) |>
  tab_header(title = "Divided table",
             subtitle = "(1 of 2)") |>
  tab_source_note(source_note = "Source: Not real data.")

Divided table
(1 of 2)
name	id_code
R. Jones	243
P. Smith	126
J. Brown	045
M. Fryer	084
K. Laner	312
Source: Not real data.

gt(data[6:nrow(data),]) |>
  tab_header(title = "Divided table",
             subtitle = "(2 of 2)") |>
  tab_source_note(source_note = "Source: Not real data.")

Divided table
(2 of 2)
name	id_code
Q. Ables	264
H. Ramus	165
H. Tokai	311
J. Manus	205
Source: Not real data.
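
When there are more than two or three pieces, writing one gt block per table gets repetitive. The sketch below shows one way to generate the pieces in a loop with base R's split() and lapply(). The staff data frame and the rows_per_table value are assumptions for illustration.

```r
library(gt)

## Stand-in data: nine rows of made-up entries
staff <- data.frame(name    = paste("Person", 1:9),
                    id_code = sprintf("%03d", 101:109))

rows_per_table <- 5                                        # assumed rows per table
part   <- ceiling(seq_len(nrow(staff)) / rows_per_table)   # piece number for each row
pieces <- split(staff, part)

## Build one gt table per piece, numbering the subtitles
tables <- lapply(seq_along(pieces), function(i) {
  gt(pieces[[i]]) |>
    tab_header(title    = "Divided table",
               subtitle = paste0("(", i, " of ", length(pieces), ")")) |>
    tab_source_note(source_note = "Source: Not real data.")
})

length(tables)
```

Each element of tables is an ordinary gt object, so it can be printed or passed to gtsave() individually.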

In practice, it’s likely that each table will be printed as a PDF file.

References

“Hawaiian Islands — Wikipedia, the Free Encyclopedia.” 2023. https://en.wikipedia.org/wiki/Hawaiian_Islands.

Mock, Thomas, and Daniel D. Sjoberg. 2022. Package "gtExtras". https://cran.r-project.org/web/packages/gtExtras/gtExtras.pdf.