Making beautiful tables with the `gt` package

The power of the gt package

The dreaded table


I dread encountering a data table in an academic paper. A jumble of numbers, column sub-headers, and confusing footnotes, seemingly bereft of purpose. Why, I cry, is this not in the supplementary material?! Surely, I think, they could’ve created a graph! Despite my prejudice, the fact is that we often do need to use tables to present data. While a plot is a beautiful way to show trends or patterns, it falls short if your goal is to compare individual values. This year saw the release of the gt package, a new addition to the growing number of packages that can produce publication-quality tables in R. It seemed like as good a time as any to overcome my dread of tables. Here is my attempt.


The data


As a test case I decided to work with data on greenhouse gas emissions from aviation travel, recently published in Our World in Data. Hannah Ritchie does a great job explaining the complexity of calculating per capita aviation emissions and I would highly recommend giving her article a read. For now, though, we’ll side-step this complexity and focus on the simplest case, emissions from domestic travel. The map from Hannah’s post clearly shows the range and disparity in emissions across the globe, but if we want to compare individual values between countries, or maybe within continents, such a map isn’t the best option. How about a table? While the original post includes a basic table to compare emissions, I wanted to see whether I could create a table of my own using only R. A table that I might be happy to encounter in a publication.


The basics: Clean the data and add some context


We’ll need dplyr to tidy some of the data and gt to build our tables. scales will be used for creating colour palettes.

library(dplyr)
library(scales)
library(gt)
packageVersion("gt")
## [1] '0.2.2'

gt integrates well into the existing tidyverse, and creating a gt table is as simple as two lines of code.

NOTE: For this example we’re just showing the worst emitters.

#Load data and arrange in descending order of emissions
emissions_data <- read.csv(here::here("static/data/per-capita-co2-domestic-aviation.csv")) %>% 
  arrange(desc(Per.capita.domestic.aviation.CO2))

#Generate a gt table from head of data
head(emissions_data) %>% 
  gt()
Entity         Code  Per.capita.domestic.aviation.CO2
United States  USA   0.386
Australia      AUS   0.267
Norway         NOR   0.209
New Zealand    NZL   0.174
Canada         CAN   0.168
Japan          JPN   0.074

A table, yes, but not one you’re likely to publish! To take our first steps towards table perfection we need to tidy up the data and add a title and data source, so the reader can understand what’s going on. While much of the tidying could be completed using dplyr before converting to a table, I’ll demonstrate how this can be achieved exclusively inside gt.

TIP: Notice how we select columns using the vars() helper. For larger, more complex data, tidy-select functions like starts_with() or contains() may come in handy. (This post uses gt 0.2.2; to my knowledge, later releases prefer c() or bare column names over vars().)

(emissions_table <- head(emissions_data) %>% 
   gt() %>% 
   #Hide unwanted columns
   cols_hide(columns = vars(Code)) %>% 
   #Rename columns
   cols_label(Entity = "Country",
              Per.capita.domestic.aviation.CO2 = "Per capita emissions (tonnes)") %>% 
   #Add a table title
   #Notice the `md` function allows us to write the title using markdown syntax (which allows HTML)
   tab_header(title = md("Comparison of per capita CO<sub>2</sub> emissions from domestic aviation (2018)")) %>% 
   #Add a data source footnote
   tab_source_note(source_note = "Data: Graver, Zhang, & Rutherford (2019) [via Our World in Data]"))
Comparison of per capita CO2 emissions from domestic aviation (2018)
Country        Per capita emissions (tonnes)
United States  0.386
Australia      0.267
Norway         0.209
New Zealand    0.174
Canada         0.168
Japan          0.074
Data: Graver, Zhang, & Rutherford (2019) [via Our World in Data]
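
To make the tip about select helpers a bit more concrete, here is a minimal sketch on a made-up data frame (the countries, codes and values are invented for illustration and have nothing to do with the emissions data). It assumes cols_hide() accepts starts_with() in its columns argument, which is how I read the gt documentation.

```r
library(gt)

#Toy data: two 'code' columns we don't want to display (all values invented)
toy_data <- data.frame(
  country   = c("Atlantis", "Narnia"),
  code_iso3 = c("XAA", "XBB"),
  code_fips = c("XA", "XB"),
  emissions = c(0.4, 0.2)
)

#Hide every column whose name starts with "code" in a single call
toy_table <- toy_data %>%
  gt() %>%
  cols_hide(columns = starts_with("code"))

#Render to an HTML string to check what actually ends up in the table
toy_html <- as.character(as_raw_html(toy_table, inline_css = FALSE))
grepl("XAA", toy_html, fixed = TRUE)
```

If the helper works as assumed, the final check should come back FALSE, because the hidden columns are never rendered.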

A few lines of code and the table is already much better. To me there is still one issue that needs to be fixed before we’ve finished the basics. While it’s technically correct to report emissions in tonnes, I feel the data would be much easier to read in kilograms. For this we’ll use fmt_number().

TIP: There are fmt_xxx() functions for many different data types, including fmt_currency() and fmt_date().

(emissions_table <- emissions_table %>% 
   #Format numeric column. Use `scale_by` to multiply by 1,000 (tonnes to kg). (Note: we'll need to rename the column again)
   fmt_number(columns = vars(Per.capita.domestic.aviation.CO2),
              scale_by = 1000) %>%
   #Our second call to cols_label overwrites our first
   cols_label(Per.capita.domestic.aviation.CO2 = "Per capita emissions (kg)"))
Comparison of per capita CO2 emissions from domestic aviation (2018)
Country        Per capita emissions (kg)
United States  385.52
Australia      267.17
Norway         209.23
New Zealand    174.19
Canada         168.27
Japan          73.96
Data: Graver, Zhang, & Rutherford (2019) [via Our World in Data]
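
It is worth spelling out what scale_by does, because it is easy to get the direction wrong: the value is multiplied by the scaling factor before it is formatted. Below is a small sketch using two of the values from the table above, expressed in tonnes. The data frame and its column names are my own toy example.

```r
library(gt)

#Two values from the table above, in tonnes
toy_data <- data.frame(
  country = c("United States", "Japan"),
  tonnes  = c(0.38552, 0.07396)
)

toy_table <- toy_data %>%
  gt() %>%
  #scale_by multiplies before formatting: 0.38552 t * 1000 = 385.52 kg
  fmt_number(columns = "tonnes", scale_by = 1000, decimals = 2)

#Render to an HTML string and look for the converted value
toy_html <- as.character(as_raw_html(toy_table, inline_css = FALSE))
grepl("385.52", toy_html, fixed = TRUE)
```

The underlying data are left untouched. Only the displayed text changes, which is why the column still needs relabelling.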


The touchup: Adding some style and colour


We’ve got a working table, but it won’t be winning any points for style. Our next step is to change the style of different cells to help the reader more clearly find the information they’re after. For this we’ll use the tab_style() function. The style choices here follow some of the ‘Ten Guidelines for Better Tables’ from Jon Schwabish.

Firstly, we need to distinguish more clearly between the column headers and the body of the table (and while we’re at it, also the title!).

(emissions_table <- emissions_table %>% 
   #Apply new style to all column headers
   tab_style(
     locations = cells_column_labels(columns = everything()),
     style     = list(
       #Give a thick border below
       cell_borders(sides = "bottom", weight = px(3)),
       #Make text bold
       cell_text(weight = "bold")
     )
   ) %>% 
   #Apply different style to the title
   tab_style(
     locations = cells_title(groups = "title"),
     style     = list(
       cell_text(weight = "bold", size = 24)
     )
   ))
Comparison of per capita CO2 emissions from domestic aviation (2018)
Country        Per capita emissions (kg)
United States  385.52
Australia      267.17
Norway         209.23
New Zealand    174.19
Canada         168.27
Japan          73.96
Data: Graver, Zhang, & Rutherford (2019) [via Our World in Data]
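
tab_style() is not limited to headers and titles. As a rough sketch (toy data frame, and the 300 kg cut-off is an arbitrary choice of mine), cells_body() can target individual body cells, here bolding and shading any value above a threshold.

```r
library(gt)

#A few values from the table above, already in kg
toy_data <- data.frame(
  country = c("United States", "Norway", "Japan"),
  kg      = c(385.52, 209.23, 73.96)
)

toy_table <- toy_data %>%
  gt() %>%
  tab_style(
    #Target only the cells in 'kg' where the value exceeds 300 (arbitrary cut-off)
    locations = cells_body(columns = "kg", rows = kg > 300),
    style     = list(
      cell_fill(color = "#FEF0D9"),
      cell_text(weight = "bold")
    )
  )

toy_table
```

The locations argument decides where a style lands, and the style list decides what it looks like, so the same pattern works for any part of the table.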

As our reader is interested in comparing emissions values between countries, we can add a heatmap to our cells to more clearly show the differences. This will require us to set a colour palette and apply a conditional colouring using data_color(). We’ll use the same palette employed in Hannah Ritchie’s map above.

#Apply our palette explicitly across the full range of values so that the top countries are coloured correctly
min_CO2 <- min(emissions_data$Per.capita.domestic.aviation.CO2)
max_CO2 <- max(emissions_data$Per.capita.domestic.aviation.CO2)
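#Note: in scales, `alpha` is a flag for whether alpha channels in the palette colours are respected, not an opacity level
emissions_palette <- col_numeric(c("#FEF0D9", "#990000"), domain = c(min_CO2, max_CO2), alpha = 0.75)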

(emissions_table <- emissions_table %>% 
    data_color(columns = vars(Per.capita.domestic.aviation.CO2),
               colors = emissions_palette))
Comparison of per capita CO2 emissions from domestic aviation (2018)
Country        Per capita emissions (kg)
United States  385.52
Australia      267.17
Norway         209.23
New Zealand    174.19
Canada         168.27
Japan          73.96
Data: Graver, Zhang, & Rutherford (2019) [via Our World in Data]
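
Why bother setting the domain from the full dataset rather than from the six rows we display? The sketch below tries to show the difference using a handful of illustrative values (three taken from the table plus one invented near-zero country, not the real dataset). col_numeric() returns a function that maps numbers to hex colours, so we can simply call it and compare.

```r
library(scales)

#Illustrative values only: three large emitters plus one near-zero country
values <- c(0.386, 0.267, 0.074, 0.001)

#Palette scaled to the full range of values
full_palette <- col_numeric(c("#FEF0D9", "#990000"), domain = range(values))
#Palette scaled only to the three largest values
top_palette  <- col_numeric(c("#FEF0D9", "#990000"), domain = range(values[1:3]))

#The same value gets a different colour depending on the domain
full_colour <- full_palette(0.074)
top_colour  <- top_palette(0.074)
c(full = full_colour, top = top_colour)
```

With the narrower domain, 0.074 sits at the very bottom of the scale and lands at the pale end of the palette, even though it is far from the smallest value overall. Fixing the domain to the full range keeps colours comparable no matter which rows we choose to show.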


The finer details: Table options


We’ve added some colour and style, but if we really want to customise our table and tweak the fine details we need to start using the opt_xxx() and tab_options() functions. The opt_xxx() functions adjust specific elements of the table, while tab_options() is similar to the theme() function used with ggplot2. Below, we’ll adjust the options to resemble tables from FiveThirtyEight (working from Thomas Mock’s great blog).

(emissions_table <- emissions_table %>% 
   #All column headers are capitalised
   opt_all_caps() %>% 
   #Use the Chivo font
   #Note the great 'google_font' function in 'gt' that removes the need to pre-load fonts
   opt_table_font(
     font = list(
       google_font("Chivo"),
       default_fonts()
     )
   ) %>%
   #Change the width of columns
   cols_width(vars(Per.capita.domestic.aviation.CO2) ~ px(150),
              vars(Entity) ~ px(400)) %>% 
   tab_options(
     #Remove border between column headers and title
     column_labels.border.top.width = px(3),
     column_labels.border.top.color = "transparent",
     #Remove border around table
     table.border.top.color = "transparent",
     table.border.bottom.color = "transparent",
     #Reduce the height of rows
     data_row.padding = px(3),
     #Adjust font sizes and alignment
     source_notes.font.size = 12,
     heading.align = "left"
   ))
Comparison of per capita CO2 emissions from domestic aviation (2018)
Country        Per capita emissions (kg)
United States  385.52
Australia      267.17
Norway         209.23
New Zealand    174.19
Canada         168.27
Japan          73.96
Data: Graver, Zhang, & Rutherford (2019) [via Our World in Data]
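
Since tab_options() plays a role similar to theme(), one option is to wrap the settings in a small function and reuse it across tables, much like a custom ggplot2 theme. The helper below is hypothetical (style_emissions_table() is a name I made up, it is not part of gt) and only bundles some of the options used above, leaving out the Google font so that it runs without any extras.

```r
library(gt)

#Hypothetical helper (not part of gt): bundles some of the styling used in this post
style_emissions_table <- function(gt_table) {
  gt_table %>%
    opt_all_caps() %>%
    tab_options(
      column_labels.border.top.color = "transparent",
      table.border.top.color = "transparent",
      table.border.bottom.color = "transparent",
      data_row.padding = px(3),
      source_notes.font.size = 12,
      heading.align = "left"
    )
}

#Apply the same look to any gt table with one extra line
toy_table <- data.frame(country = c("United States", "Japan"),
                        kg = c(385.52, 73.96)) %>%
  gt() %>%
  style_emissions_table()

toy_table
```

This keeps the styling in one place, which helps when the same table has to be rebuilt from scratch with new data.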


Bonus round 1: Adding images


I’m already really happy with this table. It clearly allows a comparison of exact emissions between countries, and the colour provides additional assistance to the reader. We could leave it here, but for a little bonus let’s add country flags to our table. With the CIA World Factbook we can query any country’s flag using the US Federal Information Processing Standard (FIPS) country code.

We have the 3-letter code in the ‘Code’ column, so we’ll first need to convert this to its FIPS equivalent. The code for this is not really relevant for using gt, but you can see it below if you’re interested. The end goal is to create a column containing the URL of a flag image for each country.


#To convert country codes
library(countrycode)
#To create custom URL
library(glue)

flag_data <- emissions_data %>% 
  #Convert iso3 codes to FIPS (note: the new column is called iso2 but holds FIPS codes)
  mutate(iso2 = countrycode(sourcevar = Code, origin = "iso3c", destination = "fips", warn = FALSE),
         #Create custom URL for each country
         flag_URL = glue('https://www.cia.gov/library/publications/the-world-factbook/attachments/flags/{iso2}-flag.jpg')) %>% 
  select(flag_URL, Entity, everything())

#We'll need to rebuild our table using this new data
#Code below with comments removed.
emissions_table <- head(flag_data) %>% 
  gt() %>% 
  cols_hide(columns = vars(Code, iso2)) %>% 
  cols_label(Entity = "Country",
             Per.capita.domestic.aviation.CO2 = "Per capita emissions (tonnes)") %>% 
  tab_header(title = md("Comparison of per capita CO<sub>2</sub> emissions from domestic aviation (2018)")) %>% 
  tab_source_note(source_note = "Data: Graver, Zhang, & Rutherford (2019) [via Our World in Data]") %>% 
  fmt_number(columns = vars(Per.capita.domestic.aviation.CO2),
             scale_by = 1000) %>%
  cols_label(Per.capita.domestic.aviation.CO2 = "Per capita emissions (kg)") %>% 
  tab_style(
     locations = cells_column_labels(columns = everything()),
     style     = list(
       cell_borders(sides = "bottom", weight = px(3)),
       cell_text(weight = "bold")
     )
   ) %>% 
   tab_style(
     locations = cells_title(groups = "title"),
     style     = list(
       cell_text(weight = "bold", size = 24)
     )
   ) %>% 
  data_color(columns = vars(Per.capita.domestic.aviation.CO2),
             colors = emissions_palette) %>% 
  opt_all_caps() %>% 
  opt_table_font(
    font = list(
      google_font("Chivo"),
      default_fonts()
    )
  ) %>%
  cols_width(vars(Per.capita.domestic.aviation.CO2) ~ px(150),
             vars(Entity) ~ px(400)) %>% 
  tab_options(
    column_labels.border.top.width = px(3),
    column_labels.border.top.color = "transparent",
    table.border.top.color = "transparent",
    table.border.bottom.color = "transparent",
    data_row.padding = px(3),
    source_notes.font.size = 12,
    heading.align = "left")


head(flag_data)
##                                                                                    flag_URL
## 1 https://www.cia.gov/library/publications/the-world-factbook/attachments/flags/US-flag.jpg
## 2 https://www.cia.gov/library/publications/the-world-factbook/attachments/flags/AS-flag.jpg
## 3 https://www.cia.gov/library/publications/the-world-factbook/attachments/flags/NO-flag.jpg
## 4 https://www.cia.gov/library/publications/the-world-factbook/attachments/flags/NZ-flag.jpg
## 5 https://www.cia.gov/library/publications/the-world-factbook/attachments/flags/CA-flag.jpg
## 6 https://www.cia.gov/library/publications/the-world-factbook/attachments/flags/JA-flag.jpg
##          Entity Code Per.capita.domestic.aviation.CO2 iso2
## 1 United States  USA                            0.386   US
## 2     Australia  AUS                            0.267   AS
## 3        Norway  NOR                            0.209   NO
## 4   New Zealand  NZL                            0.174   NZ
## 5        Canada  CAN                            0.168   CA
## 6         Japan  JPN                            0.074   JA
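
If you’re wondering why we can’t just use the more familiar 2-letter ISO codes, the sketch below converts three of the codes from the output above both ways. It assumes the countrycode and glue packages are installed. The FIPS results match the iso2 column printed above, and the ISO 2-letter codes differ for Australia and Japan.

```r
library(countrycode)
library(glue)

#Three of the 3-letter codes from the data above
iso3 <- c("USA", "AUS", "JPN")

#FIPS codes (what the flag URLs need) versus ISO 2-letter codes
fips <- countrycode(sourcevar = iso3, origin = "iso3c", destination = "fips")
iso2 <- countrycode(sourcevar = iso3, origin = "iso3c", destination = "iso2c")

#glue() is vectorised, so we get one URL per country
flag_URL <- glue("https://www.cia.gov/library/publications/the-world-factbook/attachments/flags/{fips}-flag.jpg")

data.frame(iso3, fips, iso2)
```

Australia is AS under FIPS but AU under ISO, and Japan is JA rather than JP, so mixing up the two standards would point some rows at the wrong flag.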

We can now use the text_transform() function to add our country flags. text_transform() allows us to apply any custom function to the text of a column. We can use the web_image() function in gt to convert a URL to an embedded image.

emissions_table %>% 
  gt::text_transform(
    #Apply a function to a column
    locations = cells_body(vars(flag_URL)),
    fn = function(x) {
      #Return an image of set dimensions
      web_image(
        url = x,
        height = 12
      )
    }
  ) %>% 
  #Hide column header flag_URL and reduce width
  cols_width(vars(flag_URL) ~ px(30)) %>% 
  cols_label(flag_URL = "")
Comparison of per capita CO2 emissions from domestic aviation (2018)
Country        Per capita emissions (kg)
United States  385.52
Australia      267.17
Norway         209.23
New Zealand    174.19
Canada         168.27
Japan          73.96
Data: Graver, Zhang, & Rutherford (2019) [via Our World in Data]
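
To see what is going on under the hood, it helps to look at the two pieces separately. In the sketch below, web_image() is called on a placeholder URL (example.com, not a real flag) to show that it just builds an HTML image tag, and text_transform() is used with a much simpler function, toupper(), on a toy table of my own.

```r
library(gt)

#web_image() turns a URL into an HTML <img> tag (placeholder URL, nothing is downloaded here)
img_tag <- web_image(url = "https://example.com/flag.png", height = 12)
img_tag

#text_transform() applies any function to the text of the targeted cells
toy_table <- data.frame(country = c("Norway", "Japan"),
                        kg = c(209.23, 73.96)) %>%
  gt() %>%
  text_transform(
    locations = cells_body(columns = "country"),
    fn = function(x) toupper(x)
  )

#Render to an HTML string to confirm the transformed text is what gets displayed
toy_html <- as.character(as_raw_html(toy_table, inline_css = FALSE))
grepl("NORWAY", toy_html, fixed = TRUE)
```

Adding flags is the same pattern with web_image() as the function: each URL in the cell is swapped for an image tag at render time.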


Bonus round 2: Within-group comparison


It’s pretty clear by now that the US and Australia are the worst for domestic aviation emissions, but what if we wanted to compare within continents? Which country has the highest emissions within Africa or the Americas? For this, we can use the row grouping functionality in gt.

Before we can do this, we’ll need to assign each country to its corresponding continent. As above, this isn’t really gt relevant, but the code is included below if you’re interested.


continent_data <- flag_data %>% 
  #Look up the continent for each iso3 code
  mutate(continent = countrycode(sourcevar = Code, origin = "iso3c", destination = "continent", warn = FALSE)) %>% 
  select(continent, flag_URL, Entity, Per.capita.domestic.aviation.CO2)


head(continent_data)
##   continent
## 1  Americas
## 2   Oceania
## 3    Europe
## 4   Oceania
## 5  Americas
## 6      Asia
##                                                                                    flag_URL
## 1 https://www.cia.gov/library/publications/the-world-factbook/attachments/flags/US-flag.jpg
## 2 https://www.cia.gov/library/publications/the-world-factbook/attachments/flags/AS-flag.jpg
## 3 https://www.cia.gov/library/publications/the-world-factbook/attachments/flags/NO-flag.jpg
## 4 https://www.cia.gov/library/publications/the-world-factbook/attachments/flags/NZ-flag.jpg
## 5 https://www.cia.gov/library/publications/the-world-factbook/attachments/flags/CA-flag.jpg
## 6 https://www.cia.gov/library/publications/the-world-factbook/attachments/flags/JA-flag.jpg
##          Entity Per.capita.domestic.aviation.CO2
## 1 United States                            0.386
## 2     Australia                            0.267
## 3        Norway                            0.209
## 4   New Zealand                            0.174
## 5        Canada                            0.168
## 6         Japan                            0.074

In this case, we need to start from the beginning, because the grouping has to be applied when the table is first created.

(emissions_table_continent <- continent_data %>%
  #Just take the top 5 from each continent for our example
  group_by(continent) %>% 
  slice(1:5) %>% 
  #Just show Africa and Americas for our example
  filter(continent %in% c("Africa", "Americas")) %>%
  #Group data by continent
  gt(groupname_col = "continent") %>% 
  #Add flag images as before
  gt::text_transform(
    locations = cells_body(vars(flag_URL)),
    fn = function(x) {
      web_image(
        url = x,
        height = 12
      )
    }
  ) %>% 
  cols_width(vars(flag_URL) ~ px(30)) %>% 
  cols_label(flag_URL = "") %>% 
  #Original changes as above.
  cols_label(Entity = "Country",
             Per.capita.domestic.aviation.CO2 = "Per capita emissions (tonnes)") %>% 
  tab_header(title = md("Comparison of per capita CO<sub>2</sub> emissions from domestic aviation (2018)")) %>% 
  tab_source_note(source_note = "Data: Graver, Zhang, & Rutherford (2019) [via Our World in Data]") %>% 
  fmt_number(columns = vars(Per.capita.domestic.aviation.CO2),
             scale_by = 1000) %>%
  cols_label(Per.capita.domestic.aviation.CO2 = "Per capita emissions (kg)") %>% 
  tab_style(
     locations = cells_column_labels(columns = everything()),
     style     = list(
       cell_borders(sides = "bottom", weight = px(3)),
       cell_text(weight = "bold")
     )
   ) %>% 
   tab_style(
     locations = cells_title(groups = "title"),
     style     = list(
       cell_text(weight = "bold", size = 24)
     )
   ) %>% 
  data_color(columns = vars(Per.capita.domestic.aviation.CO2),
             colors = emissions_palette) %>% 
  opt_all_caps() %>% 
  opt_table_font(
    font = list(
      google_font("Chivo"),
      default_fonts()
    )
  ) %>%
  cols_width(vars(Per.capita.domestic.aviation.CO2) ~ px(150),
             vars(Entity) ~ px(400)) %>% 
  tab_options(
    column_labels.border.top.width = px(3),
    column_labels.border.top.color = "transparent",
    table.border.top.color = "transparent",
    table.border.bottom.color = "transparent",
    data_row.padding = px(3),
    source_notes.font.size = 12,
    heading.align = "left",
    #Adjust grouped rows to make them stand out
    row_group.background.color = "grey"))
Comparison of per capita CO2 emissions from domestic aviation (2018)
Country        Per capita emissions (kg)
Africa
South Africa   25.09
Namibia        11.49
Mauritius      8.75
Kenya          7.10
Algeria        4.43
Americas
United States  385.52
Canada         168.27
Chile          70.46
Brazil         42.86
Mexico         39.93
Data: Graver, Zhang, & Rutherford (2019) [via Our World in Data]
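
Stripped of all the styling, the grouping itself comes down to a single argument. Here is a minimal sketch with a few rows copied from the table above (the data frame and its column names are my own).

```r
library(gt)

#A few rows from the grouped table above, already in kg
toy_data <- data.frame(
  continent = c("Africa", "Africa", "Americas", "Americas"),
  country   = c("South Africa", "Namibia", "United States", "Canada"),
  kg        = c(25.09, 11.49, 385.52, 168.27)
)

toy_table <- toy_data %>%
  #The column named in groupname_col supplies the row-group labels
  #and is no longer shown as a regular column
  gt(groupname_col = "continent") %>%
  #Make the group rows stand out, as in the full example
  tab_options(row_group.background.color = "grey")

toy_table
```

Because the grouping is set inside gt(), it has to happen when the table is created, which is why the full example rebuilds everything from the data.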


The Conclusion: Tables can be nice!


A package like gt really unlocks the power of a table to display data. Using a combination of style and colour it’s easy to create a clear output that allows a reader to compare individual values. Moreover, we have the potential to easily include neat additions such as embedded images or graphs. The one drawback of gt is that it only generates static tables; to build interactive tables with filters or tabs we will need to move to other packages like reactable. Still, it’s a fun tool to add to the data visualisation toolbox of anybody working in R.