Visualizations Continued & ggplot – STAT GU4206/GR5206 Statistical Computing & Introduction to Data Science
Gabriel Young
Columbia University
October 1, 2021
Gabriel Young Set 4: Visualizations Continued October 1, 2021 1 / 67
Course Notes
Last Time
Base R Graphics
Section I
Some More Plotting with Base R
Basics of Plotting
Recall,
• Visualizing variation (of a single variable):
• hist() – Histograms.
• barplot() – Bargraphs.
• Visualizing covariation (of multiple variables):
• plot() – Scatterplots.
• boxplot() – Boxplots (box-and-whisker plots).
Basics of Plotting
The plot() function.
• The foundation of many of R’s graphics functions.
• Often one builds up the graph in stages with plot() as a base.
• Each call to plot() begins a new graph window.
• Takes arguments, called graphical parameters, to change various
aspects of the plot. (?par)
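As a minimal sketch of building a graph in stages, the following uses R's built-in cars dataset (an assumption made for illustration; it is not part of the course data). plot() starts the graph, and abline(), points(), and legend() each add to it.

```r
# Stage 1: plot() starts a new graph.
plot(cars$speed, cars$dist,
     xlab = "Speed (mph)", ylab = "Stopping distance (ft)",
     main = "Built up in stages")

# Stage 2: add a least-squares line to the existing graph.
fit <- lm(dist ~ speed, data = cars)
abline(fit, col = "red", lwd = 2)

# Stage 3: mark the point of means, then add a legend.
points(mean(cars$speed), mean(cars$dist), pch = 19, col = "blue")
legend("topleft", legend = c("Fitted line", "Point of means"),
       col = c("red", "blue"), lty = c(1, NA), pch = c(NA, 19), cex = 0.8)
```

Calling plot() again at any point would discard these layers and start over.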
Diamonds Dataset
• Recall the diamonds data set. (diamonds.csv)
• Run diamonds <- read.csv("diamonds.csv", as.is = TRUE).
> diamonds <- read.csv("diamonds.csv", as.is = TRUE)
> diamonds$cut <- factor(diamonds$cut)
> diamonds$color <- factor(diamonds$color)
> diamonds$clarity <- factor(diamonds$clarity)
> set.seed(1)
> rows <- dim(diamonds)[1]
> diam <- diamonds[sample(1:rows, 1000), ]
Building a Visualization: An Example
> plot(log(diam$carat), log(diam$price), col = diam$cut)
> legend("bottomright", legend = levels(diam$cut),
+ fill = 1:length(levels(diam$cut)), cex = .5)
Building a Visualization: An Example
Let’s instead plot a regression line for each cut separately.
> cuts <- levels(diam$cut)
> col_counter <- 1
> for (i in cuts) {
+   this_cut <- diam$cut == i
+   this_data <- diam[this_cut, ]
+   this_lm <- lm(log(this_data$price)
+                 ~ log(this_data$carat))
+   abline(this_lm, col = col_counter)
+   col_counter <- col_counter + 1
+ }
Building a Visualization: An Example
We add a new point for a diamond that is $898 and 0.67 carats.
> points(-0.4, 6.8, pch = "*", col = "purple")
Building a Visualization: An Example
We add text to the new point we just added.
> text(-0.4, 6.8 - .2, "New Diamond", cex = .5)
Useful Graphical Parameters
The table below lists a selection of R’s graphical parameters. More info at
http://www.statmethods.net/advgraphs/parameters.html or using
?par.
Parameter    Description
pch          Point Character. Character of the points in the plot.
main         Title of the plot.
xlab, ylab   Axes labels.
lty          Line Type. E.g. 'dashed', 'dotted', etc.
lwd          Line Width. Line width relative to default = 1.
cex          Character Expand. Character size relative to default = 1.
xlim, ylim   The limits of the axes.
mfrow        Plot figures in an array (e.g. next to each other).
col          Plotting color.
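A short sketch of several of these parameters in one call. The data (a sine and a cosine curve) are made up for illustration and do not come from the course materials.

```r
# Illustrative data: 50 points on [0, 2*pi].
x <- seq(0, 2 * pi, length.out = 50)

# pch, cex, col control the points; main, xlab, ylab the labels;
# xlim, ylim the axis limits.
plot(x, sin(x),
     pch = 4, cex = 0.8, col = "darkgreen",
     main = "Graphical parameters", xlab = "x", ylab = "sin(x)",
     xlim = c(0, 2 * pi), ylim = c(-1.5, 1.5))

# lty and lwd control the type and width of an added line.
lines(x, cos(x), lty = "dashed", lwd = 2, col = "orange")
```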
Check Yourself
Exercise:
Use the built-in iris dataset.
• Create a new column Setosa that takes a 1 if the iris is a setosa and
a 0 otherwise.
• Plot iris Sepal.Width on the x-axis and Sepal.Length on the y-axis.
Color the points according to whether the iris is a setosa or not.
• Plot two regression lines on the plot, one for the setosa iris and one
for non-setosa iris.
Section II
Data Visualization
Section II
Good Visualizations
In data science, good visualizations should give you more information than
you can see in just the data table itself.
Good Visualizations – John Snow 1854 (Wikipedia)
https://en.wikipedia.org/wiki/1854_Broad_Street_cholera_outbreak
Good Visualizations – Charles Joseph Minard 1869
(Wikipedia)
https://en.wikipedia.org/wiki/Charles_Joseph_Minard
Good Visualizations – Charles Joseph Minard 1869
(Wikipedia)
Minard Graph
Minard shows six variables:
• Number of soldiers,
• Direction of the march,
• Location coordinates (latitude and longitude),
• Temperature on the return journey,
• Location on dates in November and December.
Bad Visualizations
Even statisticians are sometimes bad at making visualizations!
Bad Visualizations – Nate Silver (Outdated)
Bad Visualizations – Nate Silver
Bad Visualizations
• A piechart is a bad graphic for visualizing categorical data.
• This is a biased opinion from a statistician.
• Statisticians typically do not like pie charts.
• Barcharts display the same information as a piechart and the graphic
is easier to interpret.
Barchart vs. Piechart
Task
Identify what the following function does.
Code
> pie.chart <- function(data) {
+ print("I suck")
+ }
The pie.chart function prints "I suck"
> pie.chart(c("Red","Red","Blue"))
[1] "I suck"
Please don’t be offended if you like pie charts 🙂
Side note: plots per window
Change graphical parameters
• Use the par() function to change graphical parameters.
• Change plots per window with mfrow
• The default is mfrow=c(1,1)
> par(mfrow=c(1,2))
> barplot(height = table(diamonds$cut),
+ names.arg = names(table(diamonds$cut)),
+ main="Barchart")
> pie(table(diamonds$cut), labels = names(table(diamonds$cut)),
+ main="Pie Chart",cex=.75)
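Because par() changes stay in effect for later plots, it can help to save the old settings and restore them afterwards. A minimal sketch, using the built-in mtcars data as a stand-in for diamonds (an assumption, so the example runs without diamonds.csv):

```r
# Counts of cars by number of cylinders (stand-in for table(diamonds$cut)).
counts <- table(mtcars$cyl)

# par() returns the previous values of the parameters it changes.
old_par <- par(mfrow = c(1, 2))

barplot(counts, main = "Barchart")
pie(counts, main = "Pie Chart", cex = 0.75)

# Restore the earlier layout (one plot per window by default).
par(old_par)
```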
Barchart vs. Piechart
Side note: plot margins
Change graphical parameters
• Change margins with mar or mai
• Note: mar=c(bottom,left,top,right)
• The default is mar=c(5.1, 4.1, 4.1, 2.1)
> par(mfrow=c(1,2),mai=c(.5,.4,.5,.4))
> barplot(height = table(diamonds$cut),
+ names.arg = names(table(diamonds$cut)),
+ main="Barchart")
> pie(table(diamonds$cut), labels = names(table(diamonds$cut)),
+ main="Pie Chart",cex=.75)
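The two margin parameters describe the same margins in different units: mar is measured in lines of text and mai in inches. The sketch below compares them, assuming (as the ?par help page describes) that one margin line is par("mex") * par("csi") inches tall.

```r
# Set the default margins explicitly (in lines), saving the old settings.
old_par <- par(mar = c(5.1, 4.1, 4.1, 2.1))

mar_lines  <- par("mar")   # bottom, left, top, right, in lines of text
mai_inches <- par("mai")   # the same margins, in inches

# Convert lines to inches; this should agree with mai.
converted <- mar_lines * par("mex") * par("csi")
print(rbind(mai_inches, converted))

par(old_par)  # restore the earlier margins
```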
Barchart vs. Piechart
Bad Visualizations
• Some more bad visualizations
• See the link: businessinsider
https://www.businessinsider.com/the-27-worst-charts-of-all-time-2013-6
Good Visualizations
• Keep things simple in terms of color and presentation!
• Avoid adding unnecessary dimensions to a plot, e.g., a 3D bar chart
describing one categorical variable.
• Showing more variables on a lower-dimensional plot is encouraged,
e.g., diamond price versus carat split by cut.
• Barcharts are a better way to summarize categorical data compared
to piecharts. (Prof. Young’s Opinion)
Section III
Advanced Visualization Techniques
ggplot2
• R has several systems for making graphs (we’ve looked at the base R
functions).
• ggplot2 is one of the most elegant and flexible.
• ggplot2 uses a coherent system (or ‘grammar’) for describing and
building graphs.
Run install.packages("ggplot2") once, then
library("ggplot2") every time you want to use it!
ggplot2
We study ggplot2 using the mpg dataset. Let’s try to answer the question:
do cars with bigger engines use more fuel than cars with small engines?
Read about the data using ?mpg.
> dim(mpg)
[1] 234 11
> head(mpg, 3)
# A tibble: 3 x 11
  manufacturer model displ  year   cyl trans drv     cty
1 audi         a4      1.8  1999     4 auto… f        18
2 audi         a4      1.8  1999     4 manu… f        21
3 audi         a4      2    2008     4 manu… f        20
# … with 3 more variables: hwy
We look at displ, a car’s engine size in liters, and hwy, a car’s fuel
efficiency on the highway in miles per gallon (mpg).
A First Plot
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy))
A First Plot
Let’s break apart the code:
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy))
• Begin a plot with ggplot().
• It creates a coordinate system that you add layers to.
• The first argument is the dataset.
• Next, add layers to the plot.
• In our example, geom_point() adds a layer of points.
• There are many geom functions, each adding a different type of layer.
• Each geom function takes a mapping argument.
• It defines how variables in your dataset are mapped to visual properties.
• It is always paired with aes().
• The x and y arguments of aes() specify which variables to map to the axes.
A First Plot
General structure:
ggplot(data = <DATA>) +
<GEOM_FUNCTION>(mapping = aes(<MAPPINGS>))
To create a plot, replace the bracketed sections in the code above with a
dataset, a geom function, and a set of mappings.
From this template, we can make many different kinds of graphs using
ggplot.
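As one illustrative way to fill the three slots of the template (dataset, geom function, mappings), the sketch below draws boxplots of highway mileage by car class. The choice of geom_boxplot() and of the class and hwy variables is an assumption made for this example.

```r
library(ggplot2)

# Dataset: mpg. Geom function: geom_boxplot. Mappings: x = class, y = hwy.
p <- ggplot(data = mpg) +
  geom_boxplot(mapping = aes(x = class, y = hwy))

print(p)
```

Swapping only the geom function and the mappings, while keeping the same dataset, gives a different kind of graph.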
Check Yourself
Tasks
• Plot just ggplot(data = mpg). What do you get?
• Make a scatterplot of hwy vs. cyl.
• Make a scatterplot of class vs. drv. Why is this plot not useful?
Aesthetic Mappings
The blue points seem to have a different trend than the rest – possibly
hybrids? We study car class to find out.
Aesthetic Mappings
• We can add a third variable to a scatterplot by mapping it to an
aesthetic.
• An aesthetic is a visual property of the objects in the plot.
• Things like size, color, shape of points.
Mapping Aesthetics
ggplot(data = mpg) +
geom_point(mapping = aes(x=displ, y=hwy, color=class))
Check Yourself
Tasks
• Instead of mapping class to the color aesthetic, map it to the
alpha aesthetic or the size aesthetic.
• Instead of mapping class to the color aesthetic, map it to the
shape aesthetic. Note that ggplot() will only use 6 shapes at a
time. What does this mean for our plot?
• What does the following code do?
ggplot(data = mpg) +
geom_point(mapping = aes(x=displ, y=hwy), color="blue")
• Map a continuous variable in the mpg dataset, like cty, to the alpha,
shape, and size aesthetics. What does this do?
Facets
• We saw we could add categorical variables to plots using aesthetics.
• Can also do this by splitting the plot into facets, which are subplots
that each display one subset of the data.
• Use the facet_wrap() command to facet a plot by a single variable.
• The argument is a formula created with ~ followed by a variable
name.
Facets
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy)) +
facet_wrap(~ class, nrow = 2)
Check Yourself
Tasks
• To facet on two variables, use the facet_grid() command. An example
is the following:
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy)) +
facet_grid(drv ~ class)
What do the empty cells mean?
• Look at ?facet_wrap. What do nrow and ncol do? Why doesn’t
facet_grid() have nrow and ncol arguments?
• What happens if you facet on a continuous variable?
Geometric Objects
Geometric Objects
• In the previous slide, each plot used a different visual object to
represent the data.
• Produce this by using different geoms.
• A geom is a geometrical object used to represent data in a plot.
• Often describe plots by the type of geom they use. For example, bar
graphs use bar geoms.
Geometric Objects
ggplot(data = mpg) +
geom_smooth(mapping = aes(x = displ, y = hwy))
Geometric Objects
• Every geom takes a mapping argument but not every aesthetic works
with every geom.
• E.g., you can set the shape of a point, but not a line. You can set the
linetype of a line.
• ggplot2 has around 30 different geoms.
• Can get help with ?geom_smooth, for example.
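To illustrate an aesthetic that applies to lines rather than points, the sketch below maps drv (the drive train variable in mpg) to linetype in geom_smooth(). The choice of drv is an assumption made for this example.

```r
library(ggplot2)

# One smooth curve per drive train, each drawn with its own line type.
p <- ggplot(data = mpg) +
  geom_smooth(mapping = aes(x = displ, y = hwy, linetype = drv))

print(p)
```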
Geometric Objects
Some Commonly-used geoms
geom Name        Used to…                                      Aesthetics
geom_histogram   Visualize a Continuous Variable               x
geom_bar         Visualize a Discrete Variable                 x
geom_point       Visualize Two Continuous Variables            x, y
geom_text        Add Labels to a Plot                          x, y, label
geom_boxplot     Visualize Continuous and Discrete Variables   x, y
geom_jitter      Visualize Two Variables                       x, y
many more …
Layering geoms
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy)) +
geom_smooth(mapping = aes(x = displ, y = hwy))
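The layered plot above repeats the same mapping in both geoms. A sketch of one way to avoid the repetition: mappings given to ggplot() are inherited by every layer, and a layer can still add its own. Coloring the points by class here is an assumption made for illustration.

```r
library(ggplot2)

# x and y are set once in ggplot() and shared by both layers;
# color = class applies to the points only.
p <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point(mapping = aes(color = class)) +
  geom_smooth()

print(p)
```

Because color is mapped only inside geom_point(), the smooth curve is fit to all cars together rather than once per class.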
Adding Axis Labels
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy)) +
geom_smooth(mapping = aes(x = displ, y = hwy)) +
geom_point(mapping = aes(x=3, y=30), color = "purple") +
geom_text(mapping = aes(x=3, y=31, label = "New Point"), size=4) +
labs(title = "New Plot", x = "Engine Size (liters)", y = "Highway mpg")
Layering geoms
> ggplot(data = diamonds) +
+ geom_point(mapping = aes(x = carat, y = price),
+ alpha = 1/10)
Check Yourself
Exercise:
Use the built-in iris dataset.
• Plot iris Sepal.Width on the x-axis and Sepal.Length on the y-axis.
Color the points according to whether the iris is a setosa or not.
• Plot two regression lines on the plot, one for the setosa iris and one
for non-setosa iris. Hint: Use geom_abline(intercept, slope) or
geom_smooth() with method = "lm".
A few more examples
Barplot
> ggplot(diamonds)+
+ geom_bar(aes(x=cut))
A few more examples
• Split price at the 75th percentile.
• Plot cut and Expensive on one barchart.
• Center the title.
• Remove the word cut from the x-axis label.
Barplot
> upper <- diamonds$price > quantile(diamonds$price,probs = .75)
> diamonds$Expensive <- ifelse(upper,"high","not-high")
> theme_update(plot.title = element_text(hjust = 0.5))
> ggplot(data=diamonds) +
+ geom_bar(aes(x=cut,fill=factor(Expensive)))+
+ theme(axis.text.x = element_text(angle = 90, hjust = 1))+
+ labs(title = "Title is Centered",fill="Price",x="")
A few more examples
• Plot simulated standard normal and its density
> x <- seq(-5,5,by=.01)
> hist_data <- data.frame(x.var=rnorm(1000))
> plot_data <- data.frame(x=x,f=dnorm(x))
> ggplot(hist_data)+
+ geom_histogram(mapping=aes(x=x.var,y=..density..),
+ col="blue",fill="white",binwidth=.2)+
+ geom_line(data=plot_data,mapping = aes(x = x, y = f),
+ col="red")+
+ labs(title = "Normal Example",x="x",y="Density")
A few more examples: Plot simulated normal and its
density
A few more examples: iris data
• Plot Petal.Length versus Sepal.Length split by Species
• Fit smooth regression functions to each level of species.
• Match legend and colors accordingly.
• A first attempt!
> ggplot(data=iris)+
+ geom_point(mapping = aes(x=Sepal.Length,y=Petal.Length,
+ color = Species))+
+ geom_smooth(mapping=aes(x=Sepal.Length,y=Petal.Length,
+ color =Species))
A few more examples: iris data
A few more examples: homework problem
A few more examples: multiple time series graph
Time Series Simulation
• The random walk is defined:
$X_t = X_{t-1} + \epsilon_t, \quad t = 1, \dots, n$
• Assume a normal distribution on $\epsilon_t$, i.e.,
$\epsilon_t \overset{iid}{\sim} \text{Normal}(0, 1), \quad t = 1, \dots, n$
• Plot three random walk realizations.
A few more examples: multiple time series graph
Time Series Simulation
> # Random Walk
> my.randomwalk <- function(n) {
+   #n <- 100
+   X.seq <- rep(NA,n)
+   X.seq[1] <- 0
+   for (i in 2:n) {
+     X.seq[i] <- X.seq[i-1] + rnorm(1)
+   }
+   return(X.seq)
+ }
> set.seed(0)
> n <- 100
> RW.1 <- my.randomwalk(n=n)
> RW.2 <- my.randomwalk(n=n)
> RW.3 <- my.randomwalk(n=n)
A few more examples: multiple time series graph
Time Series Simulation
> RW.new <- data.frame(Time=rep(1:n,3),
+ RW=c(RW.1,RW.2,RW.3),
+ Sim=c(rep("RW1",n),rep("RW2",n),rep("RW3",n))
+ )
> ggplot(data = RW.new) +
+ geom_line(mapping = aes(x = Time, y = RW,
+ color=Sim))+
+ labs(title = "Three Random Walks",
+ x = "t", y = "X_t",
+ color="Sim")
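The loop-based simulation above can also be written without a loop, since a random walk is a cumulative sum of its increments. The sketch below compares the two approaches. The names rw_loop and rw_cumsum are hypothetical, chosen so they do not clash with my.randomwalk, and both assume n is at least 2.

```r
# Loop version: same logic as the slide, starting the walk at 0.
rw_loop <- function(n) {
  x <- rep(NA_real_, n)
  x[1] <- 0
  for (i in 2:n) x[i] <- x[i - 1] + rnorm(1)
  x
}

# Vectorized version: draw all n - 1 increments at once and accumulate them.
rw_cumsum <- function(n) cumsum(c(0, rnorm(n - 1)))

# With the same seed, both consume the same normal draws in the same order.
set.seed(0); a <- rw_loop(100)
set.seed(0); b <- rw_cumsum(100)

# Compare up to floating-point tolerance rather than bit-for-bit.
print(all.equal(a, b))
```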
A few more examples: iris data
Closing remarks
Base R versus ggplot
• ggplot is arguably easier to use than base R.
• The grammar used in ggplot integrates well with the tidyverse.
• Many professionals prefer ggplot over base R graphics.
• Many professionals prefer base R graphics over ggplot.
• If you know the intricate details of base R graphics, you can construct
very beautiful graphs.