Grades returned this afternoon.
Review regrade policy and late work policy if you have questions.
MP#02 - Hollywood Movies – Due October 23rd
Pay attention to the rubric
Tentative Topics
Proposal feedback soon (need to do 2 more before releasing)
Brightspace - Wednesdays at 11:45
Next pre-assignment is October 23rd
Thank you for FAQs and (honest) team feedback. Keep it coming!
Due Wednesday at 11:45pm:
ggplot2
ggplot2: Elegant Graphics for Data Analysis
ggplot2 vs Tableau
ggplot2 vs matplotlib
ggplot2: the gg stands for “grammar of graphics”
matplotlib: modeled on Matlab plotting
Closest Python analogue to ggplot2 is seaborn
+ instead of |>
ggplot2 is older than |>
If ggplot3 ever gets made, it will use |>
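A minimal sketch of how the two operators coexist today, assuming R 4.1 or later for the native pipe. The built-in mtcars data and the cyl filter are illustrative choices only, not part of the course material.

```r
library(ggplot2)

# |> passes the data frame along until it reaches ggplot();
# from there on, + adds layers to the plot object that ggplot() returns.
p <- mtcars |>
  subset(cyl > 4) |>
  ggplot(aes(x = wt, y = mpg)) +
  geom_point()
```

Because |> binds more tightly than +, the whole pipeline up to ggplot() is evaluated first and the layers are added afterwards.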
I tried an interactive plot with \(n=132,000\) points, but it brought my computer to a halt. [Ed. Paraphrased]
That’s a lot of points!!
ggplot2 is itself pretty fast, but it depends on (possibly slow) graphics backends
Large data sets can lead to overplotting.
Ways to address:
geom_jitter: add a bit of random noise so points don’t step on each other
geom_hex: little “heatmaps” of counts; hexagons avoid weird rounding artifacts
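The two remedies above can be sketched on the diamonds data that ships with ggplot2 (about 54,000 rows). The jitter width and the bin count are arbitrary values picked for illustration, and geom_hex needs the hexbin package installed before the plot can be drawn.

```r
library(ggplot2)

base <- ggplot(diamonds, aes(x = carat, y = price))

# Raw scatterplot: many points land on top of each other.
p_raw <- base + geom_point()

# Jitter: a small horizontal nudge separates points stacked on the same carat value.
p_jitter <- base + geom_jitter(width = 0.02, height = 0)

# Hex bins: color encodes how many points fall in each hexagon.
p_hex <- base + geom_hex(bins = 50)
```

Printing p_raw, p_jitter, and p_hex in turn gives a quick visual comparison.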
aes()
aes maps data to values. Outside of aes, set a constant value
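A short sketch of the distinction, using the mpg data set bundled with ggplot2. The choice of columns and of "blue" is only for illustration.

```r
library(ggplot2)

# Inside aes(): color is mapped from the class column, so a scale and legend appear.
p_mapped <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(aes(color = class))

# Outside aes(): color is a constant applied to every point, with no legend.
p_const <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(color = "blue")
```

A common mistake is writing aes(color = "blue"), which maps the literal string as if it were data and leaves the scale to pick the actual color.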
geom_-specific aes()
aes() in ggplot(): applies to the entire plot
aes() in a geom: applies there only
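The difference shows up as soon as a plot has more than one layer. This sketch uses the bundled mpg data; the columns are illustrative.

```r
library(ggplot2)

# color = drv set in ggplot(): every layer inherits it, so one trend line per drv.
p_global <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)

# color = drv set only in geom_point(): colored points, one overall trend line.
p_local <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(aes(color = drv)) +
  geom_smooth(method = "lm", se = FALSE)
```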
Two “modes”
Three types of color palettes: sequential, diverging, qualitative
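A sketch of one palette of each kind, assuming the three types meant here are the ColorBrewer categories (sequential, diverging, qualitative). The palette names and the mpg columns are illustrative choices.

```r
library(ggplot2)

# Qualitative: unordered categories (drive train).
p_qual <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point() +
  scale_color_brewer(type = "qual", palette = 2)

# Sequential: a quantity running from low to high (city mileage).
p_seq <- ggplot(mpg, aes(x = displ, y = hwy, color = cty)) +
  geom_point() +
  scale_color_distiller(type = "seq", palette = "Blues")

# Diverging: a quantity with a meaningful midpoint (deviation from the mean).
p_div <- ggplot(mpg, aes(x = displ, y = hwy, color = cty - mean(cty))) +
  geom_point() +
  scale_color_distiller(type = "div", palette = "RdBu")
```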
When mapping quantitative variables to palettes (sequential/diverging), two approaches:
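library(dplyr)
library(ggplot2)  # needed for ggplot(), geom_point(), scale_color_identity()

data <- data.frame(x = rnorm(5),
                   y = rnorm(5),
                   group = c("a", "a", "b", "b", "b"))

data |>
  group_by(group) |>
  mutate(n_count = n()) |>   # size of each group
  ungroup() |>
  # flag rows in the largest group with a literal color name
  mutate(color = ifelse(n_count == max(n_count), "red", "black")) |>
  ggplot(aes(x=x, y=y, shape=group, color=color)) +
  geom_point() +
  scale_color_identity()     # use the values in the color column as-is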
Built-in themes + ggthemes package:
library(ggplot2); library(ggthemes);
library(palmerpenguins); library(ggpmisc)
p <- ggplot(penguins,
            aes(x=flipper_length_mm,
                y=body_mass_g,
                color=species)) +
  geom_point() +
  stat_poly_line(se=FALSE,
                 color="black") +
  stat_poly_eq() +
  xlab("Flipper Length (mm)") +
  ylab("Body Mass (g)") +
  scale_color_brewer(type="qual",
                     palette=2,
                     name="Species") +
  facet_wrap(~species)
Default theme (ggplot2::theme_grey()):
Black and White theme (ggplot2::theme_bw()):
Minimal theme (ggplot2::theme_minimal()):
Light theme (ggplot2::theme_light()):
Dark theme (ggplot2::theme_dark()):
Excel theme (ggthemes::theme_excel()):
Google Docs theme (ggthemes::theme_gdocs()):
The Economist theme (ggthemes::theme_economist()):
Solarized theme (ggthemes::theme_solarized()):
Solarized2 theme (ggthemes::theme_solarized_2()):
Stata theme (ggthemes::theme_stata()):
Tufte theme (ggthemes::theme_tufte()):
Wall Street Journal theme (ggthemes::theme_wsj()):
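Any of the themes listed above is applied by adding it to a plot. The sketch below uses a small stand-in plot on the bundled mpg data rather than the penguins plot, and only touches ggthemes if that package happens to be installed.

```r
library(ggplot2)

base <- ggplot(mpg, aes(x = displ, y = hwy)) + geom_point()

# Same plot, five built-in themes.
themed <- list(
  grey    = base + theme_grey(),
  bw      = base + theme_bw(),
  minimal = base + theme_minimal(),
  light   = base + theme_light(),
  dark    = base + theme_dark()
)

# ggthemes themes work the same way when the package is available.
if (requireNamespace("ggthemes", quietly = TRUE)) {
  themed$economist <- base + ggthemes::theme_economist()
}
```

Printing an element, for example themed$bw, draws that version of the plot.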
Many more online:
Order of layers technically matters, but the effect is small
library(patchwork)  # assumed: supplies p1 + p2 for side-by-side plots
p1 <- ggplot(penguins, aes(x=bill_length_mm, y=flipper_length_mm)) +
  geom_point(color="black") +
  geom_smooth(color="blue", method="lm") + ggtitle("Line on points")
p2 <- ggplot(penguins, aes(x=bill_length_mm, y=flipper_length_mm)) +
  geom_smooth(color="blue", method="lm") +
  geom_point(color="black") + ggtitle("Points on line")
p1 + p2
Order matters more with theme. Adding a theme_*() will override any theme() customization you did:
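A small sketch of that override, using the legend position as the customization. The mpg data and the choice of theme_bw() are illustrative.

```r
library(ggplot2)

base <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) + geom_point()

# theme_bw() is a complete theme: added last, it replaces the earlier tweak,
# so the legend goes back to the theme's own default position.
p_lost <- base + theme(legend.position = "bottom") + theme_bw()

# Complete theme first, tweaks after: the legend stays at the bottom.
p_kept <- base + theme_bw() + theme(legend.position = "bottom")
```

A safe habit is to add the theme_*() call first and any theme() adjustments after it.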
stat_poly_{line,eq} vs geom_smooth
By default geom_smooth fits a flexible smoother: loess for fewer than 1,000 observations, a generalized additive model (GAM) otherwise
ggpmisc::stat_poly_{line,eq} fit linear models, so they can expose more machinery.
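To get a straight-line fit from geom_smooth itself, the method has to be requested explicitly. A minimal sketch on the bundled mpg data, using only ggplot2 so it runs without ggpmisc:

```r
library(ggplot2)

base <- ggplot(mpg, aes(x = displ, y = hwy)) + geom_point()

# Default: the smoothing method is chosen automatically from the data size.
p_default <- base + geom_smooth()

# Explicit linear model, comparable to the line stat_poly_line() draws by default.
p_lm <- base + geom_smooth(method = "lm", se = FALSE)
```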
What is a GAM? Take 9890 with me (Spring, Tuesdays at 6) to find out!
Perceptually:
Humans are better at:
Facets are group_by for plots. Useful for
How can I implement a dual (twin) axis plot in ggplot2?
Disfavored. But if you must …
Doesn’t allow arbitrary secondary axes; allows transformed axes (e.g., Celsius and Fahrenheit)
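The Celsius and Fahrenheit case can be sketched with sec_axis(), which takes a one-to-one transformation of the primary axis. The temperature values below are made up purely for illustration.

```r
library(ggplot2)

temps <- data.frame(
  day    = 1:7,
  temp_c = c(12, 14, 15, 13, 17, 19, 18)  # made-up values for illustration
)

p <- ggplot(temps, aes(x = day, y = temp_c)) +
  geom_line() +
  scale_y_continuous(
    name = "Temperature (°C)",
    # The secondary axis is a transformation of the primary one: F = C * 9/5 + 32.
    sec.axis = sec_axis(~ . * 9 / 5 + 32, name = "Temperature (°F)")
  )
```

Both axes describe the same data, which is why this form is allowed while two unrelated y scales are not.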
ggplot
See the ggimage or ggflags package for images as “points”:
See cowplot::draw_image() for image background:
library(ggplot2)  # load explicitly for ggplot(), geom_density(), expansion()
library(cowplot)
p <- ggplot(iris, aes(x = Sepal.Length, fill = Species)) +
  geom_density(alpha = 0.7) +
  scale_y_continuous(expand = expansion(mult = c(0, 0.05))) +
  theme_half_open(12)
logo_file <- system.file("extdata", "logo.png", package = "cowplot")
ggdraw() +
  draw_image(
    logo_file, scale = .7
  ) +
  draw_plot(p)
ggplot2
Data Sets:
diamonds from the ggplot2 package
cdiac from the CVXR package
gapminder from the gapminder package
You need to install CVXR and gapminder now.
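One way to handle the installation is to install only what is missing. This is a sketch, not a required procedure. The interactive() guard is an added assumption so that the snippet never triggers a download when run from a script.

```r
needed <- c("CVXR", "gapminder")

# TRUE for each package that is already installed.
is_installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
to_install <- needed[!is_installed]

# Install only the missing packages, and only in an interactive session.
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

Running install.packages(c("CVXR", "gapminder")) directly at the console has the same effect.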
Exercise: Lab #07
ggplot2: Learning Goals
Today:
geoms
Next Week:
Room | Team | Room | Team
---|---|---|---
1 | Rat Pack | 6 | Ca$h VZ
2 | Subway Surfers | 7 | Listing Legends
3 | Chart Toppers | 8 | TDSSG
4 | Metro Mindset | 9 | Broker T’s
5 | Apple Watch | 10 | EVengers