Grades returned this afternoon.
Review regrade policy and late work policy if you have questions.
MP#02 - Hollywood Movies – Due October 23rd
Pay attention to the rubric
Tentative Topics
Proposal feedback soon (need to do 2 more before releasing)
Brightspace - Wednesdays at 11:45
Next pre-assignment is October 23rd
Thank you for FAQs and (honest) team feedback. Keep it coming!
Due Wednesday at 11:45pm:
ggplot2
ggplot2: Elegant Visualizations for Data Analysis
ggplot2 vs Tableau
ggplot2 vs matplotlib
ggplot2: gg (grammar of graphics)
matplotlib: Matlab plotting
Closest Python analogue to ggplot2 is seaborn
+ instead of |>
ggplot2 is older than |>
If ggplot3 ever gets made, it will use |>
I tried an interactive plot with \(n=132,000\) points, but it brought my computer to a halt. [Ed. Paraphrased]
That’s a lot of plots!!
ggplot2 is itself pretty fast, but it depends on (possibly slow) graphics backends
Large data sets can lead to overplotting:
Ways to address:
geom_jitter: add a bit of random noise so points don’t step on each other
geom_hex: little “heatmaps” of counts. Hexagons to avoid weird rounding artifacts
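A minimal sketch of both fixes, using the diamonds data that ships with ggplot2. The jitter widths, alpha value, and bin count are illustrative choices, not recommendations, and geom_hex needs the hexbin package installed when the plot is drawn.

```r
library(ggplot2)

# Baseline: ~54,000 points drawn on top of each other
p_raw <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point()

# Jitter: nudge each point by a small random amount (plus transparency)
p_jitter <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_jitter(width = 0.02, height = 50, alpha = 0.1)

# Hex bins: count points per hexagon and map the count to fill color
# (rendering this layer requires the hexbin package)
p_hex <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_hex(bins = 40)
```

Printing p_raw, p_jitter, and p_hex in turn shows how much structure the raw scatterplot hides.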
aes()
aes maps data to values. Outside of aes, set a constant value
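The difference can be sketched with the mpg data from ggplot2 (the choice of data set and variables here is only for illustration):

```r
library(ggplot2)

# Mapped: color depends on the data, so ggplot2 builds a scale and a legend
p_mapped <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(aes(color = class))

# Constant: set outside aes(), so every point is blue and there is no legend
p_const <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(color = "blue")

# Common slip: inside aes(), "blue" is treated as data (a single category),
# so the points get a default palette color and a legend entry named "blue"
p_oops <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(aes(color = "blue"))
```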
geom_-specific aes()
aes in ggplot() apply to the entire plot
aes in a geom apply there only
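A short sketch of where the mapping lives and what it changes, again assuming the mpg data from ggplot2 purely for illustration:

```r
library(ggplot2)

# color mapped in ggplot(): every layer inherits it,
# so geom_smooth fits one line per class
p_global <- ggplot(mpg, aes(x = displ, y = hwy, color = class)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)

# color mapped in geom_point() only: the points are colored by class,
# but geom_smooth sees just x and y and fits a single line
p_layer <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(aes(color = class)) +
  geom_smooth(method = "lm", se = FALSE)
```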
Two “modes”
Three types of color palettes:
Sequential
Diverging
Qualitative
When mapping quantitative variables to palettes (sequential/diverging), two approaches:
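The two approaches are not listed here; the sketch below assumes they are a continuous gradient versus a binned (stepped) scale, which is one common way to split the choice. It uses the faithfuld data from ggplot2 and the ColorBrewer "Blues" palette as illustrative picks.

```r
library(ggplot2)

base <- ggplot(faithfuld, aes(x = waiting, y = eruptions, fill = density)) +
  geom_raster()

# Continuous: interpolate smoothly between the palette colors
p_continuous <- base +
  scale_fill_distiller(palette = "Blues", direction = 1)

# Binned: cut the variable into intervals, one palette color per interval
p_binned <- base +
  scale_fill_fermenter(palette = "Blues", direction = 1)
```

For a qualitative palette on a categorical variable, scale_color_brewer(type = "qual") plays the same role.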
library(dplyr)
library(ggplot2)
data <- data.frame(x = rnorm(5),
                   y = rnorm(5),
                   group = c("a", "a", "b", "b", "b"))
data |>
  group_by(group) |>
  mutate(n_count = n()) |>      # size of each group
  ungroup() |>
  # the largest group is colored red; the column holds literal color names
  mutate(color = ifelse(n_count == max(n_count), "red", "black")) |>
  ggplot(aes(x=x, y=y, shape=group, color=color)) +
  geom_point() +
  scale_color_identity()        # use the values in color as-is
Built-in themes + ggthemes package:
library(ggplot2); library(ggthemes);
library(palmerpenguins); library(ggpmisc)
p <- ggplot(penguins,
            aes(x=flipper_length_mm,
                y=body_mass_g,
                color=species)) +
  geom_point() +
  stat_poly_line(se=FALSE,
                 color="black") +
  stat_poly_eq() +
  xlab("Flipper Length (mm)") +
  ylab("Body Mass (g)") +
  scale_color_brewer(type="qual",
                     palette=2,
                     name="Species") +
  facet_wrap(~species)
Default theme (ggplot2::theme_grey()):
Black and White theme (ggplot2::theme_bw()):
Minimal theme (ggplot2::theme_minimal()):
Light theme (ggplot2::theme_light()):
Dark theme (ggplot2::theme_dark()):
Excel theme (ggthemes::theme_excel()):
Google Docs theme (ggthemes::theme_gdocs()):
The Economist theme (ggthemes::theme_economist()):
Solarized theme (ggthemes::theme_solarized()):
Solarized2 theme (ggthemes::theme_solarized_2()):
Stata theme (ggthemes::theme_stata()):
Tufte theme (ggthemes::theme_tufte()):
Wall Street Journal theme (ggthemes::theme_wsj()):
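Each variant in this list is the same plot with one theme added. A minimal sketch, using a small stand-in plot built from the mpg data rather than the penguins plot, and assuming ggthemes is installed:

```r
library(ggplot2)

# Stand-in plot for illustration
p_demo <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point()

# Built-in themes
p_bw      <- p_demo + theme_bw()
p_minimal <- p_demo + theme_minimal()

# ggthemes themes are added the same way
p_econ <- p_demo + ggthemes::theme_economist()
p_wsj  <- p_demo + ggthemes::theme_wsj()
```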
Many more online:
Order of layers technically matters, but the effect is small
library(patchwork)  # assumed: supplies the p1 + p2 side-by-side layout
p1 <- ggplot(penguins, aes(x=bill_length_mm, y=flipper_length_mm)) +
  geom_point(color="black") +
  geom_smooth(color="blue", method="lm") + ggtitle("Line on points")
p2 <- ggplot(penguins, aes(x=bill_length_mm, y=flipper_length_mm)) +
  geom_smooth(color="blue", method="lm") +
  geom_point(color="black") + ggtitle("Points on line")
p1 + p2
Order matters more with theme. Adding a theme_*() will override any theme() customization you did:
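A small sketch of the pitfall, using a stand-in plot from the mpg data and the legend position as the customization (both are illustrative choices):

```r
library(ggplot2)

p_demo <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point()

# theme() first, complete theme second: theme_bw() replaces everything,
# so the legend goes back to its default position
p_lost <- p_demo +
  theme(legend.position = "bottom") +
  theme_bw()

# complete theme first, tweaks second: the legend stays at the bottom
p_kept <- p_demo +
  theme_bw() +
  theme(legend.position = "bottom")
```

The rule of thumb that follows: add the theme_*() call first, then any theme() adjustments.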
stat_poly_{line,eq} vs geom_smooth
By default geom_smooth picks a flexible smoother: LOESS for fewer than 1,000 observations, a generalized additive model (GAM) otherwise
ggpmisc::stat_poly_{line,eq} fit linear models, so they can expose more machinery.
What is a GAM? Take 9890 with me (Spring, Tuesdays at 6) to find out!
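The smoother can also be chosen explicitly through the method argument. A minimal sketch on the mpg data (234 rows, so the default falls back to LOESS); the GAM formula shown is the one ggplot2 uses by default for large data, and fitting it relies on the mgcv package when the plot is drawn:

```r
library(ggplot2)

base <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point()

# Default: method chosen automatically from the number of observations
p_auto <- base + geom_smooth()

# Straight-line fit, comparable to what stat_poly_line() draws
p_lm <- base + geom_smooth(method = "lm")

# Explicit GAM with a cubic regression spline
p_gam <- base + geom_smooth(method = "gam", formula = y ~ s(x, bs = "cs"))
```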
+ggtitle("text") is just shorthand for +labs(title="text")
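A quick check of the equivalence, sketched on a stand-in plot from the mpg data; labs() can also set the axis and legend titles in the same call:

```r
library(ggplot2)

base <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point()

p_ggtitle <- base + ggtitle("text")
p_labs    <- base + labs(title = "text")

# labs() handles several labels at once
p_all <- base + labs(title = "text",
                     x = "Engine displacement (L)",
                     y = "Highway MPG")
```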
Perceptually:
Humans are better at:
Facets are group_by for plots. Useful for
How can I implement a dual (twin) axis plot in ggplot2?
Disfavored. But if you must …
Doesn’t allow arbitrary secondary axes; allows transformed axes (e.g., Celsius and Fahrenheit)
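A minimal sketch of a transformed secondary axis for the Celsius and Fahrenheit case. The temperature values and the helper name c_to_f are made up for illustration; sec_axis() is the ggplot2 function that builds the second axis from a one-to-one transformation of the first.

```r
library(ggplot2)

# Hypothetical data: one week of daily temperatures in Celsius
temps <- data.frame(day    = 1:7,
                    temp_c = c(12, 14, 15, 13, 17, 19, 18))

# Hypothetical helper: the transformation applied to the primary axis
c_to_f <- function(celsius) celsius * 9 / 5 + 32

p_temp <- ggplot(temps, aes(x = day, y = temp_c)) +
  geom_line() +
  scale_y_continuous(
    name     = "Temperature (\u00B0C)",
    sec.axis = sec_axis(c_to_f, name = "Temperature (\u00B0F)")
  )
```

Both axes describe the same line, which is why only transformations of the primary axis are supported.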
See the ggimage or ggflags package for images as “points”:
See cowplot::draw_image() for image background:
library(cowplot)
p <- ggplot(iris, aes(x = Sepal.Length, fill = Species)) +
  geom_density(alpha = 0.7) +
  scale_y_continuous(expand = expansion(mult = c(0, 0.05))) +
  theme_half_open(12)
# logo image bundled with cowplot
logo_file <- system.file("extdata", "logo.png", package = "cowplot")
# draw the image first, then the plot on top of it
ggdraw() +
  draw_image(
    logo_file, scale = .7
  ) +
  draw_plot(p)
ggplot2
Data Sets:
diamonds from the ggplot2 package
cdiac from the CVXR package
gapminder from the gapminder package
You need to install CVXR and gapminder now.
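One way to install only what is missing is sketched below. The helper missing_packages is hypothetical (written for this example); install.packages() and requireNamespace() are base R.

```r
# Hypothetical helper: which of these packages are not installed yet?
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(c("CVXR", "gapminder"))

# Only install when run interactively and something is actually missing
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

Running install.packages(c("CVXR", "gapminder")) directly works just as well.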
Exercise: Lab #07
ggplot2: Learning Goals
Today:
geoms
Next Week:
| Room | Team | Room | Team |
|---|---|---|---|
| 1 | Rat Pack | 6 | Ca$h VZ |
| 2 | Subway Surfers | 7 | Listing Legends |
| 3 | Chart Toppers | 8 | TDSSG |
| 4 | Metro Mindset | 9 | Broker T’s |
| 5 | Apple Watch | 10 | EVengers |