STA/OPR 9750 - Week 9

Michael Weylandt

STA/OPR 9750 Mini-Project #02

Thank you for peer feedback! I had a lot of fun reading these.

STA/OPR 9750 Mini-Project #03

Now online

Due November 13th

  • GitHub post (used for peer feedback) AND Brightspace

  • Three Weeks: don’t wait until the very end

  • Should be less demanding than MP #01 and MP #02

    • Lots of little files. No big files!
    • Maps and election retrodiction

Pay attention to the rubric

Going Forward

STA/OPR 9750 Course Project

Proposal feedback was sent by email this afternoon. Please contact me if you have not received it.

Next Week: Mid-Term Check-In Presentations

  • 6 minutes
  • Locking in on specific questions
  • Engagement with existing literature

Upcoming Mini-Projects

Tentative Topics

  • MP #04: Something financial

Any requests?

Pre-Assignments

Brightspace - Wednesdays at 11:45

  • Reading, typically on course website
  • Brightspace auto-grades
    • I have to manually change to completion grading

Next pre-assignment is November 13th

Thank you for FAQs and (honest) team feedback. Keep it coming!

Course Support

  • Synchronous
    • Office Hours 4x / week
      • MW Office Hours on Monday + Thursday
      • CR Tuesday + Friday
      • No OH during Thanksgiving break
  • Asynchronous
    • Piazza (38-minute average response time)

Change: MW Thursday Zoom OH will be 4:00pm to 5:00pm

Upcoming Week

  • Mid-Project Check-Ins

Pre-Assignment #09 FAQs

FAQ: File Import

No FAQs today. (New topic - data import)

Today

Warm-Up

From FiveThirtyEight

Breakout Rooms

Room  Team             Room  Team
1     Rat Pack         6     Ca$h VZ
2     Subway Surfers   7     Listing Legends
3     Chart Toppers    8     TDSSG
4     Metro Mindset    9     Broker T’s
5     Apple Watch      10    EVengers

Warm-Up

Data can be found at https://raw.githubusercontent.com/fivethirtyeight/data/refs/heads/master/candy-power-ranking/candy-data.csv

Read into R (readr::read_csv) and make 3 plots:

  • Do people like more sugary candy?
  • Do people like more expensive candy?
  • Open-Ended
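One possible starting point for the first question is sketched below. It assumes the CSV contains columns named sugarpercent and winpercent (check names(candy) after reading), and it assumes ggplot2 is installed. The other two plots are left open.

```r
library(readr)
library(ggplot2)

candy_url <- "https://raw.githubusercontent.com/fivethirtyeight/data/refs/heads/master/candy-power-ranking/candy-data.csv"

# read_csv() accepts a URL directly and downloads the file behind the scenes
candy <- read_csv(candy_url, show_col_types = FALSE)

# Assumed column names: sugarpercent (x) and winpercent (y)
p <- ggplot(candy, aes(x = sugarpercent, y = winpercent)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  labs(x = "Sugar percentile", y = "Win percentage")

print(p)
```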

Getting Data into R

Two topics:

  • How internet data transfer actually works
  • How to handle non-rectangular data formats

🎃End a bit early for Halloween🎃

URLs

From abstrax.io
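To see the pieces of a URL from within R, a small sketch using httr2::url_parse() may help. The URL below is only an illustrative string; nothing is sent over the network.

```r
library(httr2)

# Illustrative URL (an assumption for this sketch); url_parse() only splits the string
parts <- url_parse("https://api.artic.edu/api/v1/artworks/search?q=seurat&limit=2")

parts$scheme    # protocol
parts$hostname  # host
parts$path      # path on the host
parts$query     # named list of query parameters
```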

JSON

JSON:

  • Short for JavaScript Object Notation
  • Popular plain-text representation for hierarchical data.
  • Closer to Python objects (dicts of dicts of dicts) than R data.frames
  • Widely used for Application Programming Interfaces (APIs)

JSON

Example:

{
    "data": {
        "id": 27992,
        "title": "A Sunday on La Grande Jatte — 1884",
        "image_id": "1adf2696-8489-499b-cad2-821d7fde4b33"
    },
    "config": {
        "iiif_url": "https://www.artic.edu/iiif/2"
    }
}
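A minimal sketch of how such a nested object looks once parsed in R, using an abridged copy of the example (the title field is omitted for brevity). A JSON object becomes a named list, so nested values are reached with repeated $.

```r
library(jsonlite)

# Abridged copy of the example object above
artwork_json <- '{
  "data": {
    "id": 27992,
    "image_id": "1adf2696-8489-499b-cad2-821d7fde4b33"
  },
  "config": {
    "iiif_url": "https://www.artic.edu/iiif/2"
  }
}'

artwork <- fromJSON(artwork_json)

names(artwork)            # top-level keys
artwork$data$id           # nested value inside "data"
artwork$config$iiif_url   # nested value inside "config"
```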

JSON

Read JSON in R with jsonlite package (alternatives exist)

library(jsonlite)
# A JSON array of primitives
json <- '["Mario", "Peach", null, "Bowser"]'

# Simplifies into an atomic vector
fromJSON(json)
[1] "Mario"  "Peach"  NA       "Bowser"

JSON

json <-
'[
  {"Name" : "Mario", "Age" : 32, "Occupation" : "Plumber"}, 
  {"Name" : "Peach", "Age" : 21, "Occupation" : "Princess"},
  {},
  {"Name" : "Bowser", "Occupation" : "Koopa"}
]'
mydf <- fromJSON(json)
mydf
    Name Age Occupation
1  Mario  32    Plumber
2  Peach  21   Princess
3   <NA>  NA       <NA>
4 Bowser  NA      Koopa
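By default fromJSON() simplifies an array of objects into a data frame and fills missing fields with NA. When the records are too irregular for that, one option is to keep the raw list structure with simplifyVector = FALSE. A small sketch on a shortened version of the data:

```r
library(jsonlite)

json <- '[
  {"Name" : "Mario", "Age" : 32},
  {},
  {"Name" : "Bowser"}
]'

# Keep the hierarchy: one list element per JSON object
as_list <- fromJSON(json, simplifyVector = FALSE)
length(as_list)        # number of records
names(as_list[[3]])    # only the fields actually present in that record

# Default behavior: rectangular data frame, gaps become NA
as_df <- fromJSON(json)
is.na(as_df$Age)
```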

JSON

fromJSON("http://worldtimeapi.org/api/timezone/America/New_York")[1:5]
$utc_offset
[1] "-04:00"

$timezone
[1] "America/New_York"

$day_of_week
[1] 4

$day_of_year
[1] 305

$datetime
[1] "2024-10-31T17:57:16.489043-04:00"

Compare to browser access

Data Transfer: download.file

args(download.file)
function (url, destfile, method, quiet = FALSE, mode = "w", cacheOK = TRUE, 
    extra = getOption("download.file.extra"), headers = NULL, 
    ...) 
NULL

Basic file download capabilities:

  • url: source
  • destfile: where on your computer to store it
  • method: what software to use in the background to download
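A minimal sketch of a download-once pattern, using the candy data URL from the warm-up. The destination under tempdir() is just a choice for this example; in a project you would likely pick a path inside the project folder.

```r
candy_url <- "https://raw.githubusercontent.com/fivethirtyeight/data/refs/heads/master/candy-power-ranking/candy-data.csv"

# Where to store the file locally (temporary directory chosen for illustration)
dest <- file.path(tempdir(), "candy-data.csv")

# Only download if we do not already have a local copy
if (!file.exists(dest)) {
  download.file(candy_url, destfile = dest, mode = "wb", quiet = TRUE)
}

# From here on, work with the local file
candy <- read.csv(dest)
head(candy)
```

Checking file.exists() first avoids re-downloading the same file every time the script or document is rendered.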

Data Transfer: HTTP

HTTP

  • HyperText Transfer Protocol
  • Most common (but not only) internet protocol
  • Also ftp, smtp, ssh, …

“Low-level” mechanism of internet transfer

  • Many R packages add a friendly UX
  • httr2 for low-level work (today)

HTTP

HTTP has two stages:

  • Request
    • URL (Host + Path)
    • Method (VERB)
    • Headers
    • Content
    • Cookies
  • Response
    • Status Code
    • Headers
    • Content

Modern (easy) APIs put most of the behavior in the URL
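The request half can be sketched in httr2 without sending anything. The URL below is an assumed endpoint for the Art Institute of Chicago API, used only to illustrate host, path, method, and headers.

```r
library(httr2)

# Assumed endpoint for illustration: host = api.artic.edu, path = /api/v1/artworks/27992
req <- request("https://api.artic.edu/api/v1/artworks/27992") |>
  req_method("GET") |>                       # Method (VERB)
  req_headers(Accept = "application/json")   # Headers

# Printing the request shows the method, URL, and headers; nothing is sent yet
print(req)

# req_dry_run(req) would instead print the raw HTTP message, still without sending it
```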

HTTP in the Browser

In Firefox: Right-Click + Inspect

In Chrome: Right-Click + Developer Tools

HTTP with httr2

httr2 (pronounced “hitter-2”) provides low-level tools for building and sending HTTP requests.

library(httr2)
request(example_url())

Pretty simple so far:

  • example_url() starts a tiny local web server
  • 127.0.0.1 is localhost

Live Demo Time

httr2 Requests

Build a request:

  • req_method
  • req_body_*
  • req_cookies_set
  • req_auth_basic / req_oauth

Behaviors:

  • req_cache
  • req_timeout

Execution:

  • req_perform
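Put together, a request pipeline might look like the sketch below. It runs against the local test server from example_url(), which assumes the webfakes package is installed, and it assumes that server exposes a /get path. The 10-second timeout is an arbitrary choice.

```r
library(httr2)

req <- request(example_url()) |>
  req_url_path("/get") |>                      # assumed path on the local test server
  req_method("GET") |>                         # build: HTTP verb
  req_headers(Accept = "application/json") |>  # build: headers
  req_timeout(10)                              # behavior: give up after 10 seconds

# Nothing is sent until req_perform() is called
resp <- req_perform(req)
resp
```

Each req_* function returns a modified request, so the steps compose with the pipe and the request can be inspected before it is sent.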

httr2 Responses

Response status

  • resp_status / resp_status_desc

Content:

  • resp_header*
  • resp_body_*
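A short sketch of inspecting a response, again using the local example_url() server (assumes webfakes is installed and that the server exposes a /get path returning JSON):

```r
library(httr2)

resp <- request(example_url()) |>
  req_url_path("/get") |>
  req_perform()

# Status
resp_status(resp)        # numeric status code
resp_status_desc(resp)   # short text description of the status

# Headers
resp_header(resp, "Content-Type")

# Body, parsed from JSON into an R list
body <- resp_body_json(resp)
names(body)
```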

Exercise

Art Institute of Chicago API