| Breakout | Team |
|---|---|
| 1 | TBD |
Today: Lecture #07: Data Import and File System Usage; APIs
Charles Ramirez is our GTA
MP#03 - TBD
Due 2026-04-24 at 11:59pm ET
Topics covered:
We owe you:
Posted 2025-11-01
Course Project should be your main focus for the rest of the course
Week #10 - Mid-Semester Check-In Presentations:
When using ‘found’ data, two important questions to ask:
Context and Novelty:
A research project is not just summarization of other work: how can you contribute something new?
General Advice:
I’ll try to write up some informal advice on:
‘Plain text’ files:
Read into R with readr functions (e.g., read_csv)
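A minimal sketch of reading plain-text data with `read_csv`. The inline CSV text and its column names are made up for illustration; `I()` marks the string as literal data rather than a file path.

```r
library(readr)

# Inline CSV text standing in for a file on disk (hypothetical columns and values)
csv_text <- "name,price,rating
Candy A,0.50,71.2
Candy B,0.85,64.0
Candy C,0.25,48.9"

# I() tells readr to treat the string as literal data, not as a path
candy <- read_csv(I(csv_text), show_col_types = FALSE)

candy         # a tibble with one row per line of data
spec(candy)   # the column types readr guessed
```

The same call also accepts a file path or a URL in place of `I(csv_text)`.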
From FiveThirtyEight
Data can be found at https://raw.githubusercontent.com/fivethirtyeight/data/refs/heads/master/candy-power-ranking/candy-data.csv
Read into R (readr::read_csv) and make 3 plots:
The file system is the way your computer organizes and provides access to files:
- Example path: /STA9750-2025-FALL/data/mp01/data_file.csv
- Root directory: / on Mac/Linux; a drive on Windows (typically C:/)

Typically, all user files are stored in a “home directory”:
- /Users/YOURNAME on Mac (usually /home/YOURNAME on Linux)
- C:/Users/YOURNAME on Windows
- Contains Downloads, Desktop, Documents, etc.
- Abbreviated as ~, e.g. ~/Desktop or ~/STA9750-2026-SPRING

Two ways to specify a file:
- Absolute path: the full address, e.g. /Users/michaelweylandt/STA9750/docs/index.html
- Relative path: starts from the current working directory (getwd()) and gives directions
  - From STA9750, path is just docs/index.html
  - ./ means “this directory”: could also write ./docs/index.html
  - ../ means “up one level”: from STA9750/docs, source at ../index.qmd

Use the fs package to interact with the file system:
- Directories: dir_ls(), dir_create(), dir_exists(), dir_delete()
- Paths: path(), path_home(), path_abs(), path_rel()
- Files: file_create(), file_exists(), file_delete(), file_info()

Return to breakout rooms to practice file system usage:
- STA9750-2026-SPRING directory

URLs are an extension of file paths for the internet:
From abstrax.io
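Since URLs build on file paths, it may help to see the path conventions above in code first. This is a sketch using fs inside a temporary directory; the directory name `STA9750-demo` is made up for illustration, and the last step uses `httr2::url_parse()` to pull the path component out of the candy-data URL.

```r
library(fs)

# Work in a throwaway directory so nothing in your real home directory is touched.
# "STA9750-demo" is a hypothetical name used only for this sketch.
root <- path(path_temp(), "STA9750-demo")
dir_create(path(root, "docs"))
index_abs <- path(root, "docs", "index.html")
file_create(index_abs)

# Absolute path: the full address, starting from the root of the file system
is_absolute_path(index_abs)

# Relative path: directions starting from some directory (here, `root`)
index_rel <- path_rel(index_abs, start = root)
index_rel

# "../" means "up one level": docs/../index.qmd is the same place as index.qmd
up_one <- path_norm(path("docs", "..", "index.qmd"))
up_one

# Check what exists, then clean up
found <- file_exists(index_abs)
dir_ls(root, recurse = TRUE)
dir_delete(root)

# A URL has a path component too, and the same path tools apply to it
candy_url <- "https://raw.githubusercontent.com/fivethirtyeight/data/refs/heads/master/candy-power-ranking/candy-data.csv"
url_file <- path_file(httr2::url_parse(candy_url)$path)
url_file
```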

R’s basic download.file can be used for downloading simple files:
function (url, destfile, method, quiet = FALSE, mode = "w", cacheOK = TRUE,
extra = getOption("download.file.extra"), headers = NULL,
...)
NULL
Basic file download capabilities:
- url: source
- destfile: where on your computer to store it

Customizable behavior, but defaults often work well:
- method: what software to use in the background to download
- mode: is this a text or binary file
- cacheOK: are you ok with a cached version of the file
- headers: do you need to send any additional info in your request

Note use of relative path here, so saves in current working directory
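One possible way to wrap `download.file` is sketched below. The helper name `download_if_missing` is hypothetical, and `mode = "wb"` is a choice made for this sketch (it copies bytes unchanged, which is safe for both text and binary files) rather than the function's default.

```r
library(fs)

# Hypothetical helper: fetch `url` into `destfile` only if the file is not already there
download_if_missing <- function(url, destfile, ...) {
  if (!file_exists(destfile)) {
    # mode = "wb" writes the bytes exactly as received
    download.file(url, destfile = destfile, mode = "wb", ...)
  }
  invisible(destfile)
}

# Example call (left commented out because it needs internet access):
# download_if_missing(
#   "https://raw.githubusercontent.com/fivethirtyeight/data/refs/heads/master/candy-power-ranking/candy-data.csv",
#   "candy-data.csv"  # relative path: saved in the current working directory
# )
```

Skipping the download when the file already exists avoids asking the server for the same file on every run.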
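Be polite:
- Only download when !file_exists(destfile), so the same file is not requested repeatedly

JSON:
- JavaScript Object Notation
- Nested structure (dicts of dicts of dicts) rather than rectangular like R data.frames

Example: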
{
"data": {
"id": 27992,
"title": "A Sunday on La Grande Jatte — 1884",
"image_id": "1adf2696-8489-499b-cad2-821d7fde4b33"
},
"config": {
"iiif_url": "https://www.artic.edu/iiif/2"
}
}
Read JSON in R with jsonlite package (alternatives exist)
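A short sketch of `jsonlite::fromJSON` on three small inputs: a trimmed copy of the artwork example (nested objects become nested lists), an array of strings, and an array of objects. The printed results that follow in these notes look like what calls of this kind produce, though the exact original code is an assumption.

```r
library(jsonlite)

# Trimmed copy of the artwork example; valid JSON allows no trailing commas
artwork_json <- '{
  "data": {
    "id": 27992,
    "image_id": "1adf2696-8489-499b-cad2-821d7fde4b33"
  },
  "config": {
    "iiif_url": "https://www.artic.edu/iiif/2"
  }
}'

# Nested JSON objects become nested R lists, accessed with $
artwork <- fromJSON(artwork_json)
artwork$data$id
artwork$config$iiif_url

# A JSON array of strings simplifies to a character vector; null becomes NA
characters <- fromJSON('["Mario", "Peach", null, "Bowser"]')
characters

# A JSON array of objects simplifies to a data frame; missing fields become NA
people <- fromJSON('[
  {"Name": "Mario", "Age": 32, "Occupation": "Plumber"},
  {"Name": "Peach", "Age": 21, "Occupation": "Princess"},
  {},
  {"Name": "Bowser", "Occupation": "Koopa"}
]')
people
```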
[1] "Mario" "Peach" NA "Bowser"
Name Age Occupation
1 Mario 32 Plumber
2 Peach 21 Princess
3 <NA> NA <NA>
4 Bowser NA Koopa
$type
[1] "general"
$setup
[1] "Did you hear about the chameleon who couldn't change color?"
$punchline
[1] "They had a reptile dysfunction."
$id
[1] 85
Compare to browser access
HTTP
- Other protocols: ftp, smtp, ssh, …
- “Low-level” mechanism of internet transfer
- R packages add a friendly UX
- httr2 for low-level work (today)

HTTP has two stages: a request, then a response
Modern (easy) APIs put most of the behavior in the URL
In Firefox: Right-Click + Inspect
In Chrome: Right-Click + Developer Tools
httr2 (pronounced “hitter-2”) provides low-level manipulation of HTTP.
<httr2_request>
GET http://127.0.0.1:57999/
Body: empty
Pretty simple so far:
- example_url() starts a tiny local web host
- 127.0.0.1 is localhost

Build a request:
- request
- req_method
- req_body_*
- req_cookies_set
- req_auth_basic / req_oauth

httr2 Requests

Behaviors:
- req_cache
- req_timeout

Execution:
- req_perform

httr2 Responses

Request status:
- resp_status / resp_status_desc

Content:
- resp_header*
- resp_body_*

Demo: Using httr2 to get a random joke from
See Lab #09
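The request and response helpers listed above can be sketched without touching the network. The host `api.example.com` and its path are placeholders, not a real API, and the response is built by hand with httr2's `response()` testing helper as a stand-in for what `req_perform()` would return.

```r
library(httr2)

# Build a request step by step; nothing is sent until req_perform() is called.
# The host and path below are placeholders for illustration only.
req <- request("https://api.example.com") |>
  req_url_path_append("jokes", "random") |>
  req_url_query(format = "json") |>
  req_method("GET") |>
  req_headers(Accept = "application/json") |>
  req_timeout(10)

req       # prints a summary of the request
req$url   # the path and query are folded into the URL

# A hand-built response standing in for the result of req_perform(req)
resp <- response(
  status_code = 200,
  headers = list("Content-Type" = "application/json"),
  body = charToRaw('{"id": 85, "type": "general"}')
)

status <- resp_status(resp)               # numeric status code
status_desc <- resp_status_desc(resp)     # short text description of the status
content_type <- resp_header(resp, "Content-Type")
joke <- resp_body_json(resp)              # JSON body parsed into an R list
joke$id
```

With a real API, replacing the hand-built `resp` with `req_perform(req)` leaves the `resp_*` calls unchanged.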
Web Data Access
- download.file

Upcoming work from course calendar
Topics for after presentations: