Today:
Lecture #02: Tools for Reproducible Research
Final enrollment: 98(!)
Lots of students: please follow instructions closely so I can automate course management.
Course project draft description is now online
Detailed discussion of:
Will be finalized next week - 2025-09-09.
Please send me questions in advance!
First step: by 2025-09-30, email me your group members.
GTA - Charles Ramirez
Responsibilities TBD. Likely:
Won’t start for a few weeks - CUNY is slow
Instructor Tip: Before committing to a team with someone, you can look up their GitHub and see how they did on MP#00 and MP#01. This can help you find teammates whose standards are calibrated to your own.
Pre-Assignment #02
Pre-Assignment #03
Verification of Enrollment - Required to stay enrolled in class
New optional DataCamp course:
R Markdown is proto-Quarto:
Markdown:
Source code:
Output:
This is what Markdown looks like!
I can write in italics or bold.
Lists are easy:
- First
- Second
- Third
as are links to helpful info.
Markdown is just a language:
Pandoc can essentially convert anything to anything:
Markdown is a WYSIWYM format:
Extensions allow finer control of formatting
pandoc for finer control (not in this class)
In this class, web sites:
Each qmd document becomes a single html file (page)
Resources:
Official Cheat Sheets:
Data Camp RStudio Tutorial (Free)
RStudio has native Quarto support: just hit “Render”
What is Markdown?
Per Wikipedia: “Markdown is a light-weight, plain-text, markup language specification”
Other than text formatting, does Markdown have any other uses?
On its own, Markdown is just text formatting (but that’s a lot!)
We will use Quarto which augments markdown for reproducible research.
We can embed code, and its output, inside Markdown documents.
What uses Markdown?
So much! Used by Bitbucket, GitHub, OpenStreetMap, Reddit, Stack Exchange, Drupal, ChatGPT, Discord, MS Teams and many more!
Tools like Pandoc/Quarto render Markdown to:
What is the difference between a Code section and a Nested List? Are they just different ways of indenting?
No. Nested lists are ‘just’ text
Code formatting enables much more if the rendering engine supports it:
How are we going to use Markdown?
All written work (mini-projects and final project) in this course will be submitted using Markdown (by way of Quarto).
Specifically:
You are also encouraged (but not required) to use Markdown for presentation slides (like these!)
How can I create Tables in Markdown?
Markdown has two table syntaxes:
If you are making complex tables, I recommend using the list-table extension.
(See syllabus.qmd in the course repo for examples.)
How to create images and links?
Basic hyperlinks look like this:
[link text](https://the.url/goes/here)
If you want to embed the contents of a link, prepend it with an exclamation point. This is most useful for images:

You can even put a link inside an image to be fancy:
[](https://en.wikipedia.org/wiki/Elephant)
Quarto automatically embeds the results of plotting code:
Here, Quarto handles all the file creation and link targeting for us. If I change the code, the figure will change automatically.
git is the industry-standard tool for source code management (SCM)
Courtesy of PhD Comics, this is what we want to avoid:
Git tracks changes to files
A branch name (e.g. main) is just a stand-in for “latest commit”
For this course, I like the mental model of a box:
git add: Put changes in the box
git commit: Seal the box and put shipping info on the outside
git push: Send the box (to GitHub)
From GitHub user @design-web-victo:
Once you create a project, RStudio has built-in git integration
M are files that have been Modified since the last commit
? are files I have never told git to track
Check the box next to a file name to stage (or add) it in git
Create a commit with the Commit button
Commit a second time
git push (green up arrow button) to move files to GitHub
Resulting commit now visible on GitHub
Automatically updates website as well
File no longer listed in Git pane - no new changes
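For reference, the same stage / commit / push cycle can be run from a terminal. This is only a sketch under some assumptions: a throwaway local "remote" stands in for GitHub, and the file name, commit message, and author details are made up for illustration.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch locations: a bare local repo plays the role of GitHub.
work=$(mktemp -d)
git init --quiet --bare "$work/remote.git"
git init --quiet "$work/box"
cd "$work/box"
git symbolic-ref HEAD refs/heads/main   # call the branch main whatever the local default is
git remote add origin "$work/remote.git"

# Make a change worth shipping (illustrative file name).
echo "# Mini Project" > index.qmd

git add index.qmd                       # put the change in the box
git -c user.name="Student" -c user.email="student@example.com" \
    commit --quiet -m "Add index.qmd"   # seal the box and label it
git push --quiet origin main            # send the box to the remote

git status --short                      # prints nothing: no new changes
```

In RStudio, the check box, the Commit button, and the green up arrow correspond to the three git commands above.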
For us, GitHub is just a “dumping ground”
Push to GitHub from a single computer
Warning: Do not make any edits using the GitHub website. This will cause tons of pain. All changes should be made from one computer and pushed upwards only.
GitHub provides GitHub Pages
We will use GitHub Pages for all work submission in this class
Using qmd files allows automatic checks
By default, URLs look like:
https://<GITHUB_ID>.github.io/<GITHUB_REPONAME>
If you create a repo called <GITHUB_ID>.github.io, the URL simplifies to:
https://<GITHUB_ID>.github.io/
If you own a domain, you can map to an arbitrary base URL:
https://<DOMAIN_NAME>/<GITHUB_REPONAME>
E.g., https://michael-weylandt.com/STA9750
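A small sketch of the default URL rules above, in case it helps to see them spelled out. The function name pages_url is made up for illustration, and custom domains are deliberately ignored.

```shell
#!/bin/sh
# pages_url GITHUB_ID [REPO]: print the default GitHub Pages URL.
# (Hypothetical helper; does not handle custom domains.)
pages_url() {
  id=$1
  repo=${2:-}
  # A repo named <GITHUB_ID>.github.io is served from the site root.
  if [ -z "$repo" ] || [ "$repo" = "$id.github.io" ]; then
    printf 'https://%s.github.io/\n' "$id"
  else
    printf 'https://%s.github.io/%s\n' "$id" "$repo"
  fi
}

pages_url michaelweylandt STA9750                     # https://michaelweylandt.github.io/STA9750
pages_url michaelweylandt michaelweylandt.github.io   # https://michaelweylandt.github.io/
```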
Throughout course instructions, you will see things like <GITHUB_ID>
For me, if instructions say
STA 9750 <GITHUB ID> Mini Project #00
I will type
STA 9750 michaelweylandt Mini Project #00
(no brackets)
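The substitution rule can be sketched mechanically as below. The helper name fill_id is hypothetical, and it assumes the ID contains only letters, digits, and hyphens (so it is safe inside a sed replacement).

```shell
#!/bin/sh
# fill_id TEMPLATE GITHUB_ID: replace each <GITHUB_ID> (or <GITHUB ID>) placeholder,
# angle brackets included, with the given ID. Other placeholders are left alone.
fill_id() {
  printf '%s\n' "$1" | sed -e "s/<GITHUB[_ ]ID>/$2/g"
}

fill_id 'STA 9750 <GITHUB ID> Mini Project #00' michaelweylandt
# STA 9750 michaelweylandt Mini Project #00
```

The brackets are part of the placeholder, so they disappear along with it.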
Professional programming is at least half looking things up; at beginning stages, the fraction is even higher.
So it’s important to know how to seek help the smart way:
R and its packages are no exception. Everything we will use in this class has solid documentation.
Most programming challenges have been faced by somebody before, so Google it!
Tips:
Include R or rstats in your search query
Use StackOverflow for specific code questions; blogs and course materials are better for “big picture” questions
Programming fora, like StackOverflow, are full of great resources. Most of what you need is already there. But if you need to ask a new question, make sure to create a minimal reproducible example (MRE).
Make it easy for your helper to help you.
Pro-Tip: You’ll solve over 50% of your problems in trying to create an MRE.
Tips:
The reprex R package helps with this: see this talk.
For this class, rely on Piazza!
Upcoming work from course calendar
Course Project:
In the near future, you may want recommendations from your professors. Some advice on getting good recommendations:
Professional: For a job. Typically a brief survey or quick reference check.
Academic: For awards, fellowships, research positions, graduate admissions. A personalized letter from me.
I can usually do professional recommendations quickly (4-5 business days). Academic recommendations require a minimum of two weeks.
A strong recommendation is one that supplements what is already in your application.
The better I know you, the stronger a letter I can write.
You can ask if a professor can write you a strong recommendation. If they say they can write, but it may not be strong, consider asking elsewhere.
Ugly:
Student N took my class and received an A.
Bad:
Student N took my XYZ class and received an A. My XYZ class covers ABC at an advanced level and provides N a strong foundation for your program.
Good:
I first met student N in my XYZ class. She was an active participant in class, regularly attended my office hours, and regularly asked for additional advanced material. She had a remarkable final project in which she ABC. It is clear that her drive and passion for XYZ make her an excellent candidate for your fellowship.
Under some parts of US law, you have the right to see the letters I write for you unless you specifically waive that right.
Waive the right
Readers will assume you have read a non-confidential letter and discount whatever I say.
I will write a letter for any student who has passed one of my classes (C or higher) or is on track to do so.
No guarantee it is strong unless you ask. I understand that sometimes you are just trying to make sure you have enough.
Give me at least 3 weeks, ideally more. A rushed letter is a short and generic letter.
Send me:
Once a letter is written, it’s easy for me to update and re-use.
Help me help you.