Today:
Lecture #03: Coding in R: Variables, Control Flow, Packages, Function Calls
Mini-Project #00 is due 2025-09-12 at 11:59pm ET
As of 2025-09-11:
Recall that you need to post on both GitHub and Piazza for this assignment - sending your GitHub ID is the most important step
Mini-Project #00 is ungraded to:
Lessons learned will be helpful in future assignments
Some sites look really great - excited to see headshots, resumes, etc. and to learn more about you
On 2025-09-15, I will assign peer-feedback on GitHub:
Aims:
“Good artists copy; great artists steal.” – Steve Jobs
More discussion next week
Asynchronous Support: Piazza
Synchronous Support: Office Hours
Pre-Assignment #03
Pre-Assignment #04
R: Data Frames, Functions, Packages and Control Flow
In Mini-Project #00, your goal is to get rendered HTML to GitHub
Once you have done this once, the cycle becomes easier
For ongoing changes / updates:
Render button in RStudio - generate new / changed HTML
Commit and use modal to make a Commit - commit a new version in Git
Push - send new version to GitHub. Website will update automatically
Tips:
docs
Storage is cheap - your time isn’t
Live demo
Full description online
Process:
First step: By 2025-09-30, email me your group members.
Proposal Presentations:
R
It’s now time for us to start writing code in R
No more copy and trust
Goals:
Three models of executing code:
Console
.R file
qmd file
Basic arithmetic in R runs as expected
PEMDAS Ordering: Parentheses, Exponentials, Multiplication/Division, Addition/Subtraction
\[3^{2 * 5 - 1} / 24^5\]
When in doubt, extra parentheses don’t hurt
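As a quick check of the ordering rules, the expression above can be typed into the console roughly as follows. This is only a sketch; the variable names `implicit` and `explicit` are illustrative and not from the lecture.

```r
# The whole exponent 2 * 5 - 1 must be wrapped in parentheses,
# otherwise R would compute 3^2 first and then multiply by 5.
implicit <- 3^(2 * 5 - 1) / 24^5

# Same calculation with every grouping spelled out.
# The extra parentheses change nothing, they only make the intent visible.
explicit <- (3^((2 * 5) - 1)) / (24^5)

implicit
explicit
```

Both lines compute 3^9 divided by 24^5, so they print the same value.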
To go beyond arithmetic, need to invoke functions
\[ \cos(\pi) + \tan\left(\frac{\pi}{4}\right) + \sqrt{\sin(\pi/2)} - e^1\]
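One way to write this expression in R is sketched below. It assumes only base R: `pi` is a built-in constant, and `exp(1)` is used to obtain \(e\). The name `value` is illustrative.

```r
# Each mathematical function maps to an R function of the same name.
# exp(1) is e raised to the power 1, i.e. Euler's number.
value <- cos(pi) + tan(pi / 4) + sqrt(sin(pi / 2)) - exp(1)

value
```

Mathematically this is \(-1 + 1 + 1 - e = 1 - e\), so the printed value should be close to -1.718.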
All function calls have a fundamental syntax:
name()
e.g.,
To get help with any function in R, type ?name
Most interesting functions require input:
name(argument)
Here, the argument is passed as input to the function:
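A minimal sketch of both call shapes, using two base R functions chosen here for illustration (`Sys.Date` and `sqrt`); the variable names are not from the lecture.

```r
# A call with no arguments: the parentheses are still required.
today <- Sys.Date()

# A call with one argument: 16 is passed as input to sqrt.
root <- sqrt(16)

today
root

# In an interactive session, ?sqrt would open the help page for sqrt.
```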
Type a name without () to see its implementation
and
function (formula, data, subset, weights, na.action, method = "qr",
model = TRUE, x = FALSE, y = FALSE, qr = TRUE, singular.ok = TRUE,
contrasts = NULL, offset, ...)
{
ret.x <- x
...
Conceptually, sqrt vs sqrt(4) is the “concept of square rooting” vs “the actual square root of 4, i.e., 2”
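The distinction can be seen directly in the console. A small sketch, with `result` as an illustrative name:

```r
# Without parentheses, sqrt refers to the function object itself.
is.function(sqrt)

# With parentheses and an argument, the function is actually called.
result <- sqrt(4)
result
```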
Most important for users are the first few lines (args)
function (x, base = exp(1))
Two arguments:
x: the input
base: optional 2nd argument with default \(e\) (natural log)
Every argument has a name, but the name is not always required
R is usually smart about knowing what you meant
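The signature shown above, `function (x, base = exp(1))`, matches what `args(log)` prints, so `log` is assumed here as the function under discussion. The sketch below shows positional matching, named matching, and the default; the variable names are illustrative.

```r
# Positional matching: the first value is x, the second is base.
by_position <- log(100, 10)

# Named matching: once arguments are named, their order no longer matters.
by_name <- log(base = 10, x = 100)

# Omitting base falls back to the default exp(1), i.e. the natural log.
natural <- log(exp(1))

by_position
by_name
natural
```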
Often when dealing with data, we want to transform related data similarly:
It is dangerous to transform only part of the data
Most functions in R try to vectorize, but this is not always possible
[1] 1.000000 1.414214 1.732051 2.000000 2.236068 2.449490 2.645751 2.828427
[9] 3.000000 3.162278
[1] 0.7071068 0.0000000 -0.7071068 -1.0000000 -0.7071068 0.0000000 0.7071068
[8] 1.0000000
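The calls that produced the two outputs above are not shown. The sketch below is a guess that yields the same values: `sqrt` applied to `1:10`, and `cos` applied to multiples of \(\pi/4\). All variable names are illustrative.

```r
# One call transforms every element; no loop is needed.
inputs <- 1:10
roots <- sqrt(inputs)
roots

# Assumed input for the second output: pi/4, 2*pi/4, ..., 8*pi/4.
angles <- (1:8) * pi / 4
cosines <- cos(angles)

# Rounding hides tiny floating-point noise around zero.
round(cosines, 7)
```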
The [1] you sometimes see is R just letting you know where in a vector you are
[1] 1.000000 1.414214 1.732051 2.000000 2.236068 2.449490 2.645751 2.828427
[9] 3.000000 3.162278 3.316625 3.464102 3.605551 3.741657 3.872983 4.000000
[17] 4.123106 4.242641 4.358899 4.472136 4.582576 4.690416 4.795832 4.898979
[25] 5.000000
When we work with 2D data (later), we get row and column indices
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    6   11   16   21
[2,]    2    7   12   17   22
[3,]    3    8   13   18   23
[4,]    4    9   14   19   24
[5,]    5   10   15   20   25
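A matrix like the one above can be built with `matrix()`, which fills values column by column by default. The original call is not shown, so this is an assumed reconstruction, and `grid` is an illustrative name.

```r
# Fill a 5 x 5 matrix with 1..25, one column at a time.
grid <- matrix(1:25, nrow = 5)
grid

# Row index first, column index second.
grid[2, 3]
```

Here `grid[2, 3]` picks the value in row `[2,]` and column `[,3]` of the printout, which is 12.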
Often, we want to save several values as a single ‘thing’
If you need to write a vector ‘by hand’, use the c function:
then
We won’t usually hand-write vectors like this:
Use the [] operator to get individual elements of a vector:
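A short sketch of building a vector by hand and pulling out single elements. The values and the name `scores` are made up for illustration.

```r
# c() combines individual values into one vector.
scores <- c(90, 85, 77)

# R counts from 1, so [1] is the first element.
scores[1]
scores[3]

# length() reports how many elements the vector holds.
length(scores)
```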
New topics to cover:
Everything in R has a type or class:
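The `class()` function reports the class of any object. The examples below are illustrative picks, not the lecture's own list.

```r
# Collect the class of a few common kinds of values.
classes <- c(
  whole    = class(1L),      # the L suffix marks an integer
  decimal  = class(1.5),
  text     = class("hello"),
  flag     = class(TRUE),
  callable = class(sqrt)     # functions are objects too
)

classes
```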
A package is a set of code (and data) packaged up for distribution and use
R has many helpful packages - these are distributed via CRAN (currently over 20,000)
Using packages is a two-step process:
Install the package (once)
Load the package into R (every time)
Think of regular software: you download MS Office once but need to start it whenever you want to use it
The install.packages function will download and install a package:
If that package uses other packages, R will sort that out automatically
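The two steps can be sketched as follows. The package name `dplyr` is only an example and the install line is commented out so the sketch runs without network access; the load step uses `tools`, which ships with R itself.

```r
# Step 1 (once per computer): download and install from CRAN.
# install.packages("dplyr")

# Step 2 (every session): load an installed package.
library(tools)

# After loading, the package's functions can be called by name.
extension <- file_ext("report.qmd")
extension
```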
The best comments don’t just say what you are doing. They say why you are doing something in the way it is being done
More discussion of comments later as we write more complex code
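A small illustration of the difference between a "what" comment and a "why" comment. The scenario (an interest rate stored in percent) and the variable names are hypothetical.

```r
rate_percent <- 4.5

# "What" comment (adds little): divide rate_percent by 100.
rate <- rate_percent / 100

# "Why" comment (more useful): the input is quoted in percent, but the
# growth calculation below multiplies by a proportion, so convert first.
rate <- rate_percent / 100

balance_after_one_year <- 1000 * (1 + rate)
balance_after_one_year
```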
Default vector printing:
Each line gets a new index:
[1] 1.000000 1.414214 1.732051 2.000000 2.236068 2.449490 2.645751 2.828427
[9] 3.000000 3.162278
More complex objects have alternate print styles:
Print width is controlled by getOption("width").
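To see the effect of the width option, the sketch below prints the same vector at two widths and counts the printed lines. The widths 40 and 80 and all variable names are chosen for illustration.

```r
x <- sqrt(1:10)

# options() returns the previous settings so they can be restored later.
old <- options(width = 40)
narrow_lines <- capture.output(print(x))

options(width = 80)
wide_lines <- capture.output(print(x))

# Restore whatever width was in effect before.
options(old)

length(narrow_lines)
length(wide_lines)
```

A narrower width fits fewer values per line, so the narrow printout needs more lines, each starting with its own index.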
Warning in x + y: longer object length is not a multiple of shorter object
length
[1] 5 7 7
Not a problem per se, but often a sign that something has gone wrong.
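The vectors behind the warning above are not shown. One pair that gives the same result is sketched below, with `x` and `y` as assumed values: R recycles the shorter vector, and warns because 3 is not a multiple of 2.

```r
x <- c(1, 2, 3)
y <- c(4, 5)   # recycled to 4, 5, 4 to match the length of x

# tryCatch records the warning text instead of printing it.
message_text <- tryCatch(
  {
    x + y
    "no warning"
  },
  warning = function(w) conditionMessage(w)
)
message_text

# The sum itself is still computed: 1 + 4, 2 + 5, 3 + 4.
total <- suppressWarnings(x + y)
total
```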
Most built-in functions can’t / shouldn’t be changed.
Some allow alternate behavior via additional arguments:
If you want different behavior, write your own function:
Always try ?name to see documentation.
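As an illustration of both approaches, the sketch below uses `mean`, whose `na.rm` argument changes how missing values are handled, and then wraps that choice in a small function. The choice of `mean` and the name `mean_without_na` are assumptions for this example.

```r
values <- c(1, NA, 3)

# Default behavior: a missing value makes the result missing.
default_mean <- mean(values)

# Alternate behavior via an additional argument.
cleaned_mean <- mean(values, na.rm = TRUE)

# A custom function that always drops missing values.
mean_without_na <- function(x) {
  mean(x, na.rm = TRUE)
}

default_mean
cleaned_mean
mean_without_na(values)
```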
Three key commands:
git add: add some changes to a ‘box’
git commit: seal the ‘box’
git push: send the ‘box’ to GitHub
The Git pane in RStudio shows uncommitted changes, not files.
If a file ‘vanishes’ after a commit, that’s good!
Introduction to R:
Upcoming work from course calendar
Course Project:
As a student, you have many free and discounted benefits.
I have collected some of these on the course page, but there are many more if you look around.
Places love to give discounts to students - use them!
Comments
Comments are text inside the code that R ignores
Compare
with