Due at midnight tonight - take a moment to do it now if you haven’t already!
Brightspace - Wednesdays at 11:45
Next (and final!) pre-assignment is December 4th
Thank you for FAQs and (honest) team feedback. Keep it coming!
Returned:
We owe you:
FYI: At the end of the course, I curve individual peer grades.
Example: If grader \(X\) is, on average, 5 points lower than other graders, I re-center all of their grades, raising their gradees' scores by an average of 1.25 points.
Try to be consistent over the semester so I can calibrate this correctly.
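To see where a figure like 1.25 can come from, here is a minimal sketch of the arithmetic. It assumes (hypothetically) that each gradee is scored by four peer graders and that the peer grade is the mean of those four scores. Neither detail is stated above, and the scores are made up.

```r
# Hypothetical scores for one gradee from four peer graders.
# Grader "X" is assumed to score 5 points below the others on average.
scores <- c(X = 80, A = 85, B = 85, C = 85)
grader_offset <- c(X = -5, A = 0, B = 0, C = 0)  # assumed average deviation per grader

# Re-center: remove each grader's average deviation from their scores
recentered <- scores - grader_offset[names(scores)]

raw_mean    <- mean(scores)      # 83.75
curved_mean <- mean(recentered)  # 85
curved_mean - raw_mean           # 1.25, i.e. 5 points spread over 4 graders
```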
Check your GitHub notifications, via email or at https://github.com/notifications, so that you see all peer feedback assignments.
I tag you in other folks’ repos when you are supposed to review them
People tagged in your repo are evaluating you
Change: MW Thursday Zoom OH now 4:00pm to 5:00pm
Nov 27 - Thanksgiving Holiday (No Class on Nov 28)
Good - Purpose (“Business Logic”):
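A minimal sketch of a purpose comment follows. The function `late_penalty` and the policy it encodes are hypothetical, invented only to illustrate a comment that records why a rule exists rather than what the code does.

```r
# Hypothetical example: the comment states the rule behind the code,
# which a reader could not recover from the arithmetic alone.
late_penalty <- function(score, days_late) {
  # Policy (assumed for illustration): lose 10% per day late,
  # but never more than half of the earned score.
  penalty <- min(0.10 * days_late, 0.50)
  score * (1 - penalty)
}

late_penalty(90, days_late = 2)   # 72
late_penalty(90, days_late = 10)  # 45, the cap applies
```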
Good - Higher Level Structure (Example from the googledrive package):
# https://github.com/gaborcsardi/rencfaq#with-base-r
write_utf8 <- function(text, path = NULL) {
  # sometimes we use writeLines() basically to print something for a snapshot
  if (is.null(path)) {
    return(base::writeLines(text))
  }

  # step 1: ensure our text is utf8 encoded
  utf8 <- enc2utf8(text)
  upath <- enc2utf8(path)

  # step 2: create a connection with 'native' encoding
  # this signals to R that translation before writing
  # to the connection should be skipped
  con <- file(upath, open = "w+", encoding = "native.enc")
  withr::defer(close(con))

  # step 3: write to the connection with 'useBytes = TRUE',
  # telling R to skip translation to the native encoding
  base::writeLines(utf8, con = con, useBytes = TRUE)
}
More Advice on StackOverflow
When to use the ^ and $ anchors?
^ matches the start and $ the end of a line (by default in stringr, the start and end of the whole string).
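A short sketch of how the anchors change what matches. The input strings are made up for illustration.

```r
library(stringr)

x <- c("STA9750", "take STA9750", "STA9750 rocks")

str_detect(x, "STA9750")    # TRUE  TRUE  TRUE   (match anywhere)
str_detect(x, "^STA9750")   # TRUE  FALSE TRUE   (must start the string)
str_detect(x, "STA9750$")   # TRUE  TRUE  FALSE  (must end the string)
str_detect(x, "^STA9750$")  # TRUE  FALSE FALSE  (must be the whole string)
```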
str_split
str_detect has a negate option:
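The original example for negate is not preserved here, so the following is a small stand-in with made-up data. It shows that `negate = TRUE` flips the result, returning TRUE for the strings that do not match.

```r
library(stringr)

fruits <- c("apple", "banana", "cherry")

str_detect(fruits, "an")                 # FALSE TRUE  FALSE
str_detect(fruits, "an", negate = TRUE)  # TRUE  FALSE TRUE

# Equivalent to negating the result yourself
!str_detect(fruits, "an")
```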
str_detect vs str_match vs str_extract
str_detect: is there a ‘fit’?
str_extract: extract the whole ‘fit’
str_match: extract specific groups
str_extract(group=) is useful for complex data extraction.
x <- c("Michael Weylandt teaches STA9750", "KRR teaches STA9891")
pattern <- c("(.*) teaches (.*)")
stringr::str_extract(x, pattern, group=1)
[1] "Michael Weylandt" "KRR"
stringr::str_extract(x, pattern, group=2)
[1] "STA9750" "STA9891"
(Not sure what negatives do here…)
Also allows named groups:
x <- c("Michael Weylandt teaches STA9750 on Thursday", "KRR teaches STA9891 on Wednesday")
pattern <- c("(?<instructor>.*) teaches (?<course>.*) on (?<weekday>.*)")
stringr::str_match(x, pattern) |> as.data.frame()
                                            V1       instructor  course   weekday
1 Michael Weylandt teaches STA9750 on Thursday Michael Weylandt STA9750  Thursday
2             KRR teaches STA9891 on Wednesday              KRR STA9891 Wednesday
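To put str_detect, str_extract, and str_match side by side, here is a small sketch on one input. The pattern (letters followed by digits) is chosen only for illustration.

```r
library(stringr)

x <- "KRR teaches STA9891"
pattern <- "([A-Z]+)([0-9]+)"  # capital letters followed by digits

str_detect(x, pattern)   # TRUE: is there a match at all?
str_extract(x, pattern)  # "STA9891": the whole match
m <- str_match(x, pattern)
m                        # one row: whole match, then each group
```

str_match returns a character matrix with one row per input string: the first column holds the whole match and the remaining columns hold the capture groups (here "STA" and "9891").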
Why the ? symbol? (Quantifiers)
Quantifiers (multiple matches):
.{a,b}: anywhere from a to b copies (inclusive)
.{,b}: no more than b copies
.{a,}: at least a copies
.?: zero-or-one, same as .{0,1}
.*: zero-or-more, same as .{0,}
.+: one-or-more, same as .{1,}
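A quick sketch of the quantifiers in action. The test strings are made up. Each pattern is anchored with ^ and $ so the quantifier must account for every "a" in the string.

```r
library(stringr)

y <- c("b", "ba", "baa", "baaa")  # "b" followed by 0, 1, 2, 3 copies of "a"

str_detect(y, "^ba{1,2}$")  # FALSE TRUE  TRUE  FALSE  (one to two copies)
str_detect(y, "^ba{2,}$")   # FALSE FALSE TRUE  TRUE   (at least two)
str_detect(y, "^ba?$")      # TRUE  TRUE  FALSE FALSE  (zero or one)
str_detect(y, "^ba*$")      # TRUE  TRUE  TRUE  TRUE   (zero or more)
str_detect(y, "^ba+$")      # FALSE TRUE  TRUE  TRUE   (one or more)
```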
stringr vs grep / grepl
Ultimately the same functionality, but stringr has a more consistent interface.
Conversion table online
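As a rough illustration of the interface difference (made-up data), base R puts the pattern first while stringr always puts the data first, which is what makes stringr easy to pipe. The two use different regex engines, so results can differ for more exotic patterns. For simple ones like these they agree.

```r
library(stringr)

x <- c("apple", "banana", "cherry")

grepl("an", x)       # base R: pattern first, then data
str_detect(x, "an")  # stringr: data first, then pattern
same_detect <- identical(grepl("an", x), str_detect(x, "an"))

sub("a", "A", x)          # base R: replace the first match
str_replace(x, "a", "A")  # stringr equivalent
same_replace <- identical(sub("a", "A", x), str_replace(x, "a", "A"))
```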
All stringr functions work well in dplyr pipelines (“vectorized”):
library(dplyr); library(stringr)
df <- data.frame(lower_letters = letters)
df |> mutate(upper_letters = str_to_upper(lower_letters))
lower_letters upper_letters
1 a A
2 b B
3 c C
4 d D
5 e E
6 f F
7 g G
8 h H
9 i I
10 j J
11 k K
12 l L
13 m M
14 n N
15 o O
16 p P
17 q Q
18 r R
19 s S
20 t T
21 u U
22 v V
23 w W
24 x X
25 y Y
26 z Z
If you know the source encoding:
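The original example for this case is not preserved here. One common base R approach, shown as a sketch, is iconv() with the source encoding given explicitly. The input bytes below are an assumed example: the word "café" as stored in Latin-1.

```r
# "café" in Latin-1: é is the single byte 0xE9 (assumed example input)
x <- rawToChar(as.raw(c(0x63, 0x61, 0x66, 0xE9)))

# Convert from the known source encoding to UTF-8
utf8 <- iconv(x, from = "latin1", to = "UTF-8")

charToRaw(utf8)  # 63 61 66 c3 a9: é is now the two-byte UTF-8 sequence
```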
If you don’t know the source, ….
Order | Team | Order | Team
---|---|---|---
1 | Rat Pack | 6 | Ca$h VZ
2 | Subway Surfers | 7 | Listing Legends
3 | Chart Toppers | 8 | TDSSG
4 | Metro Mindset | 9 | Broker T’s
5 | Apple Watch | 10 | EVengers
Comments
Bad - Trivial:
Bad - Opaque:
Bad - Redundant / Explaining Code
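The original examples for these categories are not preserved here. The snippets below are hypothetical stand-ins, one per category, and reflect one reading of what each label means.

```r
# Trivial: the comment restates something obvious
x <- 1  # set x to 1

# Opaque: the comment points to context the reader does not have
threshold <- 0.37  # per the thing we discussed

# Redundant / explaining code: the comment narrates what the code already says
values <- c(1, 2, 3)
total <- sum(values)  # sum up the values
```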