STA 9890 - Course Syllabus
All syllabus and course schedule provisions subject to change with suitable advance notice.
- Professor Michael Weylandt
- Department of Information Systems & Statistics
- Zicklin School of Business
- Baruch College, CUNY
Course Meetings
Lecture / Lab Sessions
- Tuesdays 6:05pm-9:00pm
- Baruch Main Campus (1 Bernard Baruch Way)
- Newman Vertical Campus (NVC) 5-175
Office Hours
- In-Person
  - Baruch Main Campus (1 Bernard Baruch Way)
  - Newman Vertical Campus (NVC) 11-246
  - Tuesdays 4:30pm-5:30pm
  - Subject to periodic cancellation (communicated via Brightspace)
- Virtual
  - Thursdays 4:30pm-5:30pm
  - Zoom link provided via Brightspace
STA 9890 is a(n advanced) master’s-level course in Statistical Machine Learning. As such, by the end of the course, students should be able to capably apply standard machine learning techniques to important scientific and business applications, to extend and adapt standard machine learning techniques in novel fashions, to implement complex estimation and prediction algorithms, and to critically engage with modern machine learning literature. The grading in this course reflects this diversity of objectives.
Specifically, the grading in this course draws on three separate modes of assessment: i) in-class ‘midterms’ which assess students’ fluency with the theoretical and mathematical aspects of machine learning; ii) three ‘research reports’ designed to build students’ skills in the computational and methodological aspects of machine learning; and iii) a course prediction competition, designed to build students’ skills in applications of machine learning to important scientific problems. All three elements are equally weighted as follows:
- 33.3% Mid-Semester Tests (Best two of three; 100 points each, 200 points total)
  - Test 1: 2025-03-04 - Theory of Supervised Learning; Regression
  - Test 2: 2025-04-08 - Classification; Ensemble Learning; Tree-Based Methods
  - Test 3: 2025-05-06 - Unsupervised Learning
- 33.3% Research Reports (Best two of three; 100 points each, 200 points total)
  - Report 1: Due 2025-03-07
  - Report 2: Due 2025-04-18
  - Report 3: Due 2025-05-09
- 33.3% Prediction Competition (200 points total)
Final course grades will be curved in accordance with relevant program, departmental, school, and college policies.[1]
Weekly Quizzes
In lieu of homework, I will provide a short list of practice problems and suggested readings following each lecture. At the start of the following lecture, a short three-question (\(\approx\) 15 minute) quiz will be administered. The quiz questions will not be verbatim from the practice problems, but if you can answer the practice problems quickly and fluently, the quiz should pose little difficulty.
These in-class weekly quizzes will generate extra credit applied to your final aggregate score. Each quiz will receive a score out of 3 added directly to your final score. Because the final aggregate score is out of 600, perfect scores on all 9 weekly quizzes can raise your final aggregate score (pre-curve) by up to 27 points, or 4.5%.
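As a rough illustration of how the pieces above combine, the following Python sketch computes a pre-curve aggregate score. The function name `aggregate_score`, its argument names, and the example scores are hypothetical and chosen only for illustration; the sketch assumes the "best two of three" rule simply drops the lowest score and that quiz points are added without any cap beyond the 9 quizzes themselves. It is not the official grade calculation.

```python
def best_two(scores):
    """Sum of the two highest scores (each out of 100)."""
    return sum(sorted(scores, reverse=True)[:2])


def aggregate_score(tests, reports, competition, quizzes=()):
    """Pre-curve aggregate score out of 600, plus quiz extra credit.

    tests, reports: up to three scores each, out of 100 (best two count).
    competition:    prediction competition score, out of 200.
    quizzes:        weekly quiz scores, each out of 3, added directly.
    """
    return best_two(tests) + best_two(reports) + competition + sum(quizzes)


# Hypothetical student: one weak test, one missed report, perfect quizzes.
score = aggregate_score(
    tests=[70, 85, 90],
    reports=[80, 0, 95],
    competition=160,
    quizzes=[3] * 9,
)
print(score)  # 537 (= 175 + 175 + 160 + 27)
```

In this example the missed report does not hurt the aggregate, since only the best two reports count, and the 27 quiz points are worth 27 / 600 = 4.5% of the total.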
To take part in the weekly quizzes, please come to class with both i) a black or blue pen; and ii) a red pen (for peer grading) each week.
Make-up opportunities for the weekly quizzes will only be allowed in exceptional and unforeseeable circumstances.
Regrading Policy
If you feel an assignment has been improperly graded, please contact the instructor by private message on the course discussion board within 48 hours of the graded assignment being returned. Note that the instructor will regrade the assignment de novo, so your grade may be adjusted upwards or downwards.
Late Work Policy
Students are given 4 Late Days that can be used at any time during the semester, without instructor pre-approval or permission, on any of the submitted elements of this course (competition final report, research reports). When Late Days are used, work is accepted without penalty; late work submitted beyond the four Late Days will not be accepted. Late Days are used on a “first come, first served” basis, so if 2 Late Days are used on each of the first two research reports, all four are exhausted and any late submission of the final research report will not be accepted.
Outside of the use of Late Days, no late work will be accepted. Late Days cannot be applied to in-class activities such as tests or presentations.
Individual exceptions will be made i) with prior written approval of the instructor; or ii) ex post with a written note from the Office of the Dean of Students.
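The Late Day budget can be pictured with a short Python sketch. The function name `apply_late_days` is hypothetical, and the sketch makes one assumption the policy does not spell out: a submission that would need more Late Days than remain is rejected and does not consume any of the remaining budget. Individual exceptions are not modeled.

```python
def apply_late_days(days_late, budget=4):
    """Return (accepted flags, remaining Late Days) for submissions in due-date order.

    days_late: number of days each submission was late (0 = on time).
    Assumption: a submission needing more Late Days than remain is
    rejected and leaves the remaining budget untouched.
    """
    accepted = []
    for days in days_late:
        if days <= budget:
            budget -= days  # on-time work (0 days) costs nothing
            accepted.append(True)
        else:
            accepted.append(False)
    return accepted, budget


# Two Late Days on each of the first two reports exhausts the budget,
# so a third report submitted one day late is not accepted.
print(apply_late_days([2, 2, 1]))  # ([True, True, False], 0)
```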
Attendance Policy
Attendance is not required for this course and absences do not need to be excused. Make-up opportunities for Quizzes and Tests missed while absent will not be provided except i) with prior written approval of the instructor; or ii) ex post with a written note from the Office of the Dean of Students.
Pre- and Post-Reading Suggestions
Students learn material most effectively when exposed to it on multiple occasions, ideally using alternative presentation strategies and formats.[2] To this end, suggested pre-reading and post-reading is provided for each week of the course. Students are encouraged to pre-read the recommended text, which typically presents that week’s material in a less technical / more intuitive manner, before each week’s course session. Similarly, students are encouraged to review the post-reading for each week after lecture to see additional examples of topics covered.
While lectures will focus primarily on ‘big picture’ and ‘major themes’, the recommended reading, especially the post-reading, provides additional coverage of relevant technical detail. Note that pre- and post-reading are entirely optional and will not be directly assessed. I have included (overlapping) presentations from various textbooks to encourage students to find the style and presentation that works best for them.
Tentative Course Schedule
| Week | Lecture Date | Topics | Pre-Reading | Post-Reading | Research Reports | Additional Notes |
|---|---|---|---|---|---|---|
| 1 | 2025-01-28 | Course Overview & Introduction to ML: | None | ISL §2; HR §1-4; UML §1-2, 5, 19 | | |
| 2 | 2025-02-04 | Regression I: | DFO §2.1-2.6, §3.1-3.4, §5.1-5.2, §5.8, §7.1, §7.3, §9.1-9.2; ISL §3 | BV §2-4; HR §5, 14; UML §12, §14 | Research Report #01 released on 2025-02-04 - due on 2025-03-07 | |
| 3 | 2025-02-11 | Regression II: | DFO §8.1-8.3; ISL §6 | BV §6; SLS §2, 4, 5; PML-1 §11.1-11.4; UML §11, 13, 25.1 | | |
| | 2025-02-18 | No class: Baruch on Monday Schedule (President’s Day Make-Up) | | | | |
| 4 | 2025-02-25 | Regression III: | DFO §12.4; ISL §7 | PML-1 §11.5-11.8, §17.1; UML §16 | | |
| 5 | 2025-03-04 | Mid-Term Test I: Regression; Introduction to Classification | | | | |
| 6 | 2025-03-11 | Classification I: | DFO §11.1-11.5; ISL §4 | PML-1 §10, 12; UML §9, 24 | Research Report #01 Due on 2025-03-07 (NB - Friday before class); Research Report #02 released on 2025-03-11 - due on 2025-04-18 | |
| 7 | 2025-03-18 | Classification II: | DFO §12.1-12.6; ISL §9 | SLS §3; PML-1 §9; UML §15 | | |
| 8 | 2025-03-25 | Ensemble Learning & Resampling Methods | ISL §5 | SF §1-3, 5; UML §10 | | |
| 9 | 2025-04-01 | Tree-Based Methods | ISL §8 | PML-1 §18; UML §18 | | |
| 10 | 2025-04-08 | Mid-Term Test II: Classification, Ensemble Learning, Tree-Based Methods; Introduction to Unsupervised Learning | | | | |
| | 2025-04-15 | No class: Baruch Spring Recess | | | | |
| 11 | 2025-04-22 | Unsupervised Learning I | DFO §3.5, §3.8, §4.2-4.6, §10.1-10.8; ISL §12.1-12.3 | SLS §7-8; PML-1 §20.1-20.2; UML §23.1 | Research Report #02 Due on 2025-04-18 (NB - Friday before class); Research Report #03 released on 2025-04-22 - due on 2025-05-09 | |
| 12 | 2025-04-29 | Unsupervised Learning II | ISL §12.4-12.5 | PML-1 §20.3-21.6; UML §22 | | |
| 13 | 2025-05-06 | Mid-Term Test III: Unsupervised Learning; Introduction to Generative Models | | | | |
| 14 | 2025-05-13 | Course Project Presentations; Course Wrap-Up | | | Research Report #03 Due on 2025-05-09 (NB - Friday before class) | |
Workload Expectations
The following approximate breakdown of expected course workload is intended to help you properly prepare for and schedule the out-of-class work associated with this course. Note that, pursuant to relevant Federal and State regulations, a 3-credit course taken over a 15-week semester should require approximately 6 hours of out-of-class work per week from an average student, or 90 hours total over the course of the semester.[3][4] This time should be roughly allocated as:
- Weekly Pre-Reading (9 hours - 1 hour per lecture)
- Weekly Post-Reading and Review (18 hours - 2 hours per lecture)
- Test Preparation (15 hours total - 5 hours per test)
- Research Reports (27 hours total - 9 hours each)
- Prediction Competition (21 hours total)
  - Ongoing Submissions (20 hours - approximately 2 per week)
  - Final Presentation Preparation (1 hour)
Note that, for the course prediction competition, you will need to make regular progress throughout the semester. If you attempt to “back-load” your work, you will do poorly.
Coding Requirements
STA 9890 is, at its heart, a machine learning course and as such use of a machine, i.e. coding, is required, even though there is no formal coding prerequisite for this course. Per the External Resources Policy, you are allowed (and encouraged) to use freely available coding assistance technologies, including generative tools like GitHub Copilot.[5]
You may use whatever programming language you prefer to complete the course assignments, subject to instructor approval: R, python, julia, and matlab are pre-approved for all students by default. (I will approve most other languages as well, provided they are not too obscure.)
Tools like quarto or Jupyter Notebooks will be useful for completing the Research Reports required for this course. If you have not used these previously, many useful free resources can be found online, including my STA9750 course materials.
[1] Theoretically, this may result in scores that would earn a given grade in an un-curved course receiving a lower grade in this course. In practice, the instructor will design course assessments to induce a range of scores and does not anticipate “down-curving” happening.
[2] Haoyu Chen and Jiongjiong Yang. “Multiple Exposures Enhance Both Item Memory and Contextual Memory over Time”. Frontiers in Psychology 11. November 2020. DOI:10.3389/fpsyg.2020.565169
[3] For this course, an average student is a student who enters the course with:
- Fluency with statistical and numerical software at the level of (at least) STA 9750
- Fluency with univariate and multivariate regression at the level of (at least) STA 9700
- Familiarity with probability and linear algebra
and is earning a B-range grade. If you have less background or are aiming for a higher grade, you should expect to commit proportionally more time to this course.
[4] The CUNY Graduate Center has a useful summary of these expectations. Baruch courses follow the same standards.
[5] As a student, you have free access to GitHub Copilot once you create a student GitHub account and register for the Student Developer Pack.