Software Tools for Data Analysis
STA 9750
Michael Weylandt
Week 1 – Thursday 2026-01-29
Last Updated: 2026-01-27

Today’s Logistics

Today: Lecture #01: Course Overview and Key Infrastructure

These slides can be found online at:

https://michael-weylandt.com/STA9750/slides/slides01.html

In-class activities (if any) can be found at:

https://michael-weylandt.com/STA9750/labs/lab01.html

Upcoming TODO

Upcoming student responsibilities:

Date | Time | Details
2026-02-05 | 6:00pm ET | Pre-Assignment #02 Due
2026-02-12 | 11:59pm ET | Syllabus Quiz / Verification of Enrollment
2026-02-12 | NA | Classes Cancelled (Lincoln’s Birthday)
2026-02-19 | 6:00pm ET | Pre-Assignment #03 Due
2026-02-19 | 11:59pm ET | Team Roster Submission
2026-02-20 | 11:59pm ET | Mini-Project #00 Due
2026-02-26 | 6:00pm ET | Team Contract Due
2026-02-26 | 6:00pm ET | Project Proposal Presentation Slides Due

STA 9750 Week 1

Weekly Course Updates:

  • Brief updates and reminders about course logistics
  • Syllabus and Brightspace are binding
    • If something is left out of here, it still happens!

Today: Introduction to STA 9750

STA 9750: Software Tools for Data Analysis

Course Overview

STA 9750 is:

  • Establishing a professional data analyst portfolio
  • Increasing your effectiveness in other courses
  • Building your networks with faculty and classmates
  • Getting s#!t done

Learning Objectives

Formal learning objectives can be found here

Key aims:

  • Deal with real data, no matter how messy and unhelpful
  • Engage with important technologies and learn to quickly adopt new tools
  • Communicate to both technical and non-technical audiences
  • Perform substantial and thoughtful analyses

Course Website

All course information can be found on the course website

No authentication required - mobile friendly

Source code on my GitHub

Course Website

Look online for:

Instructor

Me: Michael Weylandt

Assistant (i.e., new-ish) Professor in the Department of Information Systems and Statistics

Previously:

  • Researcher at Sandia National Labs (US Government Lab)
  • Postdoc with US Intelligence Community
  • Ph.D. in Statistics at Rice University (Houston, TX)
  • “Quant” at Morgan Stanley and a (defunct) hedge fund

See my website for more on me and my research

Course Schedule

Lecture / Lab sessions:

  • Thursdays on Zoom (here!) at 6:05pm
    • Rescheduled class on Tuesday 2026-04-21 (“Conversion Day”)
  • 14 weeks. Holidays on Feb 12 + Apr 02 + Apr 09
  • No penalty for missing class.
    • Slides and in-class activities online
    • Lectures generally not posted

Course Schedule

Presentation days:

  • Student Presentations on Feb 26 + Mar 26 + May 07
  • First two can be given by one presenter per team, so the only required day is the final presentation
  • Order will be randomly selected by instructor
    • I expect you to be able to attend all of class on presentation day

Getting Help

Online Office Hours:

  • Before class (on days we have class) at 5:00pm
  • Different Zoom Link

In-person Office Hours:

  • Wednesdays at 5pm (in weeks we have class)

Asynchronous

  • Piazza course discussion platform

DataCamp

This semester, I am providing free access to DataCamp courses

  • 100% optional
  • Additional exercises and review of class material
  • Access through sign-up link on Brightspace

Restricted to @baruchmail.cuny.edu emails - message me on Piazza if you have a different CUNY email and I’ll special-case you

Course Schedule

Major Topics:

  • Web Communication and Literate Programming Technologies - 2 weeks
  • Introduction to R - 1 week
  • Tidy Data Manipulation (SQL-type operations) - 2 weeks
  • Plotting and Data Visualization - 2 weeks
  • Data Import, Web Scraping, and Text Cleaning - 3 weeks
  • Statistical Analysis - 1 week

Grading

  • 24% Pre-Assignments
  • 30% Mini-Projects
  • 46% Course Project

Extra credit for contribution to course materials or ‘above and beyond’ on Piazza

Final grades are curved to match Zicklin School of Business (ZSB) grading guidelines
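
To make the weights above concrete, here is a minimal sketch of the raw (pre-curve) weighted score, written in Python purely for illustration. The component scores are made up, the function name is hypothetical, and extra credit and the end-of-semester curve are not modeled.

```python
# Course weights as listed on the Grading slide (percent of final grade).
WEIGHTS = {"pre_assignments": 24, "mini_projects": 30, "course_project": 46}


def raw_course_score(component_pcts):
    """Weighted average of component scores, each given on a 0-100 scale."""
    # Sanity check: the three weights should cover the whole grade.
    assert sum(WEIGHTS.values()) == 100
    return sum(WEIGHTS[name] * component_pcts[name] for name in WEIGHTS) / 100


# Hypothetical student: these component scores are invented for illustration.
example = {"pre_assignments": 100, "mini_projects": 90, "course_project": 85}
print(raw_course_score(example))
```

Because the course project carries the largest weight, a few points there move the raw score more than the same number of points on either other component.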

Course Policies

See syllabus for fine print:

  • Regrading: Must request within 48 hours; total regrade of assignment
  • Late work: Not accepted without prior permission or a Dean of Students (DoS) letter after the fact
  • Grace period: two days on mini-project initial submission, none otherwise
  • Grade Curve: One at the end of the semester
  • External Resources / AI: Free to use for coding only, not writing
  • Absences: No excuse required, but must attend on presentation days
  • Accommodations: ADA (SDS) or Religious (direct with instructor)

Pre-Assignments

Pre-Assignments:

  • Weekly reading before class followed by a short quiz
  • Introduces new material + helps me know where folks are confused
  • Starts next week
  • Every week except three presentation weeks

Pre-Assignments

Pre-Assignments:

  • 24% of Overall Grade
  • Mix of new material and review questions
  • Lowest 2 dropped (best 8 of 10)
  • Take as many times as needed to get 100% (30/30)
  • Brightspace configuration is fragile, so post on Piazza if you run into issues
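
A rough sketch of how "best 8 of 10" plays out, in Python for illustration only. It assumes each quiz is out of 30 points and that a missed quiz counts as 0; the function name is hypothetical and the actual Brightspace calculation may differ in its details.

```python
def pre_assignment_average(scores, keep=8, max_points=30):
    """Percentage score over the best `keep` quizzes (lowest ones dropped)."""
    best = sorted(scores, reverse=True)[:keep]
    return 100 * sum(best) / (keep * max_points)


# Hypothetical scores out of 30 for the ten quizzes; two were missed entirely.
scores = [30, 30, 28, 30, 0, 30, 27, 30, 0, 30]
print(pre_assignment_average(scores))
```

In this made-up example the two missed quizzes are the ones dropped, so they do not pull the average down at all.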

Mini-Projects

Traditional homework assignments - “mini-projects”

  • Guided real-data analysis using course technologies
  • Increasingly ambitious over the semester
  • Practice real data analysis
  • Communication and coding
  • Professional portfolio
  • One ungraded ‘set-up’ (“MP#00”) + best 3 of 4 graded

More later in the course

Mini-Projects

Due Dates:

  • MP#00: 2026-02-20 at 11:59pm ET (22 days from today)
  • MP#01: 2026-03-13 at 11:59pm ET (21 days)
  • MP#02: 2026-04-03 at 11:59pm ET (21 days)
  • MP#03: 2026-04-24 at 11:59pm ET (21 days)
  • MP#04: 2026-05-15 at 11:59pm ET (21 days)

These take time - “due date” ≠ “do date”
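
The day counts above can be double-checked with a few lines of date arithmetic. This is a small Python sketch for illustration, using the lecture date (2026-01-29) as "today" and measuring each later gap from the previous due date.

```python
from datetime import date

lecture = date(2026, 1, 29)  # "today" on these slides
due_dates = {
    "MP#00": date(2026, 2, 20),
    "MP#01": date(2026, 3, 13),
    "MP#02": date(2026, 4, 3),
    "MP#03": date(2026, 4, 24),
    "MP#04": date(2026, 5, 15),
}

# Days from the previous milestone (the lecture, then each prior due date).
gaps = {}
previous = lecture
for name, due in due_dates.items():
    gaps[name] = (due - previous).days
    previous = due

print(gaps)
```

Three weeks per project sounds generous, but each new mini-project window opens the moment the previous one closes.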

Mini-Projects

Suggestions welcome - see Archives for previous semesters

Mini-Projects

100 points per (graded) MP:

  • 80 points for submission
  • 20 points for peer review

Important to learn to read and write code

  • Single-blind peer review process
  • Grade from instructor on quality of comments

You can’t do peer review if you don’t submit (auto 0)

Best 3 of 4 (total score out of 100) used

Mini-Projects

A mix of ‘warm-up’ activities + final deliverable analysis:

Each project is graded on an 8-part rubric:

  • Project Skeleton
  • Written Communication
  • Tables & Presentation
  • Data Visualization
  • Exploratory Data Analysis
  • Code Quality
  • Data Preparation
  • Analysis and Findings

For earlier projects, automatic 10/10 (if you do the bare minimum) on some elements

Course Project

In lieu of exams, semester-long group project

  • Three presentations + two reports
    • Project Proposal
    • Mid-Semester Check-In
    • Final Presentation
    • Individual Final Report
    • Group Final Report

More online

Course Project

46% of grade:

  • Blend of individual (20%) and group (26%) grades
  • Timeline moved up a bit to give more end of semester breathing room
  • Start looking for teammates: form team by Feb 19
    • Proposal presentation and team contract due Feb 26
  • First two presentations are mainly about making sure project is well-posed
    • Analysis due in final presentation + individual reports

Workload Expectations

Per federal and state requirements, 9 hours weekly = 3 hours in class + 6 hours outside of class

  • 10 hours of Pre-Assignments
  • 5 hours of Post-Class Review
  • 45 hours of Mini-Projects (homework)
  • 30 hours of Course Project

Note that course project = sum of all mini-projects - be ambitious!
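
As a quick sanity check on the hour budget above, here is a short Python sketch (for illustration only). The slides do not say how many weeks the budget is spread over, so the implied week count below is an inference, not a stated policy.

```python
HOURS_OUTSIDE_CLASS_PER_WEEK = 6  # from the workload slide

# Semester-long hour budget as listed on the slide.
hours = {
    "Pre-Assignments": 10,
    "Post-Class Review": 5,
    "Mini-Projects": 45,
    "Course Project": 30,
}

total = sum(hours.values())
# Inferred, not stated on the slide: weeks needed at 6 hours per week.
implied_weeks = total / HOURS_OUTSIDE_CLASS_PER_WEEK

print(total, implied_weeks)
```

Mini-projects and the course project together account for most of the out-of-class time, which is why starting early on both matters.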

Care Resources

See Care Resources for Students for helpful resources:

  • Mental health support
  • Physical health / medical care
  • Food Security
  • Financial Security
  • Immigration Support

Any questions?

Syllabus Quiz

“Syllabus quiz”:

  • Due Thu Feb 12
  • Brightspace
  • Unlimited attempts
  • Unlocks all subsequent assignments, so do it early

Getting Started with R and RStudio

Lab 01

Open Lab #01 and follow the instructions to get started with R and RStudio


Random assignment to Zoom breakout rooms

I will visit rooms to provide support


I will call everyone back at 8:30pm

Mini-Project #00

You can now start thinking about Mini-Project #00

  • Create an account on GitHub + GitHub student developer pack
  • Set up your own website using Quarto (like my course site)
    • Future Mini-Projects submitted here
  • ‘Road test’ to make sure everything is working correctly
  • Register on course discussion board (Piazza)

Due 2026-02-20 at 11:59pm ET

Wrap-Up

Course Agenda

  • Communicating Results (quarto) ⬅️
  • R Basics
  • Data Manipulation in R
  • Data Visualization in R
  • Getting Data into R
  • Statistical Modeling in R

Course Introduction

  • All materials on course website
  • Review of course structure and key policies
  • More details about mini-projects and course project to follow

Getting Started with R

  • Install R, RStudio, git, and quarto
  • Run a basic bit of R code to confirm things work well

Next Week

Syllabus Quiz on Brightspace

Writing documents using Markdown and Quarto

  • First Pre-Assignment due 2026-02-05 at 6:00pm ET
    1. Online reading
    2. Submit quiz on Brightspace

Life Tip of the Week

Weekly feature: “Life Tip of the Week”

  • Advice about Baruch, finances, law, etc.

Getting the most out of your time here

Life Tip of the Week

Office Hours

  • What: Time set aside by Faculty for “drop-in” student interactions

  • Why: Homework help, review, diving deeper, chit-chat, connections to other courses, career advice - anything you want!

  • Where/When: Before class on Zoom

Build relationships with professors before you ask for things!

Happy to just ‘hang out’ but I will prioritize course related questions

Musical Treat