STA 9750 Mini-Project #01: TBD
Due Dates
- Released to Students: 2025-09-16
- Initial Submission: 2025-10-03 11:59pm ET on GitHub and Brightspace
- Peer Feedback:
  - Peer Feedback Assigned: 2025-10-06 on GitHub
  - Peer Feedback Due: 2025-10-13 11:59pm ET on GitHub
Estimated Time to Complete: 9 Hours
Estimated Time for Peer Feedback: 1 Hour
Welcome to STA 9750 Mini Projects!
In the STA 9750 Mini-Projects, you will perform basic data analyses intended to model best practices for your course final project. (Note, however, that these are mini-projects; your final course project is expected to be far more extensive than any single MP.)
For purposes of MPs, we are dividing the basic data analytic workflow into several major stages:
- Data Ingest and Cleaning: Given a single data source, read it into `R` and transform it to a reasonably useful standardized format.
- Data Combination and Alignment: Combine multiple data sources to enable insights not possible from a single source.
- Descriptive Statistical Analysis: Take a data table and compute informative summary statistics from both the entire population and relevant subgroups.
- Data Visualization: Generate insightful data visualizations to spur insights not attainable from point statistics.
- Inferential Statistical Analysis and Modeling: Develop relevant predictive models and statistical analyses to generate insights about the underlying population and not simply the data at hand.
In this course, our primary focus is on the first four stages: you will take other courses that develop analytical and modeling techniques for a variety of data types. As we progress through the course, you will eventually be responsible for the first four steps. Specifically, you are responsible for the following stages of each mini-project:
Mini-Project | Ingest and Cleaning | Combination and Alignment | Descriptive Statistical Analysis | Visualization |
---|---|---|---|---|
Mini-Project #01 | | | ✓ | |
Mini-Project #02 | | ✓ | ✓ | ½ |
Mini-Project #03 | ½ | ✓ | ✓ | ✓ |
Mini-Project #04 | ✓ | ✓ | ✓ | ✓ |
In early stages of the course, such as this MP, I will ‘scaffold’ much of the analysis for you, leaving only those stages we have discussed in class for you to fill in. As the course progresses, the mini-projects will be more self-directed and results less standardized.
Rubric
STA 9750 Mini-Projects are evaluated using peer grading with meta-review by the course staff. Specifically, variants of the following rubric will be used for the mini-projects:
Course Element | Excellent (9-10) | Great (7-8) | Good (5-6) | Adequate (3-4) | Needs Improvement (1-2) | Extra Credit |
---|---|---|---|---|---|---|
Written Communication | Report is well-written and flows naturally. Motivation for key steps is clearly explained to reader without excessive detail. Key findings are highlighted and appropriately given context. | Report has no grammatical or writing issues. Writing is accessible and flows naturally. Key findings are highlighted, but lack suitable motivation and context. | Report has no grammatical or writing issues. Key findings are present but insufficiently highlighted. | Writing is intelligible, but has some grammatical errors. Key findings are obscured. | Report exhibits significant weakness in written communication. Key points are difficult to discern. | Report includes extra context beyond instructor-provided information. |
Project Skeleton | Code completes all instructor-provided tasks correctly. | Response to one instructor-provided task is skipped, incorrect, or otherwise incomplete. | Responses to two instructor-provided tasks are skipped, incorrect, or otherwise incomplete. | Responses to three instructor-provided tasks are skipped, incorrect, or otherwise incomplete. | Fewer than half of the instructor-provided tasks were successfully completed. | Report exhibits particularly creative insights beyond instructor specifications. |
Formatting & Display | Tables have well-formatted column names, suitable numbers of digits, and attractive presentation. Table has a suitable caption. | Column names and digits are well-chosen, but formatting could be improved. | Bad column names (opaque variable names or other undefined acronyms). | Unfiltered ‘data dump’ instead of curated table. | No tables. | Report includes one or more high-quality graphics (created using `R`). |
Code Quality | Code is (near) flawless. Code passes all | Comments give context of the analysis, not simply defining functions used in a particular line. | Code has well-chosen variable names and basic comments. | Code executes properly, but is difficult to read. | Code fails to execute properly. | Code takes advantage of advanced Quarto features to improve presentation of results. |
Data Preparation | Automatic (10/10). Out of scope for this mini-project. | | | | | Report modifies instructor-provided import code to use additional columns or data sources in a way that creates novel insights. |
Note that this rubric is designed with copious opportunities for extra credit if students go above and beyond the instructor-provided scaffolding. Students pursuing careers in data analytics are strongly encouraged to go beyond the strict ambit of the mini-projects to i) further refine their skills; ii) learn additional techniques that can be used in the final course project; and iii) develop a more impressive professional portfolio.
Because students are encouraged to use STA 9750 mini-projects as the basis for a professional portfolio, the basic skeleton of each project will be released under a fairly permissive usage license. Take advantage of it!
Submission Instructions
After completing the analysis, write up your findings, showing all of your code, using a dynamic `quarto` document and post it to your course repository. The `qmd` file should be named `mp01.qmd` so the rendered document can be found at `docs/mp01.html` in your repository and served at the URL [1]:
https://<GITHUB_ID>.github.io/STA9750-2025-FALL/mp01.html
Once you confirm this website works (replacing `<GITHUB_ID>` with the GitHub username you provided to the professor in MP#00, of course), open a new issue at https://github.com/michaelweylandt/STA9750-2025-FALL/issues/new.
Title the issue `STA 9750 <GITHUB_ID> MiniProject #01` and fill in the following text for the issue:
Hi @michaelweylandt!
I've uploaded my work for MiniProject #**01** - check it out!
https://<GITHUB_ID>.github.io/STA9750-2025-FALL/mp01.html
Once the submission deadline passes, the instructor will tag classmates for peer feedback in this issue thread.
Additionally, a PDF export of this report should be submitted on Brightspace. To create a PDF from the uploaded report, simply use your browser’s ‘Print to PDF’ functionality.
NB: The analysis outline below specifies key tasks you need to perform within your write up. Your peer evaluators will check that you complete these. You are encouraged to do extra analysis, but the bolded Tasks are mandatory.
NB: Your final submission should look like a report, not simply a list of facts answering questions. Add introductions, conclusions, and your own commentary. You should be practicing both raw coding skills and written communication in all mini-projects. There is little value in data points stated without context or motivation.
Mini-Project #01: TBD
In this mini-project, you will:
- TBD
- Practice Use of `dplyr` for analysis of tabular data
- Practice Use of `quarto` and Reproducible Research Tools for Effective Communication of Data Analysis Results
Recall that you are evaluated on writing and communication in these Mini-Projects. You are required to write a report in the prescribed style, here an internal policy briefing ‘white paper’. A submission that performs the instructor-specified tasks but does not provide appropriate context and commentary will score very poorly on the relevant rubric elements.
In particular, if a submission is not in “white paper” style, peer evaluators should judge it to have “Good” quality Written Communication (at best) as key findings are not conveyed appropriately.
Quarto’s code folding functionality is useful for “hiding” code so that it doesn’t break the flow of your writing.
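As a minimal sketch, code folding can be switched on in the YAML header at the top of `mp01.qmd`. The `code-fold` and `code-summary` keys are standard Quarto HTML options; the title and the toggle label below are placeholders, and your course repository may already set some of these options in a project-level file.

```yaml
---
title: "Mini-Project #01"          # placeholder title (assumption)
format:
  html:
    code-fold: true                # collapse each code block behind a toggle
    code-summary: "Show the code"  # label shown on the toggle (any text works)
---
```

If a particular chunk should stay visible, adding `#| code-fold: false` at the top of that chunk should override the document-wide setting.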
You can also make use of Quarto’s `contents` shortcode to present code and findings in an order other than how the code should be executed. This is particularly useful if you want to include a figure or table in an “Executive Summary” at the top of your submission.
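A minimal sketch of the `contents` shortcode, assuming the chunk is wrapped in a div whose ID is referenced by the shortcode. The ID `summary-stats` is a hypothetical name and `mtcars` is placeholder data, not the project data set. When rendering, Quarto should move the div's content to the position of the shortcode, so the code itself can stay in execution order further down the file.

````markdown
## Executive Summary

{{< contents summary-stats >}}

## Analysis

::: {#summary-stats}

```{r}
# Placeholder analysis: average fuel economy by cylinder count
library(dplyr)

mtcars |>
  group_by(cyl) |>
  summarize(mean_mpg = mean(mpg))
```

:::
````

Combined with code folding, this keeps the summary at the top readable while the full code remains available to peer evaluators.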
Extra Credit Opportunities
This work ©2025 by Michael Weylandt is licensed under a Creative Commons BY-NC-SA 4.0 license.
Footnotes
[1] Throughout this section, replace `<GITHUB_ID>` with your GitHub ID from Mini-Project #00, making sure to remove the angle brackets. Note that the automated course infrastructure will be looking for precise formatting, so follow these instructions closely.