Live Session
1
Probability Spaces
1.1
Learning Objectives
1.2
Course Learning Objectives
1.2.1
Understand the building blocks of probability theory that prepare learners for the study of statistical models
1.2.2
Understand and apply statistical models in common situations
1.2.3
Analyze a research question using a linear regression framework
1.2.4
Interpret the results of a model and communicate them in a manner appropriate to the audience
1.2.5
Contribute proficient, basic work to a modern data science team, using industry-standard tools and coding practices
1.3
Introductions
1.3.1
Instructor Introductions
1.3.2
What does a statistician look like? You!
1.4
Student Introductions [Breakout One]
1.5
Student Introductions [Breakout Two]
1.6
Probability Theory
1.7
Axiomatic Probability
1.8
Definition vs. Theorem
1.9
Working with a Sample Space
1.9.1
Working with a Sample Space, Part I
1.9.2
Working with a Sample Space, Part II
1.10
Independence
1.11
A practice problem
1.12
Student Tasks to Complete
2
Defining Random Variables
2.1
Learning Objectives
2.2
Introduction to the Materials
2.3
Class Announcements
Homework
Study Groups
Course Resources
2.4
Using Definitions of Random Variables
2.4.1
Random Variable
2.5
Pieces of a Random Variable
2.5.1
Functions of Functions
2.5.2
Probability Density Functions and Cumulative Distribution Functions
2.6
Discrete & Continuous Random Variables
2.7
Moving Between PDF and CDF
2.8
Joint Density
2.8.1
Example: Uniform Joint Density
2.8.2
Examples: Thinking Through Many Plots
2.8.3
Triangle Math
2.8.4
Saddle Sores
2.9
Computing Different Distributions
2.10
Conditional Probability
2.11
Visualizing Distributions Via Simulation
2.11.1
Example: The Uniform Distribution
2.11.2
Example: The Normal Distribution
2.12
Review of Terms
3
Summarizing Distributions
3.1
Learning Objectives
3.2
Class Announcements
3.2.1
What is in the rearview mirror?
3.2.2
Today’s Lesson
3.2.3
Future Attractions
3.3
Discussion of Terms
3.3.1
Expected Value
3.5
Computing Examples
3.5.1
Expected Value of Education [discrete random variable]
3.5.2
Using a formula
3.6
Computing by Hand
3.6.1
Compute the Expected Value
3.6.2
Playing a Gnome Game, Part 1
3.6.3
Compute the Variance
3.6.4
Playing a Gnome Game, Part 2
3.7
Expected Value by Code
3.7.1
Expected Value of a Six-Sided Die
3.7.2
Variance of a Six-Sided Die
3.8
Practice Computing
3.8.1
Single Variable
3.8.2
Joint Density
3.9
Write Code
4
Conditional Expectation and The BLP
4.1
Thunder Struck
4.2
Learning Objectives
4.3
Class Announcements
4.3.1
Test 1 is releasing to you today.
4.4
Roadmap
4.4.1
Rearview Mirror
4.4.2
This week
4.4.3
Coming Attractions
4.5
Conditional Expectation Function (CEF)
4.5.1
Part I
4.5.2
Part II
4.5.3
Part III
4.6
Computing the CEF
4.6.1
Simple Quantities
4.6.2
Conditional Quantities
4.6.3
Conditional Expectation
4.7
Minimizing the MSE
4.7.1
Minimizing MSE
4.7.2
The pudding (aka: “Where the proof is”)
4.7.3
The Implication
4.8
Working with the BLP
4.9
Joint Distribution Practice
4.9.1
Professorial Mistakes (Discrete RVs)
4.9.2
Continuous BLP
5
Learning from Random Samples
5.1
Goals, Framework, and Learning Objectives
5.1.1
Class Announcements
5.1.2
Learning Objectives
5.1.3
Roadmap
5.2
Key Terms and Assumptions
5.2.1
IID
5.3
Estimators
5.3.1
Three properties of estimators
5.4
Estimator Property: Biased or Unbiased?
5.4.1
Is it unbiased, with data?
5.5
Estimator Property: Consistency
5.6
Understanding Sampling Distributions
5.7
Write Code to Demo the Central Limit Theorem (CLT)
5.7.1
Part 1
5.7.2
Part 2
5.7.3
Part 3
5.7.4
Discussion Questions About the CLT
5.8
Errors with Standard Errors
6
Hypothesis Testing
What is Frequentist testing doing?
6.1
Learning Objectives
6.2
Class Announcements
6.3
Roadmap
6.4
What does a hypothesis test do?
6.5
Madlib prompt
6.6
Madlib completed
6.7
“Accepting the Null”
6.8
Manually Computing a t-Test
6.9
Falling Ill (The General Form of a Hypothesis Test)
6.10
Data Exercise
6.11
Assumptions Behind the t-test
7
Comparing Two Groups
7.1
Learning Objectives
7.2
Class Announcements
7.3
Roadmap
7.3.1
Rearview Mirror
7.3.2
Today
7.3.3
Looking ahead
7.4
Teamwork Discussion
7.4.1
Working on Data Science Teams
7.4.2
The Problematic Psychology of Data Science
7.4.3
What Makes an Effective Team?
7.4.4
We All Belong
7.5
Team Kick-Off
7.6
A Quick Review
7.7
Rank Based Tests
7.8
Comparing Groups R Exercise
7.9
The Questions
7.9.1
Set 1
7.9.2
Set 2
7.9.3
Set 3
7.9.4
Apply to a New Type of Data
7.10
Simulating the Effects of Test Choices
7.10.1
Should we use a t-test or a Wilcoxon signed-rank test?
7.11
7.11.1
The Poisson Distribution
7.11.2
Write a Simulation
7.11.3
What if a distribution is much more skewed?
7.11.4
False Rejection Rates
7.11.5
What about Power to Reject?
7.11.6
Paired compared to unpaired tests
8
OLS Regression Estimates
8.1
Learning Objectives
8.2
Class Announcements
8.3
Roadmap
8.4
Discussion Questions
8.5
Best Linear Predictor and OLS Regression as a Predictor
8.6
The Regression Anatomy Formula
8.6.1
Estimate an OLS Regression
8.6.2
Regression Anatomy and Frisch-Waugh-Lovell
8.7
Coding Activity: R Cheat Sheet
8.8
R Exercise
8.8.1
Assess the Relationship between Price and Square Footage
8.9
Regression Plots and Discussion
8.9.1
Plot 1
9
OLS Regression Inference
9.1
Learning Objectives
9.2
Class Announcements
9.3
Roadmap
9.4
Uncertainty in OLS
9.4.1
Discussion Questions
9.5
Understanding Uncertainty
9.5.1
Question 1
9.5.2
Question 2
9.5.3
Question 3
9.6
Understanding Uncertainty
9.7
R Exercise
10
Descriptive Model Building
10.1
Learning Objectives
10.2
Class Announcements
10.3
Roadmap
10.4
Discussion
10.4.1
Three modes of model building
10.4.2
The statistical modeling process in different modes
10.5
R Activity: Measuring the return to education
10.5.1
Transformations
10.5.2
Estimation
11
Explanatory Model Building
11.1
Learning Objectives
11.2
Class Announcements
11.3
Roadmap
11.4
Discussion
11.4.1
Path Diagrams
11.5
An Interlude
11.5.1
Omitted Variable Bias
11.6
R Exercise
11.6.1
Omitted Variable Bias in R
11.6.2
Questions:
11.7
Research Design Strategies
11.8
Discussion
12
The Classical Linear Model
12.1
Learning Objectives
12.2
Class Announcements
12.3
Roadmap
12.4
The Classical Linear Model
12.4.1
Part 1
12.4.2
Part 2
12.4.3
Part 3
12.4.4
Part 4
12.4.5
Part 5
12.5
R Exercise
12.5.1
Questions:
13
Reproducible Research
13.1
Learning Objectives
13.2
Class Announcements
13.3
Roadmap
13.4
What data science hopes to accomplish
13.5
Learning from Data
13.6
Data Science and Statistics
13.7
Why Statistics?: A Closing Argument for Statistics
13.8
Course Goals
13.8.1
Course Section III: Purpose-Driven Models
13.8.2
Course Section II: Sampling Theory and Testing
13.8.3
Course Section I: Probability Theory
13.8.4
Statistics as a Foundation for MIDS
13.9
Reproducibility Discussion
13.9.1
Discussion
14
Maximum Likelihood Estimation
14.1
Learning Objectives
14.2
Class Announcements
14.3
Roadmap
14.4
What is a model?
14.5
Estimation
14.6
Discussion of Maximum Likelihood Estimation
14.7
Optimization in R
14.7.1
Optimization Example: Optimum Price
14.8
MLE for Poisson Random Variables
14.8.1
MLE for Poisson Random Variables: Data
14.8.2
MLE Estimation
14.9
Confidence Intervals
14.10
Maximum Likelihood Example: Printers
Appendix
Bloom’s Taxonomy
Statistics for Data Science