Class 7 - Model Comparison
Zoom: 12:00pm - 2:45pm, February 28, 2022
Required Readings
Chapter 7: Ulysses' Compass
Chapter 9: Markov Chain Monte Carlo
Optional Readings
Chapter 8: Conditional Manatees
Lecture
Lecture 7
Lecture 8
Comprehension questions
What are three ways in which cross-validation and information theory aid in model evaluation?
They provide useful expectations of predictive accuracy, rather than merely fit to the sample, so they compare models on what matters: out-of-sample prediction.
They give us an estimate of a model's tendency to overfit. This helps us understand how models and data interact, which in turn helps us design better models (see the sketch after this list).
They help us to spot highly influential observations.
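The first two points are easier to see with a toy example. Below is a minimal sketch in Python with NumPy (not taken from the course materials, which use R); the data-generating process, polynomial degrees, and fold count are arbitrary illustrative choices. In-sample error can only improve as the model grows more flexible, while cross-validation estimates how well each model predicts data it was not fit to, which is where overfitting typically shows up.

```python
# Toy illustration: in-sample error always improves with model flexibility,
# while 5-fold cross-validation estimates out-of-sample predictive error.
import numpy as np

rng = np.random.default_rng(7)
n = 50
x = rng.uniform(-2, 2, n)
y = 0.5 * x + rng.normal(0, 1, n)          # the true relationship is linear

def design(x, degree):
    """Design matrix with columns 1, x, x^2, ..., x^degree."""
    return np.vander(x, degree + 1, increasing=True)

def mse(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)

folds = np.array_split(rng.permutation(n), 5)   # same folds for every model

for degree in (1, 3, 7):
    # In-sample fit: train and score on all of the data.
    beta = np.linalg.lstsq(design(x, degree), y, rcond=None)[0]
    in_sample = mse(y, design(x, degree) @ beta)

    # Cross-validation: fit on four folds, score on the held-out fold.
    cv = []
    for held_out in folds:
        train = np.setdiff1d(np.arange(n), held_out)
        b = np.linalg.lstsq(design(x[train], degree), y[train], rcond=None)[0]
        cv.append(mse(y[held_out], design(x[held_out], degree) @ b))

    print(f"degree {degree}: in-sample MSE {in_sample:.3f}, "
          f"cross-validated MSE {np.mean(cv):.3f}")
```

Information criteria such as WAIC or PSIS-LOO, covered in Chapter 7, aim to estimate the same out-of-sample quantity without actually refitting the model to multiple folds.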
Compare and contrast the four ways to calculate posteriors covered in this class.
Analytical approach: a mathematical approach that relies on closed-form (i.e., pure math) solutions. These are exact but apply only in limited circumstances, such as conjugate prior-likelihood pairs (e.g., the beta-binomial).
Grid approximation: a very limited approach that relies on brute-force counting, evaluating prior times likelihood over a grid of candidate parameter values. It is helpful for simple models (e.g., a single parameter) but becomes computationally intractable with multiple parameters.
Quadratic approximation: Laplace's approximation, which assumes the posterior is approximately Gaussian (normal) near its peak. It is fast and works well for simple to moderately complex models, but it begins to have trouble with more complex models (e.g., multilevel models).
Markov chain Monte Carlo: a family of approaches (e.g., Metropolis, Hamiltonian Monte Carlo) that draws samples from the posterior distribution rather than approximating its shape directly. Depending on the variant, it scales well to many dimensions and carries useful mathematical guarantees (e.g., in the long run, parameter values are visited in proportion to their posterior probability). It is the engine behind Stan and other modern PPLs. (A minimal sketch of all four approaches follows this list.)
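For concreteness, here is a minimal sketch (in Python with NumPy/SciPy rather than the course's R and Stan) that applies all four approaches to the same toy problem: estimating a probability after observing 6 successes in 9 trials under a flat prior. The specific data, prior, proposal scale, and number of draws are illustrative assumptions, not course material; the point is that the four methods trade exactness for generality in different ways while agreeing closely on this simple problem.

```python
# Four ways to get the posterior for p given 6 successes in 9 trials
# with a flat Beta(1, 1) prior.
import numpy as np
from scipy import stats

successes, trials = 6, 9

# 1. Analytical: the Beta prior is conjugate to the binomial likelihood,
#    so the posterior is Beta(1 + successes, 1 + trials - successes).
analytic = stats.beta(1 + successes, 1 + trials - successes)

# 2. Grid approximation: brute-force evaluation of prior * likelihood
#    over a grid of candidate values, then normalize.
grid = np.linspace(0, 1, 1000)
unnorm = stats.binom.pmf(successes, trials, grid) * stats.beta.pdf(grid, 1, 1)
grid_posterior = unnorm / unnorm.sum()

# 3. Quadratic (Laplace) approximation: a Gaussian centered at the posterior
#    mode, with spread taken from the curvature of the log posterior there
#    (estimated here with a numerical second difference).
def log_post(p):
    return stats.binom.logpmf(successes, trials, p) + stats.beta.logpdf(p, 1, 1)

mode = grid[np.argmax(unnorm)]
h = 1e-4
curvature = (log_post(mode + h) - 2 * log_post(mode) + log_post(mode - h)) / h**2
quad = stats.norm(mode, np.sqrt(-1 / curvature))

# 4. Markov chain Monte Carlo: a bare-bones Metropolis sampler. In the long
#    run, values of p are visited in proportion to their posterior
#    probability, so the histogram of draws approximates the posterior.
rng = np.random.default_rng(1)
draws, current = [], 0.5
for _ in range(20_000):
    proposal = current + rng.normal(0, 0.1)
    if 0 < proposal < 1 and np.log(rng.uniform()) < log_post(proposal) - log_post(current):
        current = proposal
    draws.append(current)

print("analytical mean:", analytic.mean())
print("grid mean:      ", np.sum(grid * grid_posterior))
print("quadratic mean: ", quad.mean())
print("MCMC mean:      ", np.mean(draws[1000:]))
```

Stan replaces the hand-rolled Metropolis step above with Hamiltonian Monte Carlo, which uses gradient information to scale to models with many parameters.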
Deliverables
Due before class: Monday, February 28 at 11:59am
Mid-Semester Feedback
Optional mid-semester course feedback
Lab for Class 7
Problem Set 5
Due by next class: Monday, March 14 at 11:59am