Class 7 - Model Comparison
Zoom: 12:00pm - 2:45pm, February 28, 2022
Required Readings
Chapter 7: Ulysses' Compass
Chapter 9: Markov Chain Monte Carlo
Optional Readings
Chapter 8: Conditional Manatees
Lecture
Lecture 7
Lecture 8
Comprehension questions
What are three ways in which cross-validation and information theory aid in model evaluation?
They provide useful expectations of predictive accuracy, rather than merely fit to the sample, so they compare models on what matters: out-of-sample prediction.
They give us an estimate of a model's tendency to overfit. This helps us understand how models and data interact, which in turn helps us design better models (see the sketch after this list).
They help us to spot highly influential observations.
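The first two points are easier to see with a toy example. Below is a minimal sketch in Python with NumPy (not taken from the course materials, which use R); the data-generating process, polynomial degrees, and fold count are arbitrary illustrative choices. In-sample error can only improve as the model grows more flexible, while cross-validation estimates how well each model predicts data it was not fit to, which is where overfitting typically shows up.

```python
# Toy illustration: in-sample error always improves with model flexibility,
# while 5-fold cross-validation estimates out-of-sample predictive error.
import numpy as np

rng = np.random.default_rng(7)
n = 50
x = rng.uniform(-2, 2, n)
y = 0.5 * x + rng.normal(0, 1, n)          # the true relationship is linear

def design(x, degree):
    """Design matrix with columns 1, x, x^2, ..., x^degree."""
    return np.vander(x, degree + 1, increasing=True)

def mse(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)

folds = np.array_split(rng.permutation(n), 5)   # same folds for every model

for degree in (1, 3, 7):
    # In-sample fit: train and score on all of the data.
    beta = np.linalg.lstsq(design(x, degree), y, rcond=None)[0]
    in_sample = mse(y, design(x, degree) @ beta)

    # Cross-validation: fit on four folds, score on the held-out fold.
    cv = []
    for held_out in folds:
        train = np.setdiff1d(np.arange(n), held_out)
        b = np.linalg.lstsq(design(x[train], degree), y[train], rcond=None)[0]
        cv.append(mse(y[held_out], design(x[held_out], degree) @ b))

    print(f"degree {degree}: in-sample MSE {in_sample:.3f}, "
          f"cross-validated MSE {np.mean(cv):.3f}")
```

Information criteria such as WAIC or PSIS-LOO, covered in Chapter 7, aim to estimate the same out-of-sample quantity without actually refitting the model to multiple folds.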
Compare and contrast the four ways to calculate posteriors covered in this class.
Analytical approach: a mathematical approach that relies on closed-form (i.e., pure math) solutions. These are exact but apply only in limited circumstances, such as conjugate prior-likelihood pairs (e.g., the beta-binomial).
Grid approximation: a very limited approach that relies on brute-force counting, evaluating prior times likelihood over a grid of candidate parameter values. It is helpful for simple models (e.g., a single parameter) but becomes computationally intractable with multiple parameters.
Quadratic approximation: Laplace's approximation, which assumes the posterior is approximately Gaussian (normal) near its peak. It is fast and works well for simple to moderately complex models, but it begins to have trouble with more complex models (e.g., multilevel models).
Markov chain Monte Carlo: a family of approaches (e.g., Metropolis, Hamiltonian Monte Carlo) that draws samples from the posterior distribution rather than approximating its shape directly. Depending on the variant, it scales well to many dimensions and carries useful mathematical guarantees (e.g., in the long run, parameter values are visited in proportion to their posterior probability). It is the engine behind Stan and other modern PPLs. (A minimal sketch of all four approaches follows this list.)
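For concreteness, here is a minimal sketch (in Python with NumPy/SciPy rather than the course's R and Stan) that applies all four approaches to the same toy problem: estimating a probability after observing 6 successes in 9 trials under a flat prior. The specific data, prior, proposal scale, and number of draws are illustrative assumptions, not course material; the point is that the four methods trade exactness for generality in different ways while agreeing closely on this simple problem.

```python
# Four ways to get the posterior for p given 6 successes in 9 trials
# with a flat Beta(1, 1) prior.
import numpy as np
from scipy import stats

successes, trials = 6, 9

# 1. Analytical: the Beta prior is conjugate to the binomial likelihood,
#    so the posterior is Beta(1 + successes, 1 + trials - successes).
analytic = stats.beta(1 + successes, 1 + trials - successes)

# 2. Grid approximation: brute-force evaluation of prior * likelihood
#    over a grid of candidate values, then normalize.
grid = np.linspace(0, 1, 1000)
unnorm = stats.binom.pmf(successes, trials, grid) * stats.beta.pdf(grid, 1, 1)
grid_posterior = unnorm / unnorm.sum()

# 3. Quadratic (Laplace) approximation: a Gaussian centered at the posterior
#    mode, with spread taken from the curvature of the log posterior there
#    (estimated here with a numerical second difference).
def log_post(p):
    return stats.binom.logpmf(successes, trials, p) + stats.beta.logpdf(p, 1, 1)

mode = grid[np.argmax(unnorm)]
h = 1e-4
curvature = (log_post(mode + h) - 2 * log_post(mode) + log_post(mode - h)) / h**2
quad = stats.norm(mode, np.sqrt(-1 / curvature))

# 4. Markov chain Monte Carlo: a bare-bones Metropolis sampler. In the long
#    run, values of p are visited in proportion to their posterior
#    probability, so the histogram of draws approximates the posterior.
rng = np.random.default_rng(1)
draws, current = [], 0.5
for _ in range(20_000):
    proposal = current + rng.normal(0, 0.1)
    if 0 < proposal < 1 and np.log(rng.uniform()) < log_post(proposal) - log_post(current):
        current = proposal
    draws.append(current)

print("analytical mean:", analytic.mean())
print("grid mean:      ", np.sum(grid * grid_posterior))
print("quadratic mean: ", quad.mean())
print("MCMC mean:      ", np.mean(draws[1000:]))
```

Stan replaces the hand-rolled Metropolis step above with Hamiltonian Monte Carlo, which uses gradient information to scale to models with many parameters.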
Deliverables
Due before class: Monday, February 28 at 11:59am
Mid-Semester Feedback
Optional mid-semester course feedback
Lab for Class 7
Problem Set 5
Due by next class: Monday, March 14 at 11:59am