Dissertation

Better Predictions when Models are Wrong or Underspecified

Promotor: P.D. Grünwald

Author: Matthijs van Ommen
Date: 10 June 2015
Links: Thesis in Leiden Repository

Many statistical methods rely on models of reality in order to learn from data and to make predictions about future data. By necessity, these models usually do not match reality exactly, but are either wrong (none of the hypotheses in the model provides an accurate description of reality) or underspecified (the hypotheses in the model describe only part of the data). In this thesis, we discuss three scenarios involving models that are wrong or underspecified. In each case, we find that standard statistical methods may fail, sometimes dramatically, and we present alternative methods that continue to perform well even if the models are wrong or underspecified. The first two of these scenarios involve regression problems and investigate AIC (Akaike's Information Criterion) and Bayesian statistics. The third scenario has the famous Monty Hall problem as a special case, and considers the question of how we can update our belief about an unknown outcome given new evidence when the precise relation between outcome and evidence is unknown.
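
As an illustration of the third scenario (not taken from the thesis itself), the following minimal sketch works through the textbook Monty Hall setup in which the contestant picks door 1 and the host then opens door 3 to reveal a goat. The function name `posterior_car_behind_pick` and the parameter `q` are hypothetical, introduced only for this example: `q` is the assumed probability that the host opens door 3 when the car is behind door 1 and he is free to open either of the other two doors. Since `q` is precisely the unknown relation between outcome and evidence, the sketch shows how the updated belief changes with it.

```python
from fractions import Fraction


def posterior_car_behind_pick(q):
    """Return P(car behind door 1 | contestant picked door 1, host opened door 3).

    q is a hypothetical parameter: the probability that the host opens door 3
    when the car is behind door 1 (the only case in which he has a choice).
    """
    prior = Fraction(1, 3)  # car is equally likely to be behind each door
    # Probability that the host opens door 3, for each possible car location:
    #   car behind 1: host may open door 2 or 3, opens 3 with probability q
    #   car behind 2: host must open door 3
    #   car behind 3: host never opens door 3
    likelihood = {1: Fraction(q), 2: Fraction(1), 3: Fraction(0)}
    evidence = sum(prior * likelihood[door] for door in (1, 2, 3))
    return prior * likelihood[1] / evidence  # Bayes' rule


# Assumed example values for the host's unknown strategy.
for q in (Fraction(0), Fraction(1, 2), Fraction(1)):
    print(f"q = {q}: P(car behind own door) = {posterior_car_behind_pick(q)}")
```

Under the usual assumption that the host chooses uniformly (`q = 1/2`), the posterior is 1/3, which gives the familiar advice to switch. For other values of `q` the posterior ranges from 0 to 1/2, so the data alone do not determine a single conditional probability unless the host's strategy is known.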