Cross Validation
1. Motivation
- We want to know how our model will generalize
- We want to make sure that our results aren't simply the result of getting lucky with a particular train/val/test split
- See also: bootstrapping (statistics)
- Related: jackknifing
2. Procedure
- Split the data into \(k\) groups.
- For each group, hold it out as a validation set and train on the remaining \(k-1\) groups. Evaluate on the held-out group and record the results.
- Summarize the \(k\) evaluation results, e.g. with their mean and standard deviation (see the sketch after this list)
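A minimal sketch of this procedure, assuming a scikit-learn-style workflow (the `KFold` splitter, `Ridge` estimator, and toy data are illustrative choices, not from the notes):

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score

# toy data; any (X, y) and any scikit-learn-style estimator would do
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = X @ rng.normal(size=10) + rng.normal(scale=0.5, size=200)

kf = KFold(n_splits=5, shuffle=True, random_state=0)
scores = []
for train_idx, val_idx in kf.split(X):
    model = Ridge(alpha=1.0)
    model.fit(X[train_idx], y[train_idx])        # train on the other k-1 groups
    preds = model.predict(X[val_idx])            # evaluate on the held-out group
    scores.append(r2_score(y[val_idx], preds))   # record the evaluation result

# summarize the k evaluation results
print(f"mean R^2 = {np.mean(scores):.3f} +/- {np.std(scores):.3f}")
```

Each iteration fits a fresh model, so nothing from the held-out group leaks into training.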
3. Types of cross-validation
- exhaustive – every possible division of the dataset into train/val (for a given val-set size) is considered
- leave-p-out – every possible division with a validation set of size \(p\); there are \(\binom{n}{p}\) splits to consider
- leave-one-out – leave-p-out with \(p=1\), giving \(n\) splits
- k-fold – partition the dataset into \(k\) pieces; in turn, treat each piece as the val split and train on the rest
- stratified k-fold – k-fold, but make sure that each val split has the same proportion of each target label (the train split is the complement of the val split, so it preserves the proportions too); see the splitter sketch after this list
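A sketch of the corresponding scikit-learn splitters (the toy data, split counts, and fold sizes are illustrative assumptions):

```python
import numpy as np
from sklearn.model_selection import LeavePOut, LeaveOneOut, KFold, StratifiedKFold

X = np.arange(24).reshape(12, 2)        # 12 samples, 2 features
y = np.array([0] * 8 + [1] * 4)         # imbalanced labels, to show stratification

splitters = {
    "leave-2-out":        LeavePOut(p=2),                                        # C(12, 2) = 66 splits
    "leave-one-out":      LeaveOneOut(),                                         # 12 splits
    "4-fold":             KFold(n_splits=4, shuffle=True, random_state=0),       # 4 splits
    "stratified 4-fold":  StratifiedKFold(n_splits=4, shuffle=True, random_state=0),
}

for name, cv in splitters.items():
    print(f"{name}: {cv.get_n_splits(X, y)} splits")
    # StratifiedKFold needs y so each val fold keeps the 2:1 label ratio
    for train_idx, val_idx in cv.split(X, y):
        pass  # each iteration yields one train/val division
```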
4. Caveat from the CBMM tutorial (Colin Conwell)
- don't compute a cross-validated summary statistic per fold and then aggregate across folds
- each fold's val set has few data points, so the per-fold statistic is noisy – much higher chance that your explained variance comes out small
- instead, save the predictions for each fold, concatenate them, and then compute a single summary statistic, e.g. a fit statistic such as explained variance, on the entire dataset (see the sketch below)
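A sketch contrasting the two approaches – averaging per-fold statistics vs. pooling out-of-fold predictions – with toy data and a ridge model as illustrative assumptions (not from the tutorial itself):

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
y = X[:, 0] + rng.normal(scale=2.0, size=100)    # weak signal, noisy target

kf = KFold(n_splits=10, shuffle=True, random_state=0)
per_fold_r2 = []
oof_pred = np.empty_like(y)                      # out-of-fold predictions

for train_idx, val_idx in kf.split(X):
    model = Ridge(alpha=1.0).fit(X[train_idx], y[train_idx])
    preds = model.predict(X[val_idx])
    oof_pred[val_idx] = preds                    # save each fold's predictions
    per_fold_r2.append(r2_score(y[val_idx], preds))  # small val set: noisy statistic

print("mean of per-fold R^2:      ", np.mean(per_fold_r2))   # aggregated per-fold statistic
print("R^2 on pooled predictions: ", r2_score(y, oof_pred))  # one statistic on the full dataset
```

The pooled statistic is computed once on all \(n\) held-out predictions, so it avoids the small-sample noise of the per-fold estimates.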