Regularization and Model Selection
Cross Validation
Hold-out cross validation
Randomly split S into S_train and S_cv. Train each candidate hypothesis on S_train, test it on S_cv, and pick the one with the lowest estimated generalization error.
Sucks though: you waste a large chunk of the data just to test on it, which really hurts when data is scarce.
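A minimal sketch of hold-out cross validation, assuming scikit-learn; the dataset and the candidate models (logistic regression at a few regularization strengths) are illustrative, not from the notes.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Made-up dataset standing in for S.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Randomly split S into S_train (70%) and S_cv (30%).
X_train, X_cv, y_train, y_cv = train_test_split(X, y, test_size=0.3, random_state=0)

# Candidate hypotheses: same model class, different regularization strengths.
candidates = [LogisticRegression(C=c, max_iter=1000) for c in (0.01, 0.1, 1.0, 10.0)]

# Train each on S_train, score on S_cv, keep the best.
best = max(candidates, key=lambda h: h.fit(X_train, y_train).score(X_cv, y_cv))
print(best)
```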
k-fold cross validation
1. Split S into k disjoint subsets of m/k training examples each. 2. For j = 1, ..., k: train the model on all subsets except the j-th, then test it on the j-th subset. Average the k test errors to get an estimated generalization error for each hypothesis. 3. Pick the hypothesis with the lowest estimated generalization error.
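A minimal sketch of k-fold cross validation using scikit-learn's cross_val_score; the data and candidate models are illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

candidates = [LogisticRegression(C=c, max_iter=1000) for c in (0.01, 0.1, 1.0, 10.0)]

# cross_val_score trains on k-1 folds and tests on the held-out fold,
# once per fold; averaging the k scores estimates generalization error.
avg_scores = [cross_val_score(h, X, y, cv=10).mean() for h in candidates]
best = candidates[int(np.argmax(avg_scores))]
print(best, max(avg_scores))
```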
Feature Selection
Forward search
1. Start with no features. 2. For each feature not yet included, tentatively add it and estimate generalization error with cross validation; keep the single feature whose addition gives the lowest cross-validation error. 3. Repeat until all features have been added, then pick the best feature subset found along the way.
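A minimal sketch of forward search, assuming a scikit-learn-style model; the dataset and the choice of logistic regression are illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, n_features=15, n_informative=5, random_state=0)

selected, remaining = [], set(range(X.shape[1]))
best_score, best_subset = -np.inf, []

while remaining:
    # Try adding each remaining feature; keep the one with the best CV score.
    trials = {j: cross_val_score(LogisticRegression(max_iter=1000),
                                 X[:, selected + [j]], y, cv=5).mean()
              for j in remaining}
    j_best = max(trials, key=trials.get)
    selected.append(j_best)
    remaining.remove(j_best)
    # Track the best subset seen across all sizes.
    if trials[j_best] > best_score:
        best_score, best_subset = trials[j_best], list(selected)

print(best_subset, best_score)
```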
Backward search
Similar to forward search, except we start with all the features and remove them one at a time, dropping whichever feature's removal gives the lowest cross-validation error.
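The mirror-image sketch for backward search, under the same illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, n_features=15, n_informative=5, random_state=0)

selected = list(range(X.shape[1]))
best_score, best_subset = -np.inf, list(selected)

while len(selected) > 1:
    # Try removing each feature; drop the one whose removal leaves the
    # best cross-validation score.
    trials = {j: cross_val_score(LogisticRegression(max_iter=1000),
                                 X[:, [k for k in selected if k != j]], y, cv=5).mean()
              for j in selected}
    j_drop = max(trials, key=trials.get)
    selected.remove(j_drop)
    if trials[j_drop] > best_score:
        best_score, best_subset = trials[j_drop], list(selected)

print(best_subset, best_score)
```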
Filter feature selection
Compute a heuristic score S(i) measuring how informative each feature x_i is about the label y (e.g., the correlation or mutual information between x_i and y), then keep the k highest-scoring features. Cheaper than forward/backward search, since no models are retrained per candidate subset.
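A minimal sketch of a filter method using mutual information scores via scikit-learn's mutual_info_classif; the data and the choice of k are illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif

X, y = make_classification(n_samples=300, n_features=15, n_informative=5, random_state=0)

# Score each feature by its estimated mutual information with y,
# then keep the k highest-scoring features.
scores = mutual_info_classif(X, y, random_state=0)
k = 5
top_k = np.argsort(scores)[-k:][::-1]
print(top_k, scores[top_k])
```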
Frequentist vs Bayesian
Frequentist view - θ is a constant (not a random variable) whose value is unknown.
Bayesian view - θ is a random variable whose value is unknown.
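A tiny worked example of the difference, my own illustration rather than from the notes: estimating a coin's heads probability θ from flips. The frequentist maximum likelihood estimate treats θ as a fixed unknown; the Bayesian treats θ as random, puts a Beta prior on it, and reasons with the posterior.

```python
# Illustration (not from the notes): estimate a coin's heads probability
# theta after observing h heads in n flips.
n, h = 10, 7

# Frequentist: theta is a fixed unknown constant; the maximum likelihood
# estimate is the empirical frequency of heads.
theta_mle = h / n  # 0.7

# Bayesian: theta is a random variable with a Beta(a, b) prior; the
# posterior is Beta(a + h, b + n - h), summarized here by its mean.
a, b = 2.0, 2.0  # prior pseudo-counts (an assumed choice)
theta_post_mean = (a + h) / (a + b + n)  # = 9/14, about 0.643

print(theta_mle, theta_post_mean)
```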