Learning Theory
Bias vs variance tradeoff
Bias - high bias corresponds to underfitting (the model is too simple to capture the underlying pattern)
Variance - high variance corresponds to overfitting (the model fits the noise in the training set rather than the underlying pattern)
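To make the tradeoff concrete, here is a minimal numpy sketch (my own illustration, not from these notes): a degree-1 polynomial underfits a noisy sine curve, while a high-degree polynomial overfits the training noise, which shows up as low training error but high held-out error.

```python
import numpy as np

# Illustrative sketch: fit low- and high-degree polynomials to noisy
# samples of sin(2*pi*x) and compare train vs. held-out MSE.
rng = np.random.default_rng(0)
x_train = np.sort(rng.uniform(0, 1, 20))
x_test = np.sort(rng.uniform(0, 1, 200))
f = lambda x: np.sin(2 * np.pi * x)            # true underlying function
y_train = f(x_train) + rng.normal(0, 0.2, x_train.size)
y_test = f(x_test) + rng.normal(0, 0.2, x_test.size)

for degree in (1, 9):                          # underfit vs. overfit
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")
```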
Useful lemma #1 - the union bound
Let A_1, A_2, \dots, A_k be k different events (not necessarily independent). Then:

P(A_1 \cup A_2 \cup \dots \cup A_k) \le P(A_1) + P(A_2) + \dots + P(A_k)
(intuitively, the probability that at least one of the k events occurs is at most the sum of their individual probabilities)
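A quick Monte Carlo sanity check of the bound (an illustrative sketch; the events and sample size are arbitrary choices, not from the notes):

```python
import numpy as np

# Estimate P(A_1 ∪ ... ∪ A_k) and the sum of the P(A_i) by simulation,
# using correlated events A_i = {X > t_i} for a shared standard normal X.
rng = np.random.default_rng(0)
n, thresholds = 100_000, [1.0, 1.5, 2.0]
x = rng.standard_normal(n)

events = np.stack([x > t for t in thresholds])  # shape (k, n), one row per event
p_union = np.mean(events.any(axis=0))           # P(A_1 ∪ ... ∪ A_k)
sum_p = sum(np.mean(e) for e in events)         # P(A_1) + ... + P(A_k)
print(f"P(union) = {p_union:.4f} <= sum of P(A_i) = {sum_p:.4f}")
```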
Useful lemma #2 - Hoeffding inequality
Let Z_1, \dots, Z_m be i.i.d. random variables drawn from a Bernoulli(\phi) distribution. Let \hat{\phi} = \frac{1}{m} \sum_{i=1}^{m} Z_i be the mean of these random variables, and fix any \gamma > 0. Then:

P(|\phi - \hat{\phi}| > \gamma) \le 2\exp(-2\gamma^2 m)
(basically, if we take the average of m Bernoulli random variables as our estimate of phi, the probability of being far from the true value is small, so long as m is large)
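A small simulation sketch of this (the values of phi, m, gamma, and the trial count are arbitrary choices for illustration):

```python
import numpy as np

# Compare the empirical deviation probability of phi_hat to the
# Hoeffding bound 2*exp(-2*gamma^2*m).
rng = np.random.default_rng(0)
phi, m, gamma, trials = 0.3, 500, 0.05, 10_000

z = rng.binomial(1, phi, size=(trials, m))  # each trial: m Bernoulli(phi) draws
phi_hat = z.mean(axis=1)                    # one estimate of phi per trial
empirical = np.mean(np.abs(phi - phi_hat) > gamma)
bound = 2 * np.exp(-2 * gamma**2 * m)
print(f"empirical P(|phi - phi_hat| > gamma) = {empirical:.4f} <= bound = {bound:.4f}")
```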
Generalization error

\varepsilon(h) = P_{(x, y) \sim \mathcal{D}}(h(x) \neq y)

The probability that, for a new example (x, y) drawn from the distribution \mathcal{D}, the hypothesis h will misclassify it.

Training error

\hat{\varepsilon}(h) = \frac{1}{m} \sum_{i=1}^{m} 1\{h(x^{(i)}) \neq y^{(i)}\}

The fraction of the m training examples that h misclassifies.
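A minimal sketch of computing this empirical average on toy data (the hypothesis h, the data, and the noise level are my own illustrative choices):

```python
import numpy as np

# Training error of a fixed hypothesis h: the fraction of the m
# labeled examples that h misclassifies.
def training_error(h, X, y):
    return np.mean(h(X) != y)

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, 100)
y = (X > 0).astype(int)              # true labels from a threshold at 0
y[rng.random(100) < 0.1] ^= 1        # flip ~10% of labels as noise

h = lambda x: (x > 0.1).astype(int)  # a fixed hypothesis with a slightly-off threshold
print(f"training error: {training_error(h, X, y):.3f}")
```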
Empirical risk minimization

Given a set of hypotheses (the hypothesis class \mathcal{H}), we want to find the hypothesis that minimizes the training error:

\hat{h} = \arg\min_{h \in \mathcal{H}} \hat{\varepsilon}(h)
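A minimal ERM sketch over a finite hypothesis class, here (as an assumption for illustration) 1D threshold classifiers h_t(x) = 1{x > t}:

```python
import numpy as np

# ERM over a finite class H of threshold classifiers: evaluate the
# training error of every h_t in H and return the minimizer.
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, 200)
y = (X > 0.25).astype(int)           # true labels from a threshold at 0.25
y[rng.random(200) < 0.05] ^= 1       # flip ~5% of labels as noise

thresholds = np.linspace(-1, 1, 41)  # the hypothesis class H
errors = [np.mean((X > t).astype(int) != y) for t in thresholds]
best = thresholds[int(np.argmin(errors))]
print(f"ERM picks threshold t = {best:.2f} with training error {min(errors):.3f}")
```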
Background
Uniform Convergence