Take a large text corpus and convert it into (input, output) pairs using a rolling window of fixed length.
- Example dataset: "the quick brown fox jumped over the lazy dog". With window size = 1, counted as one word on each side of a center word, each window spans three words: "the quick brown", "quick brown fox", "brown fox jumped", etc.
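
The windowing step can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation. It assumes the window size counts words on each side of a center word, so a size of 1 gives three-word windows as in the example. The function names `rolling_windows` and `center_context_pairs` are hypothetical. The notes do not say how a window becomes (input, output) pairs, so the skip-gram style pairing of the center word with each neighbor is one assumed choice.

```python
def rolling_windows(tokens, radius=1):
    """Return every full window of 2 * radius + 1 consecutive tokens."""
    size = 2 * radius + 1
    return [tokens[i:i + size] for i in range(len(tokens) - size + 1)]


def center_context_pairs(window):
    """Assumed pairing: the middle token is the input, each neighbor an output."""
    mid = len(window) // 2
    return [(window[mid], w) for j, w in enumerate(window) if j != mid]


tokens = "the quick brown fox jumped over the lazy dog".split()
windows = rolling_windows(tokens, radius=1)
pairs = [p for w in windows for p in center_context_pairs(w)]

print(windows[:3])
print(pairs[:4])
```

Only full windows are kept here, so the first and last words never appear as a center word. Padding the edges or allowing shorter windows would be an equally reasonable choice.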