Goal 
Train distributed vector representations of words. To do this, we'll construct a toy task that lets us learn these representations in an unsupervised setting.
Why? The motivation.
So that you can use these vectors for downstream tasks.

Word Vector Representations

Count-Based Models

Neural Network-Based Models

SVD
Singular value decomposition.
PPMI
Positive pointwise mutual information.
Co-occurrence Matrix
Count-based models use a large corpus of text to build a co-occurrence matrix: counts of how often each word appears near each other word. PPMI and SVD are then applied to this matrix (see the sketch below).
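To make the count-based pipeline concrete, here is a minimal sketch (my own illustration, not from the original notes) that builds a co-occurrence matrix from a toy corpus, reweights it with PPMI, and factorizes it with truncated SVD to obtain dense word vectors:

```python
# Illustrative sketch: co-occurrence counts -> PPMI -> truncated SVD.
import numpy as np

corpus = ["the cat sat on the mat", "the dog sat on the log"]
window = 2

tokens = [s.split() for s in corpus]
vocab = sorted({w for sent in tokens for w in sent})
idx = {w: i for i, w in enumerate(vocab)}

# Co-occurrence counts within a symmetric context window.
C = np.zeros((len(vocab), len(vocab)))
for sent in tokens:
    for i, w in enumerate(sent):
        for j in range(max(0, i - window), min(len(sent), i + window + 1)):
            if i != j:
                C[idx[w], idx[sent[j]]] += 1

# PPMI: max(0, log P(w, c) / (P(w) P(c))).
total = C.sum()
p_w = C.sum(axis=1, keepdims=True) / total
p_c = C.sum(axis=0, keepdims=True) / total
with np.errstate(divide="ignore", invalid="ignore"):
    pmi = np.log((C / total) / (p_w * p_c))
ppmi = np.maximum(pmi, 0)
ppmi[np.isnan(ppmi)] = 0

# Truncated SVD: keep the top-k singular directions as dense word vectors.
k = 2
U, S, Vt = np.linalg.svd(ppmi)
word_vectors = U[:, :k] * S[:k]
print(dict(zip(vocab, word_vectors.round(2))))
```

On a real corpus the matrix would be large and sparse, and a sparse truncated SVD would be used instead of the dense factorization above.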
Directly Train Word Vectors
Given a large corpus of text, build a toy prediction task and use it to directly train word vectors in an unsupervised fashion.
Word2vec
Skip-gram / CBOW.
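As a quick illustration (a sketch assuming the gensim library with its 4.x parameter names; not part of the original notes), skip-gram or CBOW vectors can be trained in a few lines. The `sg` flag switches between the two objectives:

```python
# Illustrative sketch using gensim's Word2Vec (assumes gensim 4.x is installed).
from gensim.models import Word2Vec

sentences = [["the", "cat", "sat", "on", "the", "mat"],
             ["the", "dog", "sat", "on", "the", "log"]]

# sg=1 -> skip-gram (predict context words from the center word);
# sg=0 -> CBOW (predict the center word from its averaged context).
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1, epochs=50)

print(model.wv["cat"])                # the learned 50-d vector for "cat"
print(model.wv.most_similar("cat"))   # nearest neighbors by cosine similarity
```

The tiny corpus and small `min_count` here are only for demonstration; in practice you would train on millions of sentences with the library defaults.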
GloVe
Global vectors for word representation.
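For reference (a restatement of the published objective, not from the original notes): GloVe fits word and context vectors to the logarithm of global co-occurrence counts with a weighted least-squares objective,

J = \sum_{i,j} f(X_{ij}) \left( w_i^\top \tilde{w}_j + b_i + \tilde{b}_j - \log X_{ij} \right)^2,

where X_{ij} is the number of times word j occurs in the context of word i, and f is a weighting function that caps the influence of very frequent pairs.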
Ex. Linear Substructures
Trained vectors often exhibit interesting linear substructures. Projecting to 2D, for example:
[Figure: example word vectors projected to 2D]
Each dimension encodes some information about the word.
Note: these dimensions are discovered and learned automatically. We usually don't know what an individual dimension refers to, and a word's meaning is typically spread across many dimensions.
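A concrete example of the analogy-style linear substructure (a sketch assuming a gensim `model` like the one trained above, but on a corpus large enough to contain these words and exhibit the effect):

```python
# Analogy arithmetic on trained vectors: vec("king") - vec("man") + vec("woman")
# should land near vec("queen") when the model is trained on a large corpus.
result = model.wv.most_similar(positive=["king", "woman"], negative=["man"], topn=1)
print(result)  # ideally something close to [("queen", ...)]
```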