Goal 
Train distributed vector representations of words. To do this, we'll construct a toy task that lets us learn these representations in an unsupervised setting.
Why? The motivation.
So that you can use these vectors for downstream tasks.

Word Vector Representations

Count-Based Models

Neural Network-Based Models

SVD
Singular value decomposition.
PPMI
Positive pointwise mutual information.
Co-occurrence Matrix
Count-based models use a large corpus of text to build a co-occurrence matrix: counts of how often each word appears near each other word. PPMI and SVD are then applied to this matrix (see the sketch below).
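To make the count-based pipeline concrete, here is a minimal sketch (my own illustration, not from the original notes) that builds a co-occurrence matrix from a toy corpus, reweights it with PPMI, and factorizes it with truncated SVD to obtain dense word vectors:

```python
# Illustrative sketch: co-occurrence counts -> PPMI -> truncated SVD.
import numpy as np

corpus = ["the cat sat on the mat", "the dog sat on the log"]
window = 2

tokens = [s.split() for s in corpus]
vocab = sorted({w for sent in tokens for w in sent})
idx = {w: i for i, w in enumerate(vocab)}

# Co-occurrence counts within a symmetric context window.
C = np.zeros((len(vocab), len(vocab)))
for sent in tokens:
    for i, w in enumerate(sent):
        for j in range(max(0, i - window), min(len(sent), i + window + 1)):
            if i != j:
                C[idx[w], idx[sent[j]]] += 1

# PPMI: max(0, log P(w, c) / (P(w) P(c))).
total = C.sum()
p_w = C.sum(axis=1, keepdims=True) / total
p_c = C.sum(axis=0, keepdims=True) / total
with np.errstate(divide="ignore", invalid="ignore"):
    pmi = np.log((C / total) / (p_w * p_c))
ppmi = np.maximum(pmi, 0)
ppmi[np.isnan(ppmi)] = 0

# Truncated SVD: keep the top-k singular directions as dense word vectors.
k = 2
U, S, Vt = np.linalg.svd(ppmi)
word_vectors = U[:, :k] * S[:k]
print(dict(zip(vocab, word_vectors.round(2))))
```

On a real corpus the matrix would be large and sparse, and a sparse truncated SVD would be used instead of the dense factorization above.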
Directly Train Word Vectors
Given a large corpus of text, build a toy prediction task and use it to directly train word vectors in an unsupervised fashion.
Word2vec
Skip-gram / CBOW.
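As a quick illustration (a sketch assuming the gensim library with its 4.x parameter names; not part of the original notes), skip-gram or CBOW vectors can be trained in a few lines. The `sg` flag switches between the two objectives:

```python
# Illustrative sketch using gensim's Word2Vec (assumes gensim 4.x is installed).
from gensim.models import Word2Vec

sentences = [["the", "cat", "sat", "on", "the", "mat"],
             ["the", "dog", "sat", "on", "the", "log"]]

# sg=1 -> skip-gram (predict context words from the center word);
# sg=0 -> CBOW (predict the center word from its averaged context).
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1, epochs=50)

print(model.wv["cat"])                # the learned 50-d vector for "cat"
print(model.wv.most_similar("cat"))   # nearest neighbors by cosine similarity
```

The tiny corpus and small `min_count` here are only for demonstration; in practice you would train on millions of sentences with the library defaults.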
GloVe
Global vectors for word representation.
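For reference (a restatement of the published objective, not from the original notes): GloVe fits word and context vectors to the logarithm of global co-occurrence counts with a weighted least-squares objective,

J = \sum_{i,j} f(X_{ij}) \left( w_i^\top \tilde{w}_j + b_i + \tilde{b}_j - \log X_{ij} \right)^2,

where X_{ij} is the number of times word j occurs in the context of word i, and f is a weighting function that caps the influence of very frequent pairs.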
Ex. Linear Substructures
Trained vectors often exhibit interesting linear substructures. Projecting to 2D, for example:
[Figure: example word vectors projected to 2D]
Each dimension encodes some information about the word.
Note: these dimensions are discovered and learned automatically. We usually don't know what an individual dimension refers to, and a word's meaning is typically spread across many dimensions.
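A concrete example of the analogy-style linear substructure (a sketch assuming a gensim `model` like the one trained above, but on a corpus large enough to contain these words and exhibit the effect):

```python
# Analogy arithmetic on trained vectors: vec("king") - vec("man") + vec("woman")
# should land near vec("queen") when the model is trained on a large corpus.
result = model.wv.most_similar(positive=["king", "woman"], negative=["man"], topn=1)
print(result)  # ideally something close to [("queen", ...)]
```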