CRNN

Model: CNN --> RecNN --> Softmax

Key Points

- Difference from a regular CNN that takes the full RGB-D image as input: RGB and depth (D) are processed separately, and their feature representations are concatenated later.

- CNN filters are learned without supervision (no back-prop): patches are extracted, whitened and normalized, then clustered using k-means. The cluster centroids serve as the filters.
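
The filter-learning step above can be sketched roughly as follows. This is a minimal NumPy illustration, not the original implementation: the function names, patch size, number of filters, the epsilon values, the choice of ZCA whitening, and the order (normalize first, then whiten) are all assumptions made for the example, and random arrays stand in for real images.

```python
import numpy as np

def extract_patches(images, patch_size, n_patches, rng):
    """Sample random square patches from images of shape (N, H, W, C)."""
    n, h, w, _ = images.shape
    idx = rng.integers(0, n, size=n_patches)
    ys = rng.integers(0, h - patch_size + 1, size=n_patches)
    xs = rng.integers(0, w - patch_size + 1, size=n_patches)
    patches = np.stack([
        images[i, y:y + patch_size, x:x + patch_size].ravel()
        for i, y, x in zip(idx, ys, xs)
    ])
    return patches.astype(np.float64)

def normalize(patches, eps=10.0):
    """Per-patch brightness and contrast normalization (eps is an assumed constant)."""
    patches = patches - patches.mean(axis=1, keepdims=True)
    return patches / np.sqrt(patches.var(axis=1, keepdims=True) + eps)

def zca_whiten(patches, eps=0.1):
    """ZCA whitening: decorrelate the patch dimensions (assumed whitening variant)."""
    centered = patches - patches.mean(axis=0)
    cov = centered.T @ centered / centered.shape[0]
    eigvals, eigvecs = np.linalg.eigh(cov)
    w = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T
    return centered @ w

def kmeans(x, k, n_iter, rng):
    """Plain Lloyd's k-means; the returned centroids play the role of filters."""
    centroids = x[rng.choice(x.shape[0], size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every patch to every centroid.
        d = (x ** 2).sum(1, keepdims=True) - 2 * x @ centroids.T + (centroids ** 2).sum(1)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids

rng = np.random.default_rng(0)
images = rng.random((20, 32, 32, 3))  # stand-in for RGB images
patches = extract_patches(images, patch_size=9, n_patches=2000, rng=rng)
filters = kmeans(zca_whiten(normalize(patches)), k=16, n_iter=10, rng=rng)
filters = filters.reshape(16, 9, 9, 3)  # one 9x9x3 filter per centroid
```

Under the separate-processing idea from the first key point, the same procedure would presumably be run once on RGB patches and once on depth patches (with a single channel), giving two independent filter banks.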

Innovation: Omitting RNN Training

Backprop through the RNN (the recursive network, RecNN above) is expensive. Instead, randomly initialize many RNNs and use them as-is, without training. The motivation is that, with enough random RNNs, one or more of them will hopefully capture useful weights.
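
A minimal sketch of the idea, assuming a particular tree structure that these notes do not spell out: each RNN repeatedly merges non-overlapping block x block neighbourhoods of the CNN feature map with one shared weight matrix and a tanh nonlinearity until a single vector remains, and the outputs of all RNNs are concatenated. All names, sizes, and the weight scale below are hypothetical.

```python
import numpy as np

def random_rnn_weights(n_rnns, k, block, rng, scale=0.1):
    """Draw fixed random weights: one (k, block*block*k) matrix per RNN. Never trained."""
    return rng.uniform(-scale, scale, size=(n_rnns, k, block * block * k))

def rnn_forward(w, x, block):
    """Recursively merge block x block children into parents using shared weights w.

    x has shape (k, r, r), where r is assumed to be a power of `block`.
    Returns the k-dimensional vector at the root of the tree.
    """
    k, r, _ = x.shape
    while r > 1:
        r_new = r // block
        # Group the map into non-overlapping blocks: (r_new, r_new, block, block, k).
        children = x.reshape(k, r_new, block, r_new, block).transpose(1, 3, 2, 4, 0)
        children = children.reshape(r_new, r_new, block * block * k)
        parents = np.tanh(children @ w.T)  # (r_new, r_new, k)
        x = parents.transpose(2, 0, 1)     # back to (k, r_new, r_new)
        r = r_new
    return x.reshape(k)

def crnn_features(weights, x, block):
    """Concatenate the root vectors of all random RNNs into one feature vector."""
    return np.concatenate([rnn_forward(w, x, block) for w in weights])

rng = np.random.default_rng(0)
k, r, block, n_rnns = 8, 4, 2, 16
weights = random_rnn_weights(n_rnns, k, block, rng)
cnn_output = rng.standard_normal((k, r, r))  # stand-in for a pooled CNN feature map
features = crnn_features(weights, cnn_output, block)  # shape: (n_rnns * k,)
```

In this setup the concatenated `features` vector would be what gets passed to the softmax, so the classifier is the only stage that needs gradient-based training.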
