Neural Turing Machine

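What is a Neural Turing Machine?

It extends a neural network into a differentiable computer by giving it read/write access to an external memory. Because every operation is differentiable, the whole system can be trained end to end with gradient descent.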

Controller

The controller is a neural network, usually an LSTM. Like an ordinary network it receives an input and produces an output, but it also interacts with the memory through the read and write heads.

Architecture

Memory

The memory is a large N × M matrix of real numbers, where N is the number of memory locations and M is the size of the vector stored at each location.

Read/write heads

The read head retrieves data from memory and returns it to the controller, while the write head takes data from the controller and uses it to modify memory.

Reading/Writing

Read vector

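$$ r_t = \sum_{i} w_t(i) \, M_t(i) $$

where M_t is the contents of the N × M memory matrix at time t, M_t(i) is its i-th row, and w_t is a vector of weightings over the N locations emitted by a read head at time t. The weightings are non-negative and sum to 1, so the read vector r_t is a convex combination of the memory rows.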
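The weighted read can be sketched in a few lines of Python. This is a minimal illustration, assuming the memory is stored as a list of N rows of length M; the function name `read` and the sample values are hypothetical, and plain lists are used only to keep the sketch dependency-free.

```python
def read(memory, weights):
    """Return the read vector r = sum_i w(i) * M(i).

    memory  : list of N rows, each a list of M floats
    weights : list of N non-negative floats that sum to 1
    """
    width = len(memory[0])
    return [
        sum(w * row[j] for w, row in zip(weights, memory))
        for j in range(width)
    ]


# Hypothetical example: N = 3 locations, M = 2 values per location.
memory = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
weights = [0.5, 0.5, 0.0]  # attend equally to the first two locations
r = read(memory, weights)
```

With a weighting that is sharply peaked on one location the read returns (almost) that single row; a blurrier weighting returns a blend of several rows.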

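Write vector: erase vector

$$ \tilde{M}_t(i) = M_{t-1}(i) \, [\mathbf{1} - w_t(i) \, e_t] $$

where w_t is the weighting emitted by the write head, 1 is a row vector of all ones, and the multiplication of the memory row by the erase vector e_t (whose M elements lie in the range (0, 1)) acts point-wise. An element of a location is reset to zero only if both the weighting at that location and the corresponding erase element are one.

Write vector: add vector

$$ M_t(i) = \tilde{M}_t(i) + w_t(i) \, a_t $$

where \tilde{M}_t is the memory after the erase step and a_t is the length-M add vector. Parts of the memory are first erased according to the weighting, and then new information is added to the locations specified by the same weighting using the add vector.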
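The erase-then-add write can be sketched as follows. This is an illustrative sketch under the same list-of-rows assumption for the memory; the function name `write` and the sample values are hypothetical.

```python
def write(memory, weights, erase, add):
    """Apply one write step and return the new memory.

    Erase: M~(i) = M(i) * (1 - w(i) * e)   (point-wise)
    Add  : M'(i) = M~(i) + w(i) * a
    """
    new_memory = []
    for w, row in zip(weights, memory):
        erased = [m * (1.0 - w * e) for m, e in zip(row, erase)]
        new_memory.append([m + w * a for m, a in zip(erased, add)])
    return new_memory


# Hypothetical example: write only to location 0.
memory = [[1.0, 1.0], [1.0, 1.0]]
weights = [1.0, 0.0]
erase = [0.5, 0.25]   # elements in (0, 1)
add = [0.5, 0.5]
updated = write(memory, weights, erase, add)
```

Location 1 has weight 0, so it is left unchanged, while location 0 is partly erased and then incremented by the add vector.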

Addressing

Content Based Addressing

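Each head first produces a length-M key vector k_t that is compared to each memory row M_t(i) by a similarity measure K[·,·] (for example, cosine similarity). The similarities are normalised into a content-based weighting:

$$ w^c_t(i) = \frac{\exp\left(\beta_t \, K[k_t, M_t(i)]\right)}{\sum_{j} \exp\left(\beta_t \, K[k_t, M_t(j)]\right)} $$

where β_t is a positive scalar called the key strength. A larger β_t concentrates the weighting on the best-matching locations; a smaller β_t spreads it out.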
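Content-based addressing amounts to a softmax over scaled similarities. The sketch below assumes cosine similarity for K; the function names, the small constant added to the denominator for numerical safety, and the sample values are assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / (norm + 1e-8)  # small constant avoids division by zero


def content_weights(memory, key, beta):
    """Softmax over beta * K[key, M(i)] for every memory row M(i)."""
    scores = [beta * cosine(key, row) for row in memory]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical example: the key matches the first row exactly.
memory = [[1.0, 0.0], [0.0, 1.0]]
w_soft = content_weights(memory, [1.0, 0.0], beta=1.0)
w_sharp = content_weights(memory, [1.0, 0.0], beta=10.0)
```

Both weightings favour the first location, but the larger key strength makes the preference much more pronounced.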

Location Based Addressing

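Interpolation: The idea is to blend the weighting from the previous time step, w_{t-1}, with the content-based weighting w^c_t produced at the current time step:

$$ w^g_t = g_t \, w^c_t + (1 - g_t) \, w_{t-1} $$

where g_t is a scalar interpolation gate between 0 and 1. If the gate is close to 0 the content weighting is ignored and the previous weighting is reused; if it is close to 1 the previous weighting is ignored and content-based addressing takes over.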
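The gate is a simple element-wise blend. A minimal sketch, with a hypothetical function name and sample values:

```python
def interpolate(w_content, w_prev, g):
    """Gated weighting: g * w_content + (1 - g) * w_prev, element-wise."""
    return [g * c + (1.0 - g) * p for c, p in zip(w_content, w_prev)]


# Hypothetical example: the previous focus was location 0,
# the content lookup now points at location 1.
w_prev = [1.0, 0.0, 0.0]
w_content = [0.0, 1.0, 0.0]
w_gated = interpolate(w_content, w_prev, g=0.25)
```

With g = 0.25 most of the weight stays on the previous location, and the result still sums to 1 because it is a convex combination of two normalised weightings.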

Shifting

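After interpolation, each head emits a shift weighting s_t that defines a normalised distribution over the allowed integer shifts (for example -1, 0 and +1). With the N memory locations indexed from 0 to N-1, the shift is applied to the gated weighting w^g_t as a circular convolution:

$$ \tilde{w}_t(i) = \sum_{j=0}^{N-1} w^g_t(j) \, s_t(i - j) $$

where all index arithmetic is computed modulo N.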
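The circular convolution can be sketched as below. Representing the shift weighting as a dictionary from integer offset to probability is an assumption made for readability, as are the function name and the sample values.

```python
def shift(w_gated, shift_dist):
    """Circularly convolve a weighting with a distribution over shifts.

    w_gated    : list of N weights
    shift_dist : dict mapping integer offset -> probability (sums to 1)
    """
    n = len(w_gated)
    return [
        sum(w_gated[(i - offset) % n] * p for offset, p in shift_dist.items())
        for i in range(n)
    ]


# Hypothetical example: a certain shift of +1 moves the focus one
# location forward, wrapping around at the end of memory.
moved = shift([0.0, 1.0, 0.0, 0.0], {-1: 0.0, 0: 0.0, 1: 1.0})
wrapped = shift([0.0, 0.0, 0.0, 1.0], {-1: 0.0, 0: 0.0, 1: 1.0})
blurred = shift([0.0, 1.0, 0.0, 0.0], {-1: 0.25, 0: 0.5, 1: 0.25})
```

When the shift distribution is not sharply peaked, as in the last call, the focus leaks onto neighbouring locations, which is what the sharpening step is meant to counteract.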

Sharpening

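Each head emits one further scalar γ_t ≥ 1 whose effect is to sharpen the final weighting as follows:

$$ w_t(i) = \frac{\tilde{w}_t(i)^{\gamma_t}}{\sum_{j} \tilde{w}_t(j)^{\gamma_t}} $$

where \tilde{w}_t is the weighting after the shift. With γ_t = 1 the weighting is unchanged; larger values concentrate it on the largest entries.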
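Sharpening raises each weight to the power γ and renormalises. A minimal sketch, with a hypothetical function name and sample values:

```python
def sharpen(w_shifted, gamma):
    """Raise each weight to the power gamma (gamma >= 1) and renormalise."""
    powered = [w ** gamma for w in w_shifted]
    total = sum(powered)
    return [p / total for p in powered]


# Hypothetical example: a slightly blurred weighting becomes more peaked.
blurred = [0.25, 0.5, 0.25]
sharp = sharpen(blurred, gamma=2.0)
same = sharpen(blurred, gamma=1.0)
```

With γ = 2 the middle weight grows from 1/2 to 2/3, while γ = 1 leaves the weighting as it was.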