Different Types of RNN
Language Modelling → Cost Function
Exploding Gradients are easy to spot, as the parameters just blow up and you will often see NaNs (Not a Number → result of numerical overflow in the neural network computation) → Apply Gradient Clipping, i.e. look at the gradient vector and, if it is bigger than some threshold, re-scale it so that it is not too big.
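Gradient clipping can be sketched as below. This is a minimal NumPy example, not taken from the notes: it assumes the "size" of the gradient is measured as the combined L2 norm over all gradient arrays, and the function name `clip_by_global_norm` and the threshold value are made up for illustration. Another common variant clips each element into a fixed range instead (e.g. with `np.clip`).

```python
import numpy as np

def clip_by_global_norm(grads, threshold):
    """Rescale a list of gradient arrays so their combined L2 norm is at most threshold.

    Hypothetical helper for illustration; returns the (possibly rescaled)
    gradients and the norm measured before clipping.
    """
    total_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    if total_norm > threshold:
        # Shrink every gradient by the same factor, so the direction is kept.
        scale = threshold / total_norm
        grads = [g * scale for g in grads]
    return grads, total_norm

# Toy gradients whose combined norm is sqrt(3^2 + 4^2 + 12^2) = 13.
grads = [np.array([3.0, 4.0]), np.array([12.0])]
clipped, norm_before = clip_by_global_norm(grads, threshold=5.0)
norm_after = np.sqrt(sum(np.sum(g ** 2) for g in clipped))
print(norm_before, norm_after)
```

Rescaling all gradients by one factor keeps the update direction and only limits the step size; gradients already below the threshold pass through unchanged.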
Peephole Connections → Gate values may depend not just on a_{t-1} and x_t but also on the previous memory cell value c_{t-1}
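A rough sketch of one gate with a peephole term, under stated assumptions: the peephole weights `p` act element-wise on the previous cell value (each gate unit only "peeks" at its own cell), and all names (`peephole_gate`, `W`, `p`, `b`) and sizes are hypothetical, chosen only for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def peephole_gate(W, p, b, a_prev, x_t, c_prev):
    """Gate value computed from a_prev, x_t and, via the peephole term, c_prev.

    W acts on the stacked [a_prev, x_t]; p is a vector of peephole weights
    applied element-wise to the previous memory cell value (an assumption).
    """
    stacked = np.concatenate([a_prev, x_t])
    return sigmoid(W @ stacked + p * c_prev + b)

n_a, n_x = 3, 2                      # hidden size and input size (toy values)
rng = np.random.default_rng(0)
W_u = rng.standard_normal((n_a, n_a + n_x))
p_u = rng.standard_normal(n_a)
b_u = np.zeros(n_a)

a_prev = np.zeros(n_a)
x_t = np.ones(n_x)

gate_with_cell = peephole_gate(W_u, p_u, b_u, a_prev, x_t, c_prev=np.ones(n_a))
gate_zero_cell = peephole_gate(W_u, p_u, b_u, a_prev, x_t, c_prev=np.zeros(n_a))
print(gate_with_cell, gate_zero_cell)
```

With c_prev set to zero the peephole term vanishes and the gate reduces to the usual form that depends only on a_{t-1} and x_t.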
EXERCISE → 02