The backward pass is linear: if the output error is doubled, every derivative computed in the backward pass is doubled as well.
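A minimal sketch of this point, using an assumed toy linear layer (the weights, shapes, and names here are made up for illustration): doubling the error signal arriving at the output doubles the weight gradient.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))        # toy linear layer: y = W @ x
x = rng.normal(size=(4,))          # one input vector
error = rng.normal(size=(3,))      # dL/dy, the error arriving at the output

# Backward pass for the layer: dL/dW is the outer product of the error and the input
grad_W = np.outer(error, x)
grad_W_doubled = np.outer(2 * error, x)

# Because the backward pass is linear in the error, the gradients scale the same way
assert np.allclose(grad_W_doubled, 2 * grad_W)
print("Doubling the error doubles the gradient:", np.allclose(grad_W_doubled, 2 * grad_W))
```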
The order of inputs matters in an RNN: the hidden state is built up one timestep at a time, so the same inputs presented in a different order give a different result.
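A small sketch of the order sensitivity, with hypothetical weights and shapes chosen only for illustration: a vanilla RNN cell is run over a sequence and over the same sequence reversed, and the final hidden states differ.

```python
import numpy as np

rng = np.random.default_rng(0)
W_xh = rng.normal(size=(5, 2))     # input-to-hidden weights
W_hh = rng.normal(size=(5, 5))     # hidden-to-hidden weights

def run_rnn(sequence):
    h = np.zeros(5)                # initial hidden state
    for x in sequence:             # one hidden-state update per timestep
        h = np.tanh(W_xh @ x + W_hh @ h)
    return h

seq = [rng.normal(size=(2,)) for _ in range(4)]
h_forward = run_rnn(seq)
h_reversed = run_rnn(seq[::-1])

# Same multiset of inputs, different order -> different final hidden state
print("Order changes the result:", not np.allclose(h_forward, h_reversed))
```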
The output at timestep t is not always fed back as the input to the same network at t+1; during training the next input usually comes from the data instead (teacher forcing).
When to use an autoregressive setup for an RNN: feed the output at t back in as the input at t+1 when generating a sequence, i.e. when no ground-truth inputs are available.
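A sketch contrasting the two modes, with made-up weights and shapes (nothing here is a specific architecture from the notes): with teacher forcing the next input comes from the data, while in autoregressive generation the model's own output at t is fed back as the input at t+1.

```python
import numpy as np

rng = np.random.default_rng(0)
W_xh = rng.normal(size=(4, 3))
W_hh = rng.normal(size=(4, 4))
W_hy = rng.normal(size=(3, 4))     # hidden-to-output; output has the input's size

def step(x, h):
    h = np.tanh(W_xh @ x + W_hh @ h)
    y = W_hy @ h                   # prediction for the next input
    return y, h

ground_truth = [rng.normal(size=(3,)) for _ in range(5)]

# Teacher forcing (training): every input comes from the data, not from the model
h = np.zeros(4)
for x in ground_truth:
    y, h = step(x, h)

# Autoregressive (generation): the output at t becomes the input at t+1
h = np.zeros(4)
x = ground_truth[0]                # seed with one known input
generated = []
for _ in range(5):
    y, h = step(x, h)
    generated.append(y)
    x = y                          # feedback loop: output fed back as next input
print("Generated", len(generated), "steps autoregressively")
```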