The backward pass is linear: if the error is doubled, then every derivative computed in the backward pass is also doubled.
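
A minimal sketch of this (assuming PyTorch; the tensor shapes and squared loss are only for illustration): scaling the loss by 2 scales every gradient by 2.

    import torch

    x = torch.randn(4, 3)
    w = torch.randn(3, 2, requires_grad=True)
    loss = ((x @ w) ** 2).sum()

    loss.backward(retain_graph=True)       # keep the graph for a second backward
    grad_once = w.grad.clone()

    w.grad.zero_()
    (2 * loss).backward()                  # same computation, error scaled by 2
    grad_twice = w.grad.clone()

    print(torch.allclose(grad_twice, 2 * grad_once))   # True: gradients double too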

The order of the inputs matters in an RNN: the hidden state is updated step by step, so feeding the same inputs in a different order generally produces a different result.
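
A quick check of this (assuming PyTorch; the layer sizes and random sequence are illustrative): the same timesteps fed in reverse order lead to a different final hidden state.

    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    rnn = nn.RNN(input_size=3, hidden_size=5, batch_first=True)

    seq = torch.randn(1, 4, 3)                 # batch of 1, 4 timesteps, 3 features
    seq_reversed = torch.flip(seq, dims=[1])   # same timesteps, reversed order

    _, h_forward = rnn(seq)
    _, h_reversed = rnn(seq_reversed)

    print(torch.allclose(h_forward, h_reversed))   # False: order changes the result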

The output of timestep t is not always the input to the same network at t+1: what always flows from t to t+1 is the hidden state. The output at t is fed back as the input at t+1 only in an autoregressive setup; otherwise the input at t+1 comes from the data (e.g., teacher forcing during training).

When to use an autoregressive setup for an RNN: when the network must generate a sequence on its own (e.g., text generation or multi-step forecasting), so that the prediction at step t is fed back as the input at step t+1.
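
A sketch of the autoregressive case (assuming PyTorch; the cell, readout layer, and sizes are illustrative, not a specific model): the prediction at step t is fed back as the input at step t+1, while the hidden state carries the history.

    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    cell = nn.RNNCell(input_size=1, hidden_size=8)
    readout = nn.Linear(8, 1)            # maps hidden state to the next value

    x = torch.zeros(1, 1)                # start token / initial input
    h = torch.zeros(1, 8)                # initial hidden state
    generated = []

    for t in range(5):
        h = cell(x, h)                   # hidden state flows from t to t+1
        x = readout(h)                   # output at t becomes the input at t+1
        generated.append(x.item())

    print(generated)                     # 5 self-generated values

During training, the loop would instead usually take its input at each step from the data (teacher forcing), which is the non-autoregressive case noted above.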