Transformers

Purpose of Self-Attention Mechanisms

Encoder

Attention-Based Methods