Transformers: Bad Band-Aids and Missing Memory
Monthly paper reminder that the Transformer architecture is still a stop-gap solution. The authors construct formal-language tasks to probe generalization and find that positional encoding is a bad band-aid and that augmented memory is sorely missing.
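To make the "band-aid" complaint concrete, here is a minimal sketch (not from the paper, and assuming the standard sinusoidal scheme from "Attention Is All You Need") of absolute positional encoding. The function name and the NumPy implementation are illustrative choices; the point is that positions beyond the training length map to embedding rows the model has never been trained on, which is one common way absolute encodings hurt length generalization.

```python
# Minimal sketch of sinusoidal absolute positional encoding (assumption:
# the standard Vaswani et al. scheme, not the paper's exact setup).
import numpy as np

def sinusoidal_positional_encoding(num_positions: int, d_model: int) -> np.ndarray:
    """Return a (num_positions, d_model) matrix of sinusoidal encodings."""
    positions = np.arange(num_positions)[:, None]   # (P, 1) position indices
    dims = np.arange(d_model)[None, :]              # (1, D) dimension indices
    # Each pair of dimensions gets its own wavelength, from 2*pi up to 10000*2*pi.
    angle_rates = 1.0 / np.power(10000.0, (2 * (dims // 2)) / d_model)
    angles = positions * angle_rates                # (P, D) via broadcasting
    enc = np.zeros((num_positions, d_model))
    enc[:, 0::2] = np.sin(angles[:, 0::2])          # even dimensions: sine
    enc[:, 1::2] = np.cos(angles[:, 1::2])          # odd dimensions: cosine
    return enc

# If training only ever covers positions 0..63, rows 64+ are mathematically
# well defined but out-of-distribution for the model's weights at test time.
pe = sinusoidal_positional_encoding(num_positions=128, d_model=16)
print(pe.shape)  # (128, 16)
```

The encoding is fixed and deterministic, so longer test sequences feed the network position vectors it never saw during training; that mismatch, rather than the math itself, is the sense in which the mechanism papers over the ordering problem instead of solving it.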