Bengio's Unified Theory: From Attention to Causality
This is a great talk, highly recommended. I had not thought of attention the way Bengio describes it here (switching from lists to sparse sets). Various threads on out-of-distribution (OOD) generalization, meta-learning, few-shot learning, and causality share a common unifying fabric.