Attention Mechanism
Models & Architectures
Weights relevant parts of the input dynamically.
Attention computes a weighted combination of input representations, letting the model focus on the information most relevant to each output position (a minimal sketch follows the list below).
- Types: Self-, cross-, and causal attention.
- Benefits: Better context handling and potential interpretability (attention maps).
- Costs: Full self-attention is quadratic in sequence length (time and memory), so long-context workloads need optimized variants.
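
The core operation can be shown in a minimal NumPy sketch of scaled dot-product attention (the function name, shapes, and mask logic here are illustrative, not tied to any particular library): queries are compared with keys, the scores are softmax-normalized, and the output is a weighted combination of the values.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V, causal=False):
    """Return a weighted combination of V, with weights from Q-K similarity."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                   # (seq_q, seq_k) similarity scores
    if causal:
        # Mask future positions so each token attends only to itself and the past.
        mask = np.triu(np.ones_like(scores, dtype=bool), k=1)
        scores = np.where(mask, -np.inf, scores)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over keys
    return weights @ V, weights                       # output and the attention map

# Toy self-attention: queries, keys, and values all come from the same sequence.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                           # 4 tokens, 8-dim embeddings
out, attn = scaled_dot_product_attention(x, x, x, causal=True)
print(attn.round(2))                                  # rows sum to 1; future positions are 0
```

Using the same sequence for Q, K, and V gives self-attention; cross-attention would instead draw Q from one sequence and K, V from another. The returned `weights` matrix is the attention map referenced under Benefits.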