Transformer
Neural Networks · Advanced
Definition
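The dominant neural network architecture in modern AI, introduced in the paper 'Attention Is All You Need' (2017). It uses self-attention to process all input tokens in parallel instead of one at a time, as recurrent networks do. Any token can draw directly on any other, however far apart they are, so long-range dependencies are captured without passing information step by step through the sequence. The trade-off is that attention cost grows quadratically with sequence length. Foundation of GPT, Claude, Gemini, and BERT.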
Maxx Stacks Context
Key insight: Self-attention lets each token weigh every other token in the sequence at once, so the representation of a word reflects the context it appears in.
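The idea that each token relates to all other tokens can be sketched as scaled dot-product attention. This is a minimal illustration in plain Python, not a production implementation: the function names and the toy vectors are made up for this example, and the same vectors are reused as queries, keys, and values, whereas a real transformer derives those three from learned projections.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values):
    """Scaled dot-product attention over one sequence.

    Each argument is a list of equal-length vectors, one per token.
    Returns (outputs, weights), where weights[i][j] is how much
    token i attends to token j.
    """
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Score this token's query against every token's key.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        row = softmax(scores)
        # The output is the weighted average of all value vectors.
        out = [
            sum(w * v[d] for w, v in zip(row, values))
            for d in range(len(values[0]))
        ]
        weights.append(row)
        outputs.append(out)
    return outputs, weights

# Toy input: three tokens with 2-dimensional vectors (made-up numbers).
# Assumption: identity projections, so queries = keys = values.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens, tokens, tokens)
```

Each row of `weights` sums to 1 and covers every token in the sequence, which is the "all other tokens simultaneously" part. The rows do not depend on one another, so in practice they are computed together as matrix multiplications.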
Tags
#attention #self-attention #GPT #BERT #architecture
Maxx Stacks Editorial
Reviewed by enterprise AI practitioners