Infrastructure
Model Parallelism
Infrastructure· Advanced
Definition
A distributed training and inference technique that splits a model's layers or components across multiple GPUs — necessary when a single model is too large to fit in one device's VRAM. Tensor parallelism splits individual operations; pipeline parallelism splits layer groups.
Enterprise Context
Required for serving and training frontier-scale models (70B+ parameters). Enterprise AI teams evaluating self-hosted LLM deployments must design for model parallelism from the start.
Tags
#hardware#distributed#scale
MS
Maxx Stacks Editorial
Reviewed by enterprise AI practitioners
Maxx University
Keep learning. Keep building.
250+ terms. 5 learning paths. AI maturity assessment. Jargon translator. All free, always.