
Model Parallelism

Infrastructure · Advanced

Definition

A distributed training and inference technique that splits a model's layers or components across multiple GPUs. It becomes necessary when a model is too large to fit in a single device's VRAM. Tensor parallelism splits individual operations, such as a layer's weight matrices, across devices; pipeline parallelism assigns contiguous groups of layers to different devices.
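The two splitting strategies can be illustrated with a toy sketch. This is not how any particular framework implements them: the "devices" here are plain Python lists, the helper names (`matvec`, `tensor_parallel_matvec`, `pipeline_parallel_forward`) are hypothetical, and real systems add communication collectives, micro-batching, and nonlinearities that are omitted.

```python
# Toy illustration only: "devices" are plain Python lists, not real GPUs.

def matvec(weight, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]

def tensor_parallel_matvec(weight, x, num_devices):
    """Split ONE operation: each device holds a slice of the weight's rows."""
    rows_per_device = (len(weight) + num_devices - 1) // num_devices
    shards = [weight[i:i + rows_per_device]
              for i in range(0, len(weight), rows_per_device)]
    # Each device computes its part of the output independently.
    partial_outputs = [matvec(shard, x) for shard in shards]
    # Concatenating the parts stands in for an all-gather between devices.
    return [y for part in partial_outputs for y in part]

def pipeline_parallel_forward(layers, x, num_stages):
    """Split the LAYER STACK: each device holds a contiguous group of layers."""
    per_stage = (len(layers) + num_stages - 1) // num_stages
    stages = [layers[i:i + per_stage]
              for i in range(0, len(layers), per_stage)]
    for stage in stages:        # activations are handed from stage to stage
        for weight in stage:    # layers inside a stage run on one device
            x = matvec(weight, x)
    return x

# Four tiny linear layers (no activation functions, for simplicity).
layers = [
    [[1, 2], [3, 4]],
    [[0, 1], [1, 0]],
    [[2, 0], [0, 2]],
    [[1, 1], [1, -1]],
]
x = [1, 1]

first_layer_out = tensor_parallel_matvec(layers[0], x, num_devices=2)
full_out = pipeline_parallel_forward(layers, x, num_stages=2)
```

Both functions return the same result as running everything on one device; what changes is where the weights live. In the tensor-parallel case no device holds a full weight matrix, and in the pipeline case no device holds every layer.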

Enterprise Context

Required for training and serving frontier-scale models (70B+ parameters), whose weights alone can exceed the memory of a single GPU. Enterprise AI teams evaluating self-hosted LLM deployments should design for model parallelism from the start.
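A rough back-of-the-envelope check shows why the 70B scale forces the issue. The sketch below counts weights only; the function names, the 16-bit default, and the 90% usable-memory fraction are illustrative assumptions, and real deployments also need room for the KV cache, activations, and (for training) gradients and optimizer state.

```python
import math

def weight_memory_gb(num_params, bytes_per_param=2):
    """Memory for weights alone, in GB (10^9 bytes). 2 bytes = 16-bit weights."""
    return num_params * bytes_per_param / 1e9

def min_gpus_for_weights(num_params, gpu_memory_gb,
                         bytes_per_param=2, usable_fraction=0.9):
    """Lower bound on GPU count; usable_fraction is an assumed safety margin."""
    usable_gb = gpu_memory_gb * usable_fraction
    return math.ceil(weight_memory_gb(num_params, bytes_per_param) / usable_gb)

params_70b = 70_000_000_000
weights_gb = weight_memory_gb(params_70b)        # 140.0 GB at 16-bit
gpus_80gb = min_gpus_for_weights(params_70b, 80)  # at least 2 GPUs with 80 GB each
```

Treat the result as a floor for capacity planning, not a sizing recommendation: lower-precision weights reduce the figure, while long contexts and large batches raise it.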

Tags

#hardware #distributed #scale
Maxx Stacks Editorial
Reviewed by enterprise AI practitioners