Human Evaluation
Evaluation · Intermediate
Definition
Direct assessment of AI model outputs by human judges, who rate quality, accuracy, coherence, helpfulness, or safety. Human evaluation is widely regarded as the gold standard for measuring qualities that automated metrics tend to miss: nuance, creativity, factual accuracy, and real-world usefulness.
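As a rough illustration of how judge ratings might be combined, the sketch below averages scores per criterion and reports the spread between judges. The 1-5 scale, the judge labels, the sample scores, and the function names `aggregate` and `spread` are all assumptions made for this example, not part of any standard protocol.

```python
from statistics import mean

# Hypothetical ratings: three judges score one model output on a 1-5 scale.
# The judges, criteria subset, and scores are illustrative only.
ratings = [
    {"judge": "A", "accuracy": 4, "coherence": 5, "helpfulness": 4},
    {"judge": "B", "accuracy": 3, "coherence": 4, "helpfulness": 4},
    {"judge": "C", "accuracy": 4, "coherence": 4, "helpfulness": 5},
]

CRITERIA = ["accuracy", "coherence", "helpfulness"]


def aggregate(ratings, criteria):
    """Return the mean score per criterion across all judges."""
    return {c: mean(r[c] for r in ratings) for c in criteria}


def spread(ratings, criteria):
    """Return max minus min per criterion, a crude signal of disagreement."""
    return {
        c: max(r[c] for r in ratings) - min(r[c] for r in ratings)
        for c in criteria
    }


scores = aggregate(ratings, CRITERIA)
gaps = spread(ratings, CRITERIA)

for c in CRITERIA:
    print(f"{c}: mean={scores[c]:.2f}, spread={gaps[c]}")
```

A large spread on a criterion suggests the judges interpret it differently, which is usually a cue to tighten the rating guidelines before trusting the average.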
Tags
#quality #judgment #safety #RLHF #annotation
Maxx Stacks Editorial
Reviewed by enterprise AI practitioners