
Human Evaluation

Evaluation · Intermediate

Definition

Direct assessment of AI model outputs by human judges, who rate them for quality, accuracy, coherence, helpfulness, or safety. Human evaluation remains the gold standard for measuring the qualities that automated metrics miss: nuance, creativity, factual accuracy, and real-world usefulness.
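
As a rough illustration of the rating process described above, the sketch below averages scores from several judges for each output and criterion. Everything in it is an assumption made for the example: the judge names, the output IDs, the 1-5 scale, and the helper name mean_scores are hypothetical and do not come from any particular evaluation tool.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical ratings: (judge, output_id, criterion, score on an assumed 1-5 scale).
RATINGS = [
    ("judge_a", "out_1", "helpfulness", 4),
    ("judge_b", "out_1", "helpfulness", 5),
    ("judge_a", "out_1", "coherence", 3),
    ("judge_b", "out_1", "coherence", 4),
    ("judge_a", "out_2", "helpfulness", 2),
    ("judge_b", "out_2", "helpfulness", 3),
]


def mean_scores(ratings):
    """Average the judges' scores for each (output_id, criterion) pair."""
    grouped = defaultdict(list)
    for _judge, output_id, criterion, score in ratings:
        grouped[(output_id, criterion)].append(score)
    return {key: mean(scores) for key, scores in grouped.items()}


if __name__ == "__main__":
    # Print one line per (output, criterion) pair, sorted for stable output.
    for (output_id, criterion), avg in sorted(mean_scores(RATINGS).items()):
        print(f"{output_id} {criterion}: {avg:.2f}")
```

A simple mean is only one way to combine judgments; real evaluation setups may weight judges differently or check how closely judges agree before trusting the average.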

Tags

#quality #judgment #safety #RLHF #annotation
Maxx Stacks Editorial
Reviewed by enterprise AI practitioners