G-Eval
Evaluation · Advanced
Definition
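A framework for evaluating natural language generation (NLG) in which an LLM acts as the evaluator, guided by chain-of-thought reasoning. G-Eval states the task and explicit evaluation criteria in the prompt, has the model generate step-by-step evaluation instructions, and then asks it to rate the output on a fixed scale. Instead of taking the single rating the model emits, it weights each candidate score by the probability the model assigns to it, which yields a fine-grained continuous score. The original paper reports stronger correlation with human judgments than earlier automatic metrics.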
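The probability-weighted scoring step can be illustrated with a minimal Python sketch. It assumes the evaluator model exposes log-probabilities for each candidate rating token; the function name `weighted_score` and the input dictionary are hypothetical, and renormalizing over the candidate ratings is an assumption made here because the probability mass on rating tokens rarely sums to exactly 1.

```python
import math


def weighted_score(score_logprobs: dict[int, float]) -> float:
    """Return the expected rating given log-probabilities per candidate rating.

    score_logprobs maps each candidate rating (e.g. 1..5) to the
    log-probability the evaluator model assigned to that rating token.
    This is an illustrative helper, not part of any library.
    """
    if not score_logprobs:
        raise ValueError("at least one candidate rating is required")

    # Convert log-probabilities back to probabilities.
    probs = {score: math.exp(lp) for score, lp in score_logprobs.items()}

    # Assumption: renormalize so the candidate ratings sum to 1.
    total = sum(probs.values())

    # Expected value of the rating under the normalized distribution.
    return sum(score * p for score, p in probs.items()) / total


# Hypothetical log-probabilities for ratings 3, 4 and 5 on a 1-5 scale.
example = {3: math.log(0.1), 4: math.log(0.2), 5: math.log(0.7)}
print(round(weighted_score(example), 2))
```

Two outputs that would both receive an integer rating of 5 can end up with different weighted scores, which is what makes the resulting metric finer-grained than the raw rating.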
Enterprise Context
Used in enterprise LLM quality assurance pipelines to evaluate summarization, translation, and generation quality in a way that better reflects human preferences.
Tags
#evaluation #NLG #framework
Maxx Stacks Editorial
Reviewed by enterprise AI practitioners