BLEU Score
Bilingual Evaluation Understudy
Evaluation · Intermediate
Definition
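A metric for evaluating machine-generated text by measuring n-gram overlap between model output and one or more human reference texts. BLEU combines clipped n-gram precision (typically for n = 1 to 4) with a brevity penalty that discourages overly short output. Scores range from 0 to 1 and are often reported on a 0 to 100 scale; higher means closer to the references. Because BLEU rewards surface word overlap rather than meaning, a valid paraphrase can score poorly. Even so, it remains widely used for translation and summarization evaluation.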
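To make the n-gram overlap idea concrete, here is a minimal sketch of a sentence-level BLEU computation. It assumes whitespace tokenization, a single reference, uniform weights over n-gram orders, and no smoothing. The function names `ngram_counts` and `bleu` are illustrative, not from any library.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Unsmoothed sentence-level BLEU against one reference (illustrative sketch)."""
    cand = candidate.split()  # assumption: plain whitespace tokenization
    ref = reference.split()
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngram_counts(cand, n)
        ref_counts = ngram_counts(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count at its count in the reference,
        # so repeating a matching word cannot inflate precision.
        clipped = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0  # without smoothing, one empty order zeroes the score
        log_precisions.append(math.log(clipped / total))

    # Brevity penalty: 1 when the candidate is longer than the reference,
    # otherwise exp(1 - r/c), which is below 1 for shorter candidates.
    if len(cand) > len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))

    # Geometric mean of the n-gram precisions, scaled by the brevity penalty.
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


print(bleu("the cat sat on the mat", "the cat sat on the mat"))
print(bleu("the cat sat on the mat", "the cat is on the mat", max_n=2))
print(bleu("the cat sat on the mat", "the cat is on the mat"))
```

The last call returns 0 because the two sentences share no 4-gram, even though they differ by a single word. This is why practical evaluations usually compute BLEU over a whole corpus or apply smoothing, typically through an established implementation such as sacreBLEU or NLTK's `sentence_bleu` rather than hand-rolled code.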
Tags
#translation #NLP #metrics #n-gram #quality
Maxx Stacks Editorial
Reviewed by enterprise AI practitioners