ModelJudgeEvaluator

An evaluator that uses a language model as a judge to score responses.

Declaration

struct ModelJudgeEvaluator<Input> where Input : ModelSampleProtocol

Mentioned in

Scoring with model-as-judge evaluators
Designing effective model-as-judge evaluators
Designing specific, measurable criteria in an evaluation suite
Evaluating language model responses

Overview

ModelJudgeEvaluator sends the query, the response, and optional reference data to a judge model, which returns scores for one or more dimensions. Because OutputType is Codable, the response is serialized as JSON automatically; you can customize this with ModelJudgePrompt.
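The flow described above can be pictured with a minimal, self-contained sketch. It does not use this framework's API: `Answer`, `JudgeRequest`, `makeJudgeRequest`, and `stubJudge` are hypothetical names introduced only for illustration, and the judge is a placeholder function instead of a language model. It shows only the general idea of serializing a Codable response as JSON and collecting one score per dimension.

```swift
import Foundation

// Hypothetical stand-in for a sample's output type; not part of the framework.
struct Answer: Codable {
    let text: String
    let citations: [String]
}

// Hypothetical container for what a judge receives:
// the query, the JSON-serialized response, and optional reference data.
struct JudgeRequest {
    let query: String
    let responseJSON: String
    let reference: String?
}

// Serializes any Codable response to JSON text.
// Sorted keys keep the serialized form stable between runs.
func makeJudgeRequest<Output: Codable>(
    query: String,
    response: Output,
    reference: String? = nil
) throws -> JudgeRequest {
    let encoder = JSONEncoder()
    encoder.outputFormatting = [.sortedKeys]
    let data = try encoder.encode(response)
    let json = String(decoding: data, as: UTF8.self)
    return JudgeRequest(query: query, responseJSON: json, reference: reference)
}

// Placeholder judge (assumption): a real judge would be a language model call.
// It returns one integer score per requested dimension.
func stubJudge(_ request: JudgeRequest, dimensions: [String]) -> [String: Int] {
    var scores: [String: Int] = [:]
    for dimension in dimensions {
        // Arbitrary placeholder rule, chosen only so the sketch runs.
        scores[dimension] = request.reference == nil ? 3 : 4
    }
    return scores
}

let request = try makeJudgeRequest(
    query: "What is the capital of France?",
    response: Answer(text: "Paris", citations: []),
    reference: "Paris"
)
let scores = stubJudge(request, dimensions: ["accuracy", "clarity"])

print(request.responseJSON)
print(scores)
```

In this sketch the single-dimension and multi-dimension cases differ only in the length of the `dimensions` array; how the actual evaluator distinguishes them is described under Topics.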

Topics

Creating a single-dimension evaluator

Creating a multi-dimension evaluator

Creating a pairwise evaluator

Configuring the judge prompt

Inspecting the evaluator

Errors

Relationships

Conforms To

See Also

Model-as-judge evaluations