pairwise(_:scale:judge:scoringMode:evaluationTarget:)

Creates a pairwise comparison evaluator that compares the model’s response against the sample’s expected value.

Declaration

static func pairwise(_ name: String, scale: ScoringScale, judge: any LanguageModel, scoringMode: ScoringMode = .discrete, evaluationTarget: (@Sendable (Input.ExpectedValue) -> String)? = nil) -> ModelJudgeEvaluator<Input>

Parameters

name:
The metric name that corresponds to the DataFrame column.
scale:
The scoring scale for the comparison.
judge:
The language model to use as judge.
scoringMode:
A value that indicates whether scores are discrete (default) or allow any floating-point value.
evaluationTarget:
An optional closure that converts a value of type Input.ExpectedValue to a string. The evaluator applies this closure to both the model’s response and the baseline response.

Mentioned in

Scoring with model-as-judge evaluators

Discussion

The judge sees the model’s output under “Response” and the expected value from input.expected under “Baseline Response” in the Context section.
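
The following is a minimal sketch of how an evaluationTarget-style closure shapes what the judge compares. It does not call the framework. The Answer type, the previewJudgeContext function, and the sample values are hypothetical names introduced only for illustration. Only the “Response” and “Baseline Response” labels come from the description above, and the real prompt layout may differ.

```swift
// Hypothetical expected-value type; stands in for Input.ExpectedValue.
struct Answer: Sendable {
    let text: String
    let sources: [String]
}

// Same shape as the non-optional form of the evaluationTarget parameter:
// it picks the part of the value that the judge should compare.
let evaluationTarget: @Sendable (Answer) -> String = { answer in
    answer.text
}

// Illustration only (assumption): the framework builds the real judge
// prompt internally. This helper just applies one closure to both values
// and labels them the way the Discussion describes.
func previewJudgeContext(
    response: Answer,
    baseline: Answer,
    target: @Sendable (Answer) -> String
) -> String {
    """
    Response: \(target(response))
    Baseline Response: \(target(baseline))
    """
}

let preview = previewJudgeContext(
    response: Answer(text: "Paris", sources: ["atlas"]),
    baseline: Answer(text: "Paris, France", sources: []),
    target: evaluationTarget
)
print(preview)
```

Because the same closure renders both values, any field it leaves out (here, sources) is invisible to the judge on both sides of the comparison.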

See Also

Creating a pairwise evaluator

pairwise(judge:dimensions:scoringMode:evaluationTarget:)