pairwise(_:scale:judge:scoringMode:evaluationTarget:)
Creates a pairwise comparison evaluator that compares the model’s response against the sample’s expected value.
Declaration
static func pairwise(_ name: String, scale: ScoringScale, judge: any LanguageModel, scoringMode: ScoringMode = .discrete, evaluationTarget: (@Sendable (Input.ExpectedValue) -> String)? = nil) -> ModelJudgeEvaluator<Input>
Parameters
- name:
The metric name that corresponds to the DataFrame column.
- scale:
Scoring scale for the comparison.
- judge:
The language model to use as judge.
- scoringMode:
A value that indicates whether scores are discrete (default) or allow any floating-point value.
- evaluationTarget:
An optional closure that converts the sample’s expected value to a string. Both responses use this target.
Discussion
The judge sees the model’s output under “Response” and the expected value from input.expected under “Baseline Response” in the Context section.
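The following is a minimal sketch of an evaluationTarget closure, assuming a hypothetical expected-value type named ExpectedAnswer that stands in for Input.ExpectedValue. It only illustrates the closure shape from the declaration; the commented call site uses the placeholder names MyInput, myScale, and myJudge, which are not part of the documented API.

```swift
// Hypothetical expected-value type (an assumption for illustration);
// in real code this is whatever Input.ExpectedValue resolves to.
struct ExpectedAnswer: Sendable {
    let summary: String
    let citations: [String]
}

// A closure with the documented shape: @Sendable (Input.ExpectedValue) -> String.
let baselineText: @Sendable (ExpectedAnswer) -> String = { expected in
    // Return only the summary; the citations are left out of the string.
    expected.summary
}

let sample = ExpectedAnswer(
    summary: "Paris is the capital of France.",
    citations: ["atlas"]
)
print(baselineText(sample)) // prints "Paris is the capital of France."

// Sketch of a call site that follows the declaration. `MyInput`, `myScale`,
// and `myJudge` are placeholders for your own input type, a ScoringScale,
// and a LanguageModel; scoringMode keeps its default of .discrete.
//
// let evaluator = ModelJudgeEvaluator<MyInput>.pairwise(
//     "preference",
//     scale: myScale,
//     judge: myJudge,
//     evaluationTarget: baselineText
// )
```

Passing nil (the default) skips the custom conversion; a closure like this is mainly useful when the expected value is a structured type and only part of it should be turned into text.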