---
title: Validating inference correctness against a reference run
framework: coreai
role: article
role_heading: Article
path: coreai/validating-inference-correctness-against-a-reference-run
---

# Validating inference correctness against a reference run

Measure numerical divergence in a Core AI model against a reference run.

## Overview

Quantization and model specialization can introduce numerical drift between a Core AI model and the original source model. Core AI Debugger pairs each operation in your Core AI asset with its counterpart in a reference run, then automatically measures similarity for every matched pair.

### Prepare a reference run

An `.aimodelintermediates` file records the intermediate tensor values produced at each operation of a PyTorch reference run. To generate the file, use the `save_intermediates` API, passing both the model you want to validate and the original source model. The result is a per-operation mapping between the PyTorch run and the Core AI model that Core AI Debugger can use to compare inference results.

### Start a comparison session

To compare your Core AI model against an `.aimodelintermediates` file:

1. Open your `.aimodel` file in Core AI Debugger.
2. In the toolbar, click the Comparison button to start a comparison session.
3. Under Configuration A, set the Target, Function, Compute Unit, and Graph Visualization, and specify your model inputs.
4. Under Configuration B, click the Target menu and select Intermediates File under Load Reference Run.
5. Click the folder icon and select your `.aimodelintermediates` file.
6. Click Compare.

> Note: You can return to single-session mode at any time by clicking the Comparison button.

### Read comparison results in the Navigator

When a comparison session starts, the Navigator populates with sync points, which are operation pairs that combine a Core AI operation with its PyTorch counterpart. Each sync point shows both operation names alongside a similarity score and a color-coded indicator dot:

- Green: close match
- Yellow: moderate divergence
- Red: large error

Sort by Similarity to identify the most divergent pairs, or by Operation to see whether failures cluster in a specific part of the model. Click any sync point to see that operation in the Structure Viewer, Source Viewer, and Inspector.

### Review comparison metrics

Core AI Debugger reports five metrics for each sync point. Color indicators are metric-aware, so green always signals a good result regardless of which metric you choose. The default metric is PSNR. The other metrics offer different lenses depending on what kind of divergence you want to surface.

### Investigate a divergent operation

Select a sync point with a low similarity score to begin investigating. In the Inspector, the tensor outputs from both runs are displayed side by side alongside a visual difference, letting you see directly where the values diverge.
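For intuition about what a similarity score summarizes when two tensor outputs diverge, the following is a minimal sketch of peak signal-to-noise ratio (PSNR) over two flattened tensors. It is not Core AI Debugger's implementation: the function name `psnr` is hypothetical, and taking the peak from the largest absolute value in the reference output is an assumption, since the exact formula the tool uses isn't stated here.

```python
import math


def psnr(reference, candidate):
    """Return PSNR in decibels between two equal-length sequences of floats.

    Higher values mean the candidate is closer to the reference.
    """
    if len(reference) != len(candidate) or not reference:
        raise ValueError("inputs must be non-empty and the same length")

    # Mean squared error between the two outputs.
    mse = sum((r - c) ** 2 for r, c in zip(reference, candidate)) / len(reference)
    if mse == 0:
        return math.inf  # identical outputs

    # Assumption: the peak signal is the largest magnitude in the reference.
    peak = max(abs(r) for r in reference)
    if peak == 0:
        return -math.inf  # all-zero reference with a nonzero error

    return 10.0 * math.log10(peak ** 2 / mse)
```

Because PSNR is logarithmic, a small uniform error against a large peak still scores well, while the same absolute error against a small-magnitude tensor scores poorly. This is one reason a single metric may not surface every kind of divergence.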

Use the Source Viewer to trace the operation back to its origin in the PyTorch code. The module hierarchy at the top of the Source Viewer tells you which PyTorch module the operation belongs to. If low-similarity sync points cluster in the same module, the divergence is localized there, giving you a precise target for changes to your model. If only specific operations diverge, use the Source Viewer to understand their implementation and identify what may be causing the discrepancy.
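The clustering check described above can also be reasoned about outside the tool. The sketch below assumes you have noted sync points by hand as `(module_path, operation_name, score)` tuples. That tuple layout, the module names, the scores, and the threshold are all hypothetical and are not an export format of Core AI Debugger. It also assumes a metric where higher is better, such as PSNR.

```python
from collections import defaultdict


def divergent_fraction_by_module(sync_points, threshold):
    """Map each module path to the fraction of its sync points scoring below threshold."""
    totals = defaultdict(int)
    low = defaultdict(int)
    for module, _operation, score in sync_points:
        totals[module] += 1
        if score < threshold:  # assumption: a lower score means more divergence
            low[module] += 1
    return {module: low[module] / totals[module] for module in totals}


# Hypothetical, hand-recorded values for illustration only.
points = [
    ("encoder.layers.0.attn", "matmul", 18.2),
    ("encoder.layers.0.attn", "softmax", 21.5),
    ("encoder.layers.1.mlp", "linear", 62.0),
    ("encoder.layers.1.mlp", "gelu", 58.3),
]
fractions = divergent_fraction_by_module(points, threshold=30.0)
```

A module whose fraction is close to 1.0 while the others sit near 0.0 suggests localized divergence. Fractions spread evenly across modules point instead toward individual operations worth inspecting one at a time.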

## See Also

### Model inspection and validation

- [Inspecting Core AI models with Core AI Debugger](coreai/inspecting-core-ai-models-with-core-ai-debugger.md)
