A Test Run executes every request in a Test Set using a specific configuration, then aggregates results so you can measure quality and compare runs over time.

Create a Test Run

  1. Open a Test Set
  2. Go to the Test Runs tab
  3. Click New Test Run
The Portal walks you through a short wizard:
  • 1) Description
    • Provide a human-readable description for the run (required).
    • This is what you’ll use later to remember why you ran it (e.g. “new system prompt v3”, “o3 reasoning high”, “temp=0.2”).
  • 2) Configuration
    • Region: choose the backend endpoint/region used to execute the run.
    • Model & settings: configure the run using the same configuration panel you use elsewhere in the Portal (model selection and other runtime settings).
    • The starting configuration is pre-populated from the Intent Group’s current config.
  • 3) Review
    • Confirm the description and configuration, then create the run.
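Conceptually, the wizard collects just two things: a required description and a run configuration, then asks you to review them. A minimal sketch of that payload and its review-time validation (all class and field names here are hypothetical; the Portal does not expose this as a public API):

```python
from dataclasses import dataclass, field

@dataclass
class RunConfig:
    # Step 2: execution settings; pre-populated from the Intent Group's config
    region: str
    model: str
    settings: dict = field(default_factory=dict)

@dataclass
class TestRunRequest:
    # Step 1: a human-readable description is required
    description: str
    config: RunConfig

    def validate(self) -> list[str]:
        """Step 3 (Review): collect any problems before creating the run."""
        errors = []
        if not self.description.strip():
            errors.append("description is required")
        if not self.config.region:
            errors.append("a region must be selected")
        return errors

req = TestRunRequest(
    description="new system prompt v3",
    config=RunConfig(region="eastus", model="gpt-4o", settings={"temperature": 0.2}),
)
req.validate()  # an empty list means the run is ready to create
```

The point of the sketch is the shape of the data, not the exact fields: whatever you set in the configuration panel travels with the run, which is what makes runs comparable later.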

What happens after creation

After you create the run:
  • It appears at the top of the Test Runs table with a status.
  • While the run is PENDING or RUNNING, the Portal disables opening the run's details page.
  • Once it’s COMPLETED (or ERROR), click the row to open the detailed results page.
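The gating above amounts to a tiny state check: details are only clickable once the run reaches a terminal status. Sketched in Python (status names are from the table above; the helper function is hypothetical):

```python
from enum import Enum

class RunStatus(Enum):
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    COMPLETED = "COMPLETED"
    ERROR = "ERROR"

# COMPLETED and ERROR are terminal; PENDING and RUNNING are in-flight
TERMINAL = {RunStatus.COMPLETED, RunStatus.ERROR}

def can_open_details(status: RunStatus) -> bool:
    """True once the run has finished (successfully or not)."""
    return status in TERMINAL

assert not can_open_details(RunStatus.PENDING)
assert not can_open_details(RunStatus.RUNNING)
assert can_open_details(RunStatus.COMPLETED)
assert can_open_details(RunStatus.ERROR)
```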

Comparing runs

From the Test Runs tab on a Test Set:
  • Click Compare Runs
  • Select 2 or more completed runs
  • Click Compare to open a side-by-side view across runs (scores per request + response comparisons)
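The compare view lines up one row per request with each selected run's score beside it. A rough sketch of that pivot, using invented run names and scores purely for illustration:

```python
# Hypothetical per-request scores for two completed runs, keyed by request id
runs = {
    "run-a": {"req-1": 0.8, "req-2": 0.5},
    "run-b": {"req-1": 0.9, "req-2": 0.4},
}

def side_by_side(runs: dict[str, dict[str, float]]) -> list[tuple]:
    """One row per request, with each run's score in a stable column order."""
    request_ids = sorted({rid for scores in runs.values() for rid in scores})
    run_names = sorted(runs)
    # .get() leaves a gap (None) if a run is missing a score for a request
    return [(rid, *(runs[name].get(rid) for name in run_names)) for rid in request_ids]

for row in side_by_side(runs):
    print(row)
# → ('req-1', 0.8, 0.9)
# → ('req-2', 0.5, 0.4)
```

Reading across a row shows how a single request fared under each configuration, which is why only completed runs (with scores to show) can be selected.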

Deleting a run

You can delete a Test Run from the Test Runs table (via the row actions menu). This is useful for cleaning up runs created with incorrect configs or experiments you no longer need.