Execute Test Runs

Test Runs allow you to validate model configurations against your Test Sets to track improvements at the Intent level. While Maitai automatically executes Test Runs whenever a new model is fine-tuned, you can also manually create Test Runs to evaluate different models or configuration changes.

Creating a Test Run

Navigate to your Test Set Overview page
Click New Test Run in the upper right corner of the Test Runs table

Configure your Test Run:
- Add a descriptive name to easily identify this run later
- Modify the configuration that will be used for all requests in this Test Set
- The default configuration matches what’s currently set for the Intent

Review your settings and click Create Test Run
You’ll be redirected back to the Test Set Overview, where your new Test Run will appear at the top of the table

Test Runs may take a few minutes to complete. While the status is PENDING, you won’t be able to access the Test Run details.
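If you track run status from a script rather than the UI, the waiting step can be sketched as a simple polling loop. This page does not describe a programmatic status endpoint, so `get_status` below is a hypothetical callable you would supply yourself, and `COMPLETED` in the usage lines is an assumed status name; only `PENDING` comes from this guide.

```python
import time
from typing import Callable


def wait_for_test_run(
    get_status: Callable[[], str],
    poll_seconds: float = 10.0,
    timeout_seconds: float = 600.0,
) -> str:
    """Poll until the status is no longer PENDING, or raise on timeout.

    get_status is a hypothetical callable (not a documented Maitai API)
    that returns the current status string of a Test Run.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status != "PENDING":
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("Test Run still PENDING after timeout")
        time.sleep(poll_seconds)


# Usage with a stubbed status source ("COMPLETED" is an assumed name).
statuses = iter(["PENDING", "PENDING", "COMPLETED"])
final = wait_for_test_run(lambda: next(statuses), poll_seconds=0)
print(final)
```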

Understanding Test Run Results

Once your Test Run completes, you can analyze the results in detail:

Overview

The Test Set Overview page displays the Pass Rate for each Test Run: the percentage of requests that scored “satisfactory” or higher (at least 3 out of 5).
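As a rough illustration of how that number is derived, here is a minimal sketch that assumes each request receives an integer score from 1 to 5. The function name and the empty-input behavior are illustrative choices, not Maitai's implementation.

```python
def pass_rate(scores: list[int], threshold: int = 3) -> float:
    """Return the percentage of scores at or above the threshold.

    Assumes integer scores on a 1-5 scale; a score of 3 or more
    counts as a pass, as described above.
    """
    if not scores:
        return 0.0  # illustrative choice for an empty Test Set
    passed = sum(1 for score in scores if score >= threshold)
    return 100.0 * passed / len(scores)


print(pass_rate([5, 4, 3, 2, 1]))  # 60.0
```

Three of the five example requests score 3 or higher, so the Pass Rate is 60%.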

Detailed Analysis

Click on any completed Test Run to see:

Details Section
- Configuration used for the Test Run
- Description and execution timestamps
- Overall performance metrics

Results Section
- Macro-level metrics across all requests
- Score distribution visualization
- Hover over distribution segments for detailed breakdowns
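To reason about a score distribution outside the UI, the following sketch tallies how many requests landed on each score. It assumes integer scores from 1 to 5 and is not the code behind the visualization.

```python
from collections import Counter


def score_distribution(scores: list[int]) -> dict[int, tuple[int, float]]:
    """Map each score from 1 to 5 to (count, percentage of all requests)."""
    counts = Counter(scores)
    total = len(scores)
    distribution = {}
    for score in range(1, 6):
        count = counts.get(score, 0)
        share = 100.0 * count / total if total else 0.0
        distribution[score] = (count, share)
    return distribution


for score, (count, share) in score_distribution([5, 5, 3, 1]).items():
    print(f"score {score}: {count} requests ({share:.0f}%)")
```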

Requests Table
- Individual scores for each request
- Hover over scores to see evaluation criteria
- Compare original vs Test Run responses side-by-side

Use the “Compare” feature to understand exactly how configuration changes affected specific responses.
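If you export responses and want a similar side-by-side view offline, a line-level diff is one option. The sketch below uses Python's standard `difflib` module; it is an approximation for local use and says nothing about how the “Compare” feature works internally.

```python
import difflib


def diff_responses(original: str, test_run: str) -> str:
    """Return a unified diff between the original and Test Run responses."""
    lines = difflib.unified_diff(
        original.splitlines(),
        test_run.splitlines(),
        fromfile="original",
        tofile="test_run",
        lineterm="",
    )
    return "\n".join(lines)


print(diff_responses("Hello\nworld", "Hello\nthere"))
```

Lines prefixed with `-` appear only in the original response, and lines prefixed with `+` appear only in the Test Run response.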

Automated Test Runs

Maitai automatically executes Test Runs in these scenarios:

- When a new model is fine-tuned for your Intent
- After significant model updates
- During A/B testing of configurations

This ensures continuous monitoring of your model’s performance and helps identify any regressions quickly.
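One way to spot a regression is to compare Pass Rates between consecutive Test Runs. The sketch below assumes you have collected each run's name and Pass Rate yourself, ordered oldest to newest; the function, the data shape, and the run names are hypothetical, not part of Maitai.

```python
def find_regressions(
    runs: list[tuple[str, float]],
    tolerance: float = 0.0,
) -> list[tuple[str, str, float]]:
    """Return (previous_run, current_run, drop) for each Pass Rate decrease.

    runs is a list of (name, pass_rate) pairs ordered oldest to newest.
    A drop is reported only if it exceeds tolerance (in percentage points).
    """
    regressions = []
    for (prev_name, prev_rate), (name, rate) in zip(runs, runs[1:]):
        drop = prev_rate - rate
        if drop > tolerance:
            regressions.append((prev_name, name, drop))
    return regressions


history = [("baseline", 80.0), ("ft-v1", 85.0), ("ft-v2", 70.0)]
print(find_regressions(history))  # [('ft-v1', 'ft-v2', 15.0)]
```

A non-zero `tolerance` helps ignore small fluctuations between runs.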

Want to learn more about creating Test Sets? Check out our Test Sets guide.
