This recipe connects the most common “day 2” Portal workflows:
  • find failures with Evaluations
  • inspect a Request
  • refine a Sentinel
  • promote the case into a Test Set
  • verify changes with a Test Run

1) Find a real failure

In the Portal, go to Test > Evaluations, then:
  • Add a filter for Status = FAULT
  • Optionally add Intent Group (to narrow to one area)
  • Expand a row to see which sentinel(s) faulted
  • Open the request via the Request action
See: Evaluation Results
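If you export evaluation results and want to reproduce this filter outside the Portal, the logic is straightforward. A minimal sketch, assuming a hypothetical list of evaluation records — the field names here are illustrative, not the Portal's actual export schema:

```python
# Hypothetical evaluation records; field names are illustrative only,
# not the Portal's actual schema.
evaluations = [
    {"request_id": "req-1", "status": "PASS",  "intent_group": "billing", "faulted_sentinels": []},
    {"request_id": "req-2", "status": "FAULT", "intent_group": "billing", "faulted_sentinels": ["no-refund-promises"]},
    {"request_id": "req-3", "status": "FAULT", "intent_group": "support", "faulted_sentinels": ["tone"]},
]

def find_failures(records, intent_group=None):
    """Mirror the Portal filter: Status = FAULT, optionally narrowed to one Intent Group."""
    return [
        r for r in records
        if r["status"] == "FAULT"
        and (intent_group is None or r["intent_group"] == intent_group)
    ]

for r in find_failures(evaluations, intent_group="billing"):
    print(r["request_id"], r["faulted_sentinels"])
```

The expanded-row view in the Portal corresponds to reading `faulted_sentinels` on each matching record.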

2) Understand the failure on the Request page

On the Request page:
  • Review the request messages and response
  • Scroll to Evaluation to see the per-sentinel results for this request
  • If a failure includes a Suggested Correction, treat it as debugging guidance (not an automatic change)
See: Evaluating a Request

3) Promote the request into a Test Set

On the Request page, click Add to Test Set:
  • Choose an existing Test Set, or create one inline
  • Add tags (this flow enforces a max of 5 tags)
  • Submit
See: Test Set Creation
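The 5-tag limit is easy to hit if you ever script bulk promotion. A minimal sketch of the same client-side validation, with hypothetical helper names — this is not a Portal API, just the rule the flow enforces:

```python
MAX_TAGS = 5  # the Add to Test Set flow enforces a max of 5 tags

def validate_tags(tags):
    """Drop duplicates and reject excess tags before submitting,
    mirroring the Portal's limit."""
    deduped = list(dict.fromkeys(tags))  # preserves order, removes duplicates
    if len(deduped) > MAX_TAGS:
        raise ValueError(f"Too many tags: {len(deduped)} > {MAX_TAGS}")
    return deduped

print(validate_tags(["billing", "refund", "billing"]))  # → ['billing', 'refund']
```

De-duplicating before counting means a repeated tag does not spuriously trip the limit.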

4) Correct the expected response (when production output was wrong)

If the original production response is wrong, update the Test Set’s expected response:
  1. Open the Test Set.
  2. Go to Requests.
  3. Use the request Edit action to update the final assistant message (and tool calls if applicable).
This makes the Test Set reflect “what should have happened”, which is what you want for regression testing.
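In data terms, "update the final assistant message" amounts to replacing the last assistant turn in the stored conversation. A minimal sketch over a hypothetical message list — the structure is illustrative, not the Test Set's actual storage format:

```python
def correct_expected_response(messages, corrected_text):
    """Replace the last assistant message so the Test Set reflects
    what *should* have happened, not what production actually returned."""
    for i in range(len(messages) - 1, -1, -1):
        if messages[i]["role"] == "assistant":
            updated = list(messages)  # leave the original record untouched
            updated[i] = {**messages[i], "content": corrected_text}
            return updated
    raise ValueError("No assistant message to correct")

conversation = [
    {"role": "user", "content": "Can I get a refund?"},
    {"role": "assistant", "content": "Yes, guaranteed."},  # wrong production output
]
fixed = correct_expected_response(conversation, "Refunds depend on your plan; let me check.")
```

Note that earlier turns are left alone: the point of the correction is to change only the expected final answer.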

5) Verify with a Test Run (baseline vs candidate)

From the Test Set:
  1. Go to Test Runs
  2. Click New Test Run
  3. Configure the candidate model/config and start the run
  4. When complete, use Compare Runs and per-request comparison to see what changed
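Compare Runs is a Portal feature, but the per-request comparison it performs is simple to sketch. Assuming hypothetical per-request result maps keyed by request ID (illustrative, not the Portal's data model):

```python
def diff_runs(baseline, candidate):
    """Per-request comparison between a baseline and a candidate Test Run.
    Each argument maps request ID -> 'PASS' or 'FAULT'."""
    regressions = [rid for rid in baseline
                   if baseline[rid] == "PASS" and candidate.get(rid) == "FAULT"]
    fixes = [rid for rid in baseline
             if baseline[rid] == "FAULT" and candidate.get(rid) == "PASS"]
    return {"regressions": regressions, "fixes": fixes}

baseline = {"req-1": "PASS", "req-2": "FAULT", "req-3": "PASS"}
candidate = {"req-1": "PASS", "req-2": "PASS", "req-3": "FAULT"}
print(diff_runs(baseline, candidate))  # → {'regressions': ['req-3'], 'fixes': ['req-2']}
```

The two buckets are what you care about when reading a comparison: requests your candidate broke, and requests it repaired.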

6) Fix the root cause with a Sentinel (when applicable)

If the failure is something you want to enforce at the Intent Group level:
  1. Open the Intent Group.
  2. Go to Sentinels.
  3. Edit the sentinel (or create a new one) and iterate until it matches your expectation.

7) Close the loop

After updating Sentinels and/or configuration:
  • Re-check Test > Evaluations (filter by sentinel + FAULT) to confirm the fix is behaving as expected on real traffic.
  • Re-run the relevant Test Set to ensure you didn’t introduce regressions.
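If you automate this loop, the two re-checks above collapse into a single gate. A minimal sketch, assuming hypothetical inputs (a list of live faults for the targeted sentinel, and a run diff like the one from Compare Runs) — names are illustrative:

```python
def fix_is_verified(live_faults_for_sentinel, run_diff):
    """Both conditions from step 7: the targeted sentinel has stopped
    faulting on real traffic, and the Test Run introduced no regressions."""
    return len(live_faults_for_sentinel) == 0 and len(run_diff["regressions"]) == 0

# The fix only counts as done when both checks pass.
print(fix_is_verified([], {"regressions": [], "fixes": ["req-2"]}))  # → True
```

A fix that clears live traffic but introduces Test Run regressions (or vice versa) fails the gate and sends you back to step 6.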