When this is useful
Evaluating a single request is great for:

- Debugging a specific failure mode
- Verifying whether a proposed Sentinel would have flagged a request
- Comparing evaluation outcomes between the original request and a rerun
What you’ll see on the Request page
The Evaluation section shows one or more evaluation blocks. Each block includes:

- The set of selected sentinels
- A request-level summary (pass count / total)
- Per-sentinel results (status, description, eval time)
Blocks are labeled by how they were created:

- Real-time: evaluations associated with the original request
- Batch Eval: results that came from an Evaluation Run (the block links back to the run)
- Manual: one-off evaluations you run from the Request page
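The block contents described above can be sketched as a small data model. This is purely illustrative: the names (`EvaluationBlock`, `SentinelResult`, `summarize`) are hypothetical and not part of the product's actual API.

```typescript
// Hypothetical shapes for an evaluation block; names are illustrative only.
type BlockSource = "real-time" | "batch" | "manual";

interface SentinelResult {
  sentinel: string;              // name of a selected sentinel
  status: "pass" | "fail";       // per-sentinel status
  description: string;           // why the sentinel passed or failed
  evalTimeMs: number;            // evaluation time
}

interface EvaluationBlock {
  source: BlockSource;           // how the block was created
  runUrl?: string;               // Batch Eval blocks link back to their Evaluation Run
  results: SentinelResult[];     // per-sentinel results
}

// Request-level summary shown on the block: "pass count / total".
function summarize(block: EvaluationBlock): string {
  const passed = block.results.filter((r) => r.status === "pass").length;
  return `${passed} / ${block.results.length}`;
}

const block: EvaluationBlock = {
  source: "manual",
  results: [
    { sentinel: "no-pii", status: "pass", description: "No PII found", evalTimeMs: 120 },
    { sentinel: "tone", status: "fail", description: "Response too curt", evalTimeMs: 95 },
  ],
};

console.log(summarize(block)); // "1 / 2"
```

The optional `runUrl` field reflects that only Batch Eval blocks carry a link back to a run.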
Run a manual (one-off) evaluation
- Open a request and scroll to the Evaluation section.
- Click Add Evaluation to create a new manual evaluation block.
- Select one or more sentinels to run.
- Click Run Evaluation.
You can run an evaluation only after the request has a final assistant message to evaluate. If you want to try a different sentinel set after running, add a new evaluation block.
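The rule above (an evaluation needs a final assistant message) can be sketched as a simple guard. The `Message` shape and `canEvaluate` helper are hypothetical, shown only to make the precondition concrete.

```typescript
// Hypothetical message shape; illustrative only.
interface Message {
  role: "user" | "assistant";
  content: string;
}

// A request is evaluable only once its last message is an assistant reply.
function canEvaluate(messages: Message[]): boolean {
  const last = messages[messages.length - 1];
  return last !== undefined && last.role === "assistant";
}

console.log(canEvaluate([{ role: "user", content: "Hi" }])); // false
console.log(canEvaluate([
  { role: "user", content: "Hi" },
  { role: "assistant", content: "Hello!" },
])); // true
```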