Technical risk

AI agent evaluation areas

Agent systems need repeatable evaluation across capability, safety, refusal behavior, and tool use.

Focus 1

Capability

Measure whether the agent can complete intended tasks without hidden manual work.

Focus 2

Safety

Evaluate unsafe action prevention, least-privilege tool access, and policy boundaries.

Focus 3

Jailbreak resistance

Test prompt-injection handling across user input, retrieved content, and tool outputs.

Focus 4

Tool use

Validate schema use, authorization, timeouts, and error handling for every tool.

Focus 5

Refusal calibration

Check that refusals are neither too broad nor too narrow for risky tasks.

Focus 6

Reproducibility

Record prompts, tool versions, seeds where available, and source artifacts.