Public, no auth | Bounded execution | Trace-linked

Eval Runs

Run bounded eval artifacts inside Sandbox workspaces and send public-safe summaries to Evals when that integration is reachable.

Bounded mode
Public-safe status
Eval requests are rate-limited and trace-linked.
Remote scoring
Public-safe status
Remote Evals scoring is marked degraded until the integration confirms execution.