EvaluationSubject Fields
| Field | Values | Description |
|---|---|---|
kind | custom_agent | Always custom_agent for external agents |
displayName | string | Human-readable name shown in the dashboard |
framework | raw_python, openai, anthropic, langchain, llamaindex, crewai, autogen, n8n, flowise, other | Framework used |
runtime | local, ci, customer_hosted | Where the agent runs |
version | string | Optional version tag for the agent |
endpoint | URL string | Optional, for HTTP-based agents |
Return value from your agent function
Your callable can return any of:
| Return type | Behavior |
|---|---|
str | Used directly as the output text |
dict with "output" key | Output text from output, rest stored as metadata |
EvaluationResult | Full control - pass rating, justification, trace, timings |
Security and privacy
The SDK automatically scrubs secrets from outputs and metadata before uploading:
sk-...API keys- Bearer tokens
- Authorization headers
- Password-like fields
Raw agent outputs, prompts, and CoT reasoning are never uploaded. Only the text response, metadata you explicitly include, and optional observable trace summaries.
Updated about 8 hours ago
