EvaluationSubject Fields

FieldValuesDescription
kindcustom_agentAlways custom_agent for external agents
displayNamestringHuman-readable name shown in the dashboard
frameworkraw_python, openai, anthropic, langchain, llamaindex, crewai, autogen, n8n, flowise, otherFramework used
runtimelocal, ci, customer_hostedWhere the agent runs
versionstringOptional version tag for the agent
endpointURL stringOptional, for HTTP-based agents

Return value from your agent function

Your callable can return any of:

Return typeBehavior
strUsed directly as the output text
dict with "output" keyOutput text from output, rest stored as metadata
EvaluationResultFull control - pass rating, justification, trace, timings

Security and privacy


The SDK automatically scrubs secrets from outputs and metadata before uploading:

  • sk-... API keys
  • Bearer tokens
  • Authorization headers
  • Password-like fields

Raw agent outputs, prompts, and CoT reasoning are never uploaded. Only the text response, metadata you explicitly include, and optional observable trace summaries.