Example
All examples follow the same pattern: wrap your framework's output in a function that accepts an EvaluationCase and returns a str, dict, or EvaluationResult.
Python callable
from agentx.evaluations.models import EvaluationCase


def my_agent(case: EvaluationCase) -> str:
    # Return the agent's answer for a single evaluation case.
    return f"Answer to: {case.query}"


# `client` is assumed to be an already-initialized client object.
report = (
    client.evaluations
    .run(
        dataset_id="...",
        subject={"kind": "custom_agent", "displayName": "My Bot", "framework": "raw_python"},
    )
    .execute(my_agent)
    .finalize()
    .analyze()
)

Full example: basic_callable_eval
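The same pattern applies when the wrapped function returns a dict instead of a str. The following is a minimal sketch that runs without the library installed: `StubCase` and `fake_framework` are hypothetical stand-ins, and the dict keys `output` and `metadata` are assumptions for illustration, not a documented schema. Only the `query` attribute is taken from the example above.

```python
from dataclasses import dataclass


@dataclass
class StubCase:
    """Hypothetical stand-in for EvaluationCase; only `query` comes from the docs."""
    query: str


def fake_framework(prompt: str) -> dict:
    # Placeholder for a real framework call (assumption, not a real API).
    return {"text": prompt.upper(), "tokens": len(prompt.split())}


def my_dict_agent(case) -> dict:
    # Call the framework with the case's query, then reshape its output.
    raw = fake_framework(case.query)
    # Key names below are illustrative assumptions.
    return {"output": raw["text"], "metadata": {"tokens": raw["tokens"]}}


result = my_dict_agent(StubCase(query="hello world"))
```

In real use the function would take an EvaluationCase and be passed to `.execute(...)` in the same way as `my_agent`.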
