Build Dataset
Create evaluation datasets and add test cases:
dataset = (
    client.evaluations.datasets
    .builder(
        name="Support Agent v2",
        number_of_requests=3,  # runs per case
        acceptance_criteria="Accurate, concise, grounded in docs.",
        rejection_criteria="No hallucinated policies.",
    )
    .add_case(
        query="How do I reset my password?",
        expected_results="Explain the password reset process step by step.",
    )
    .add_case(
        query="What payment methods do you accept?",
        expected_results="List supported payment methods clearly.",
    )
    .publish()
)
print(dataset.id)  # use this id in .run()

Create evaluation datasets directly from CSV:
dataset = client.evaluations.datasets.from_csv(
    path="cases.csv",
    name="My Dataset",
    number_of_requests=2,
    acceptance_criteria="...",
    rejection_criteria="...",
)

CSV format:
query,expected_results
"How do I reset my password?","Navigate to example.com/login and click reset."
"What is your refund policy?","Refunds are available within 30 days, no questions asked."
An optional case_id column provides stable idempotency keys across re-runs.
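The CSV layout above can be generated with Python's standard csv module. The following is a minimal sketch, not part of the SDK: write_cases_csv is a hypothetical helper, and deriving case_id from a hash of the query is an assumption about what makes a suitable stable key. Placing case_id after the two documented columns is also an assumption, since the source does not state whether column order matters.

```python
import csv
import hashlib
from pathlib import Path


def write_cases_csv(path, cases):
    """Write (query, expected_results) pairs to a CSV file with a case_id column.

    Hypothetical helper for illustration; it is not part of the SDK.
    """
    with Path(path).open("w", newline="", encoding="utf-8") as f:
        # QUOTE_ALL quotes every field, so commas inside answers stay in one column.
        writer = csv.writer(f, quoting=csv.QUOTE_ALL)
        writer.writerow(["query", "expected_results", "case_id"])
        for query, expected in cases:
            # Assumption: a short hash of the query text is an acceptable stable key.
            # The same query yields the same case_id on every run.
            case_id = hashlib.sha256(query.encode("utf-8")).hexdigest()[:12]
            writer.writerow([query, expected, case_id])


cases = [
    ("How do I reset my password?",
     "Navigate to example.com/login and click reset."),
    ("What is your refund policy?",
     "Refunds are available within 30 days, no questions asked."),
]
write_cases_csv("cases.csv", cases)
```

Because the key depends only on the query text, editing an expected result keeps the same case_id, while rewording a query produces a new one. If queries are likely to be reworded, a manually assigned identifier may be a better fit.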
