Evaluation Test Cases

List Evaluation Test Cases
client.Agents.EvaluationTestCases.List(ctx) (*AgentEvaluationTestCaseListResponse, error)
GET /v2/gen-ai/evaluation_test_cases
Create Evaluation Test Case
client.Agents.EvaluationTestCases.New(ctx, body) (*AgentEvaluationTestCaseNewResponse, error)
POST /v2/gen-ai/evaluation_test_cases
List Evaluation Runs by Test Case
client.Agents.EvaluationTestCases.ListEvaluationRuns(ctx, evaluationTestCaseUuid, query) (*AgentEvaluationTestCaseListEvaluationRunsResponse, error)
GET /v2/gen-ai/evaluation_test_cases/{evaluation_test_case_uuid}/evaluation_runs
Retrieve Information About an Existing Evaluation Test Case
client.Agents.EvaluationTestCases.Get(ctx, testCaseUuid, query) (*AgentEvaluationTestCaseGetResponse, error)
GET /v2/gen-ai/evaluation_test_cases/{test_case_uuid}
Update an Evaluation Test Case
client.Agents.EvaluationTestCases.Update(ctx, testCaseUuid, body) (*AgentEvaluationTestCaseUpdateResponse, error)
PUT /v2/gen-ai/evaluation_test_cases/{test_case_uuid}
Models
type APIEvaluationTestCase struct{…}
ArchivedAt Time optional
format: date-time
CreatedAt Time optional
format: date-time
CreatedByUserEmail string optional
CreatedByUserID string optional
format: uint64
Dataset APIEvaluationTestCaseDataset optional
CreatedAt Time optional

Time the dataset was created.

format: date-time
DatasetName string optional

Name of the dataset.

DatasetUuid string optional

UUID of the dataset.

FileSize string optional

The size of the uploaded dataset file in bytes.

format: uint64
HasGroundTruth bool optional

Whether the dataset has a ground truth column.

RowCount int64 optional

Number of rows in the dataset.

format: int64
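Because `FileSize` is listed as a string carrying a uint64 value, callers usually need to parse it before doing arithmetic. The following is a minimal sketch of that step; the `Dataset` struct is a hypothetical local mirror of a few fields above, not the SDK type, and treating an empty string as "unset" is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
)

// Dataset is a hypothetical local mirror of a few documented fields.
type Dataset struct {
	DatasetName string
	FileSize    string // documented as a string in uint64 format (bytes)
	RowCount    int64
}

// fileSizeBytes parses FileSize into a uint64.
// ok is false when the optional field is empty (assumed to mean unset).
func fileSizeBytes(d Dataset) (size uint64, ok bool, err error) {
	if d.FileSize == "" {
		return 0, false, nil
	}
	size, err = strconv.ParseUint(d.FileSize, 10, 64)
	if err != nil {
		return 0, false, fmt.Errorf("parse FileSize %q: %w", d.FileSize, err)
	}
	return size, true, nil
}

func main() {
	// Example values are made up for illustration.
	d := Dataset{DatasetName: "example", FileSize: "2048", RowCount: 16}
	size, ok, err := fileSizeBytes(d)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if ok {
		fmt.Printf("%s: %d bytes across %d rows\n", d.DatasetName, size, d.RowCount)
	}
}
```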
DatasetName string optional
DatasetUuid string optional
Description string optional
LatestVersionNumberOfRuns int64 optional
format: int32
Metrics []APIEvaluationMetric optional
Category APIEvaluationMetricCategory optional
Accepts one of the following:
const APIEvaluationMetricCategoryMetricCategoryUnspecified APIEvaluationMetricCategory = "METRIC_CATEGORY_UNSPECIFIED"
const APIEvaluationMetricCategoryMetricCategoryCorrectness APIEvaluationMetricCategory = "METRIC_CATEGORY_CORRECTNESS"
const APIEvaluationMetricCategoryMetricCategoryUserOutcomes APIEvaluationMetricCategory = "METRIC_CATEGORY_USER_OUTCOMES"
const APIEvaluationMetricCategoryMetricCategorySafetyAndSecurity APIEvaluationMetricCategory = "METRIC_CATEGORY_SAFETY_AND_SECURITY"
const APIEvaluationMetricCategoryMetricCategoryContextQuality APIEvaluationMetricCategory = "METRIC_CATEGORY_CONTEXT_QUALITY"
const APIEvaluationMetricCategoryMetricCategoryModelFit APIEvaluationMetricCategory = "METRIC_CATEGORY_MODEL_FIT"
Description string optional
Inverted bool optional

If true, the metric is inverted, meaning that a lower value is better.

IsMetricGoal bool optional
MetricName string optional
MetricRank int64 optional
format: int64
MetricType APIEvaluationMetricMetricType optional
Accepts one of the following:
const APIEvaluationMetricMetricTypeMetricTypeUnspecified APIEvaluationMetricMetricType = "METRIC_TYPE_UNSPECIFIED"
const APIEvaluationMetricMetricTypeMetricTypeGeneralQuality APIEvaluationMetricMetricType = "METRIC_TYPE_GENERAL_QUALITY"
const APIEvaluationMetricMetricTypeMetricTypeRagAndTool APIEvaluationMetricMetricType = "METRIC_TYPE_RAG_AND_TOOL"
MetricUuid string optional
MetricValueType APIEvaluationMetricMetricValueType optional
Accepts one of the following:
const APIEvaluationMetricMetricValueTypeMetricValueTypeUnspecified APIEvaluationMetricMetricValueType = "METRIC_VALUE_TYPE_UNSPECIFIED"
const APIEvaluationMetricMetricValueTypeMetricValueTypeNumber APIEvaluationMetricMetricValueType = "METRIC_VALUE_TYPE_NUMBER"
const APIEvaluationMetricMetricValueTypeMetricValueTypeString APIEvaluationMetricMetricValueType = "METRIC_VALUE_TYPE_STRING"
const APIEvaluationMetricMetricValueTypeMetricValueTypePercentage APIEvaluationMetricMetricValueType = "METRIC_VALUE_TYPE_PERCENTAGE"
RangeMax float64 optional

The maximum value for the metric.

format: float
RangeMin float64 optional

The minimum value for the metric.

format: float
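One way to read `Inverted` together with `RangeMin` and `RangeMax` is sketched below: scores are compared with the direction flipped for inverted metrics, and a raw value is rescaled so that 1 always means "best". The `Metric` struct and the helpers `better` and `normalize` are hypothetical names for this example, and the clamping and rescaling rules are assumptions, not behavior documented for the API.

```go
package main

import "fmt"

// Metric is a hypothetical local mirror of a few APIEvaluationMetric fields.
type Metric struct {
	MetricName string
	Inverted   bool    // true means a lower value is better
	RangeMin   float64 // minimum value for the metric
	RangeMax   float64 // maximum value for the metric
}

// better reports whether score a is preferable to score b for metric m.
func better(m Metric, a, b float64) bool {
	if m.Inverted {
		return a < b
	}
	return a > b
}

// normalize rescales value into [0, 1] where 1 is the best possible score.
// ok is false when the range is empty or unset. Values outside the range are
// clamped, which is an assumption made for this sketch.
func normalize(m Metric, value float64) (score float64, ok bool) {
	span := m.RangeMax - m.RangeMin
	if span <= 0 {
		return 0, false
	}
	if value < m.RangeMin {
		value = m.RangeMin
	}
	if value > m.RangeMax {
		value = m.RangeMax
	}
	score = (value - m.RangeMin) / span
	if m.Inverted {
		score = 1 - score
	}
	return score, true
}

func main() {
	// Example metric with made-up values.
	m := Metric{MetricName: "example-metric", Inverted: true, RangeMin: 0, RangeMax: 8}
	score, ok := normalize(m, 2)
	fmt.Println(better(m, 1, 2), score, ok)
}
```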
Name string optional
StarMetric APIStarMetric optional
MetricUuid string optional
Name string optional
SuccessThreshold float64 optional

The success threshold for the star metric. This is a value that the metric must reach to be considered successful.

format: float
SuccessThresholdPct int64 optional

The success threshold for the star metric. This is a percentage value between 0 and 100.

format: int32
TestCaseUuid string optional
TotalRuns int64 optional
format: int32
UpdatedAt Time optional
format: date-time
UpdatedByUserEmail string optional
UpdatedByUserID string optional
format: uint64
Version int64 optional
format: int64
type APIStarMetric struct{…}
MetricUuid string optional
Name string optional
SuccessThreshold float64 optional

The success threshold for the star metric. This is a value that the metric must reach to be considered successful.

format: float
SuccessThresholdPct int64 optional

The success threshold for the star metric. This is a percentage value between 0 and 100.

format: int32
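The two threshold fields can be checked on the client side roughly as follows. This is a sketch under stated assumptions: `StarMetric` is a hypothetical local mirror that uses pointers for the optional fields, "reach" is interpreted as greater than or equal, and `meetsThreshold` and `validPct` are illustrative helpers rather than SDK functions.

```go
package main

import "fmt"

// StarMetric is a hypothetical local mirror of APIStarMetric.
// Pointers model the optional fields; nil means the field was not set.
type StarMetric struct {
	Name                string
	SuccessThreshold    *float64
	SuccessThresholdPct *int64
}

// meetsThreshold reports whether value reaches SuccessThreshold.
// "Reach" is interpreted as >=, which is an assumption.
// ok is false when no threshold is set.
func meetsThreshold(s StarMetric, value float64) (pass, ok bool) {
	if s.SuccessThreshold == nil {
		return false, false
	}
	return value >= *s.SuccessThreshold, true
}

// validPct checks the documented 0 to 100 bounds on SuccessThresholdPct.
// An unset percentage is treated as valid.
func validPct(s StarMetric) bool {
	if s.SuccessThresholdPct == nil {
		return true
	}
	return *s.SuccessThresholdPct >= 0 && *s.SuccessThresholdPct <= 100
}

func main() {
	// Example values are made up for illustration.
	threshold := 0.75
	pct := int64(75)
	star := StarMetric{Name: "example-metric", SuccessThreshold: &threshold, SuccessThresholdPct: &pct}
	pass, ok := meetsThreshold(star, 0.5)
	fmt.Println(pass, ok, validPct(star))
}
```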