Evaluation Test Cases

List Evaluation Test Cases

client.Agents.EvaluationTestCases.List(ctx) (*AgentEvaluationTestCaseListResponse, error)

GET /v2/gen-ai/evaluation_test_cases

Create Evaluation Test Case

client.Agents.EvaluationTestCases.New(ctx, body) (*AgentEvaluationTestCaseNewResponse, error)

POST /v2/gen-ai/evaluation_test_cases

List Evaluation Runs by Test Case

client.Agents.EvaluationTestCases.ListEvaluationRuns(ctx, evaluationTestCaseUuid, query) (*AgentEvaluationTestCaseListEvaluationRunsResponse, error)

GET /v2/gen-ai/evaluation_test_cases/{evaluation_test_case_uuid}/evaluation_runs
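As a rough illustration of how the `{evaluation_test_case_uuid}` placeholder is filled in, the sketch below builds the path with the standard library. The helper name `evaluationRunsPath` is hypothetical, and rejecting an empty UUID is a defensive choice of this sketch, not documented API behavior.

```go
package main

import (
	"fmt"
	"net/url"
)

// evaluationRunsPath substitutes the UUID into the documented path template.
// The value is path-escaped so that characters such as "/" cannot change the route.
func evaluationRunsPath(testCaseUUID string) (string, error) {
	if testCaseUUID == "" {
		return "", fmt.Errorf("evaluation test case UUID must not be empty")
	}
	return "/v2/gen-ai/evaluation_test_cases/" + url.PathEscape(testCaseUUID) + "/evaluation_runs", nil
}

func main() {
	// The UUID below is a made-up placeholder.
	p, err := evaluationRunsPath("123e4567-e89b-12d3-a456-426614174000")
	if err != nil {
		panic(err)
	}
	fmt.Println(p)
}
```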

Retrieve Information About an Existing Evaluation Test Case

client.Agents.EvaluationTestCases.Get(ctx, testCaseUuid, query) (*AgentEvaluationTestCaseGetResponse, error)

GET /v2/gen-ai/evaluation_test_cases/{test_case_uuid}

Update an Evaluation Test Case

client.Agents.EvaluationTestCases.Update(ctx, testCaseUuid, body) (*AgentEvaluationTestCaseUpdateResponse, error)

PUT /v2/gen-ai/evaluation_test_cases/{test_case_uuid}

Models

type APIEvaluationTestCase struct{…}

ArchivedAt Time (optional)

format: date-time

CreatedAt Time (optional)

format: date-time

CreatedByUserEmail string (optional)

CreatedByUserID string (optional)

format: uint64

Dataset APIEvaluationTestCaseDataset (optional)

CreatedAt Time (optional)

Time the dataset was created.

format: date-time

DatasetName string (optional)

Name of the dataset.

DatasetUuid string (optional)

UUID of the dataset.

FileSize string (optional)

The size of the uploaded dataset file, in bytes.

format: uint64

HasGroundTruth bool (optional)

Whether the dataset has a ground truth column.

RowCount int64 (optional)

Number of rows in the dataset.

format: int64

DatasetName string (optional)

DatasetUuid string (optional)

Description string (optional)

LatestVersionNumberOfRuns int64 (optional)

format: int32

Metrics []APIEvaluationMetric (optional)

Category APIEvaluationMetricCategory (optional)

Accepts one of the following:

const APIEvaluationMetricCategoryMetricCategoryUnspecified APIEvaluationMetricCategory = "METRIC_CATEGORY_UNSPECIFIED"

const APIEvaluationMetricCategoryMetricCategoryCorrectness APIEvaluationMetricCategory = "METRIC_CATEGORY_CORRECTNESS"

const APIEvaluationMetricCategoryMetricCategoryUserOutcomes APIEvaluationMetricCategory = "METRIC_CATEGORY_USER_OUTCOMES"

const APIEvaluationMetricCategoryMetricCategorySafetyAndSecurity APIEvaluationMetricCategory = "METRIC_CATEGORY_SAFETY_AND_SECURITY"

const APIEvaluationMetricCategoryMetricCategoryContextQuality APIEvaluationMetricCategory = "METRIC_CATEGORY_CONTEXT_QUALITY"

const APIEvaluationMetricCategoryMetricCategoryModelFit APIEvaluationMetricCategory = "METRIC_CATEGORY_MODEL_FIT"

Description string (optional)

Inverted bool (optional)

If true, the metric is inverted, meaning that a lower value is better.

IsMetricGoal bool (optional)

MetricName string (optional)

MetricRank int64 (optional)

format: int64

MetricType APIEvaluationMetricMetricType (optional)

Accepts one of the following:

const APIEvaluationMetricMetricTypeMetricTypeUnspecified APIEvaluationMetricMetricType = "METRIC_TYPE_UNSPECIFIED"

const APIEvaluationMetricMetricTypeMetricTypeGeneralQuality APIEvaluationMetricMetricType = "METRIC_TYPE_GENERAL_QUALITY"

const APIEvaluationMetricMetricTypeMetricTypeRagAndTool APIEvaluationMetricMetricType = "METRIC_TYPE_RAG_AND_TOOL"

MetricUuid string (optional)

MetricValueType APIEvaluationMetricMetricValueType (optional)

Accepts one of the following:

const APIEvaluationMetricMetricValueTypeMetricValueTypeUnspecified APIEvaluationMetricMetricValueType = "METRIC_VALUE_TYPE_UNSPECIFIED"

const APIEvaluationMetricMetricValueTypeMetricValueTypeNumber APIEvaluationMetricMetricValueType = "METRIC_VALUE_TYPE_NUMBER"

const APIEvaluationMetricMetricValueTypeMetricValueTypeString APIEvaluationMetricMetricValueType = "METRIC_VALUE_TYPE_STRING"

const APIEvaluationMetricMetricValueTypeMetricValueTypePercentage APIEvaluationMetricMetricValueType = "METRIC_VALUE_TYPE_PERCENTAGE"

RangeMax float64 (optional)

The maximum value for the metric.

format: float

RangeMin float64 (optional)

The minimum value for the metric.

format: float

Name string (optional)

StarMetric APIStarMetric (optional)

MetricUuid string (optional)

Name string (optional)

SuccessThreshold float64 (optional)

The success threshold for the star metric. This is a value that the metric must reach to be considered successful.

format: float

SuccessThresholdPct int64 (optional)

The success threshold for the star metric. This is a percentage value between 0 and 100.

format: int32

TestCaseUuid string (optional)

TotalRuns int64 (optional)

format: int32

UpdatedAt Time (optional)

format: date-time

UpdatedByUserEmail string (optional)

UpdatedByUserID string (optional)

format: uint64

Version int64 (optional)

format: int64
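The metric fields Inverted, RangeMin, and RangeMax can be combined client-side when comparing results across metrics. The sketch below is one possible way to do that, not something the API itself performs: the local `Metric` type mirrors only three fields of APIEvaluationMetric, and the `Normalize` method is a hypothetical helper that maps a raw value onto [0, 1] so that 1 is always the best score.

```go
package main

import (
	"errors"
	"fmt"
)

// Metric is a local stand-in holding only the APIEvaluationMetric fields used here.
type Metric struct {
	Inverted bool    // true means a lower raw value is better
	RangeMin float64 // minimum value for the metric
	RangeMax float64 // maximum value for the metric
}

// Normalize rescales value to [0, 1] using the metric's range and flips the
// result for inverted metrics, so a higher normalized score is always better.
// This rescaling is an illustrative convention, not documented API behavior.
func (m Metric) Normalize(value float64) (float64, error) {
	if m.RangeMax <= m.RangeMin {
		return 0, errors.New("metric range is empty or reversed")
	}
	if value < m.RangeMin || value > m.RangeMax {
		return 0, fmt.Errorf("value %v is outside [%v, %v]", value, m.RangeMin, m.RangeMax)
	}
	score := (value - m.RangeMin) / (m.RangeMax - m.RangeMin)
	if m.Inverted {
		score = 1 - score
	}
	return score, nil
}

func main() {
	// A made-up inverted metric on a 0..10 scale.
	m := Metric{Inverted: true, RangeMin: 0, RangeMax: 10}
	score, err := m.Normalize(2.5)
	if err != nil {
		panic(err)
	}
	fmt.Println(score) // prints 0.75
}
```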

type APIStarMetric struct{…}

MetricUuid string (optional)

Name string (optional)

SuccessThreshold float64 (optional)

The success threshold for the star metric. This is a value that the metric must reach to be considered successful.

format: float

SuccessThresholdPct int64 (optional)

The success threshold for the star metric. This is a percentage value between 0 and 100.

format: int32
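To make the two threshold fields concrete, here is a small sketch that decodes a star metric from JSON and checks it. Several things are assumptions: the local `StarMetric` type stands in for APIStarMetric, the snake_case JSON keys are guessed from the Go field names, "reach" is read as greater than or equal to, and the sample payload is invented.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// StarMetric is a local stand-in for APIStarMetric. The JSON keys are assumed
// snake_case forms of the field names; the reference does not list them.
type StarMetric struct {
	MetricUuid          string  `json:"metric_uuid,omitempty"`
	Name                string  `json:"name,omitempty"`
	SuccessThreshold    float64 `json:"success_threshold,omitempty"`
	SuccessThresholdPct int64   `json:"success_threshold_pct,omitempty"`
}

// Validate enforces the one documented constraint: the percentage threshold
// must lie between 0 and 100.
func (s StarMetric) Validate() error {
	if s.SuccessThresholdPct < 0 || s.SuccessThresholdPct > 100 {
		return fmt.Errorf("success_threshold_pct %d is outside 0..100", s.SuccessThresholdPct)
	}
	return nil
}

// Passed reports whether value reaches SuccessThreshold. Treating "reach" as
// >= is an assumption, and inverted metrics are not handled here.
func (s StarMetric) Passed(value float64) bool {
	return value >= s.SuccessThreshold
}

// parseStarMetric decodes and validates a star metric payload.
func parseStarMetric(data []byte) (StarMetric, error) {
	var s StarMetric
	if err := json.Unmarshal(data, &s); err != nil {
		return StarMetric{}, err
	}
	if err := s.Validate(); err != nil {
		return StarMetric{}, err
	}
	return s, nil
}

func main() {
	// Invented sample payload for illustration only.
	raw := []byte(`{"metric_uuid":"example-uuid","name":"correctness","success_threshold":0.8}`)
	s, err := parseStarMetric(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(s.Name, s.Passed(0.85), s.Passed(0.5))
}
```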