Evaluation Test Cases


Create Evaluation Test Case
agents.evaluation_test_cases.create(**kwargs: EvaluationTestCaseCreateParams) -> EvaluationTestCaseCreateResponse (test_case_uuid: str)
POST /v2/gen-ai/evaluation_test_cases

List Evaluation Test Cases
agents.evaluation_test_cases.list() -> EvaluationTestCaseListResponse (evaluation_test_cases: list)
GET /v2/gen-ai/evaluation_test_cases

List Evaluation Runs by Test Case
agents.evaluation_test_cases.list_evaluation_runs(evaluation_test_case_uuid: str, **kwargs: EvaluationTestCaseListEvaluationRunsParams) -> EvaluationTestCaseListEvaluationRunsResponse (evaluation_runs: list)
GET /v2/gen-ai/evaluation_test_cases/{evaluation_test_case_uuid}/evaluation_runs

Retrieve Information About an Existing Evaluation Test Case
agents.evaluation_test_cases.retrieve(test_case_uuid: str, **kwargs: EvaluationTestCaseRetrieveParams) -> EvaluationTestCaseRetrieveResponse (evaluation_test_case: APIEvaluationTestCase)
GET /v2/gen-ai/evaluation_test_cases/{test_case_uuid}

Update an Evaluation Test Case
agents.evaluation_test_cases.update(path_test_case_uuid: str, **kwargs: EvaluationTestCaseUpdateParams) -> EvaluationTestCaseUpdateResponse (test_case_uuid: str, version: int)
PUT /v2/gen-ai/evaluation_test_cases/{test_case_uuid}
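
The five operations above each pair an HTTP verb with a path. As a rough sketch of that mapping, using only the standard library: the `ENDPOINTS` table and the `build_request` helper are hypothetical names chosen for illustration and are not part of the SDK. Only the verbs and path templates come from the listing above.

```python
from urllib.parse import quote

# (HTTP method, path template) for each operation listed above.
ENDPOINTS = {
    "create": ("POST", "/v2/gen-ai/evaluation_test_cases"),
    "list": ("GET", "/v2/gen-ai/evaluation_test_cases"),
    "list_evaluation_runs": (
        "GET",
        "/v2/gen-ai/evaluation_test_cases/{evaluation_test_case_uuid}/evaluation_runs",
    ),
    "retrieve": ("GET", "/v2/gen-ai/evaluation_test_cases/{test_case_uuid}"),
    "update": ("PUT", "/v2/gen-ai/evaluation_test_cases/{test_case_uuid}"),
}


def build_request(operation: str, **path_params: str) -> tuple[str, str]:
    """Return (method, path) with path parameters percent-encoded and filled in.

    Hypothetical helper: it only formats the documented path templates.
    """
    method, template = ENDPOINTS[operation]
    encoded = {name: quote(value, safe="") for name, value in path_params.items()}
    return method, template.format(**encoded)
```

Calling `build_request("retrieve", test_case_uuid=...)` yields the `GET` verb and the filled-in path. A missing path parameter raises `KeyError` from `str.format`, which surfaces mistakes early.
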
API Evaluation Test Case
class APIEvaluationTestCase
archived_at: datetime
optional
format: date-time
created_at: datetime
optional
format: date-time
created_by_user_email: str
optional
created_by_user_id: str
optional
format: uint64
dataset: Dataset
optional
  created_at: datetime
  optional

  Time the dataset was created.

  format: date-time
  dataset_name: str
  optional

  Name of the dataset.

  dataset_uuid: str
  optional

  UUID of the dataset.

  file_size: str
  optional

  Size of the uploaded dataset file, in bytes.

  format: uint64
  has_ground_truth: bool
  optional

  Whether the dataset has a ground truth column.

  row_count: int
  optional

  Number of rows in the dataset.

  format: int64
dataset_name: str
optional
dataset_uuid: str
optional
description: str
optional
latest_version_number_of_runs: int
optional
format: int32
metrics: Optional[List[APIEvaluationMetric]]
optional
  description: str
  optional
  inverted: bool
  optional

  If true, the metric is inverted, meaning that a lower value is better.

  metric_name: str
  optional
  metric_type: Optional[Literal["METRIC_TYPE_UNSPECIFIED", "METRIC_TYPE_GENERAL_QUALITY", "METRIC_TYPE_RAG_AND_TOOL"]]
  optional
  metric_uuid: str
  optional
  metric_value_type: Optional[Literal["METRIC_VALUE_TYPE_UNSPECIFIED", "METRIC_VALUE_TYPE_NUMBER", "METRIC_VALUE_TYPE_STRING", "METRIC_VALUE_TYPE_PERCENTAGE"]]
  optional
  range_max: float
  optional

  The maximum value for the metric.

  format: float
  range_min: float
  optional

  The minimum value for the metric.

  format: float
name: str
optional
star_metric: APIStarMetric
optional
test_case_uuid: str
optional
total_runs: int
optional
format: int32
updated_at: datetime
optional
format: date-time
updated_by_user_email: str
optional
updated_by_user_id: str
optional
format: uint64
version: int
optional
format: int64
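
Several fields above arrive as strings even though they carry typed values: `date-time` fields are timestamp strings, and the dataset's `file_size` is a `uint64` encoded as `str`. The following is a minimal sketch of normalizing a raw dataset payload, assuming it is available as a plain `dict` with the field names listed above. `DatasetSummary`, `parse_datetime`, and `parse_dataset` are hypothetical names, not SDK types, and the sketch assumes timestamps are ISO 8601 strings that `datetime.fromisoformat` can read once a trailing `Z` is rewritten.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Optional


def parse_datetime(value: Optional[str]) -> Optional[datetime]:
    """Parse a date-time string; a trailing 'Z' is rewritten as '+00:00'
    so the call also works on Python versions before 3.11."""
    if value is None:
        return None
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


@dataclass
class DatasetSummary:
    # Hypothetical container mirroring the optional Dataset fields above.
    dataset_name: Optional[str]
    dataset_uuid: Optional[str]
    file_size_bytes: Optional[int]
    has_ground_truth: Optional[bool]
    row_count: Optional[int]
    created_at: Optional[datetime]


def parse_dataset(payload: dict[str, Any]) -> DatasetSummary:
    """Convert a raw Dataset dict into typed values; every field may be absent."""
    raw_size = payload.get("file_size")
    return DatasetSummary(
        dataset_name=payload.get("dataset_name"),
        dataset_uuid=payload.get("dataset_uuid"),
        # file_size is documented as a str holding a uint64 byte count.
        file_size_bytes=int(raw_size) if raw_size is not None else None,
        has_ground_truth=payload.get("has_ground_truth"),
        row_count=payload.get("row_count"),
        created_at=parse_datetime(payload.get("created_at")),
    )
```

Because every field is marked optional, the helper uses `dict.get` throughout and returns `None` for anything missing rather than raising.
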
API Star Metric
class APIStarMetric
metric_uuid: str
optional
name: str
optional
success_threshold: float
optional

The success threshold for the star metric. This is a value that the metric must reach to be considered successful.

format: float
success_threshold_pct: int
optional

The success threshold for the star metric. This is a percentage value between 0 and 100.

format: int32
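
One way to read the two threshold fields is sketched below. It rests on assumptions the reference does not spell out: that "reach" means greater than or equal to `success_threshold`, and that for a metric whose `inverted` flag is true (lower is better) the comparison flips. `meets_threshold` and `validate_threshold_pct` are hypothetical helpers, not SDK functions.

```python
from typing import Optional


def meets_threshold(
    score: float,
    success_threshold: Optional[float],
    inverted: bool = False,
) -> Optional[bool]:
    """Return whether `score` reaches the star metric's threshold.

    Assumption: a non-inverted metric succeeds when score >= threshold,
    an inverted one (lower is better) when score <= threshold.
    Returns None when no threshold is set, since the field is optional.
    """
    if success_threshold is None:
        return None
    if inverted:
        return score <= success_threshold
    return score >= success_threshold


def validate_threshold_pct(pct: int) -> int:
    """Check that success_threshold_pct lies in the documented 0 to 100 range."""
    if not 0 <= pct <= 100:
        raise ValueError(f"success_threshold_pct must be between 0 and 100, got {pct}")
    return pct
```

Returning `None` for a missing threshold keeps "not configured" distinct from "failed", which matters because both fields are optional.
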