List Results

Retrieve Results of an Evaluation Run
agents.evaluation_runs.list_results(evaluation_run_uuid: str, **kwargs: EvaluationRunListResultsParams) -> EvaluationRunListResultsResponse
GET /v2/gen-ai/evaluation_runs/{evaluation_run_uuid}/results

To retrieve results of an evaluation run, send a GET request to /v2/gen-ai/evaluation_runs/{evaluation_run_uuid}/results.
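
If you are calling the REST API directly rather than through the SDK, the same request can be made over plain HTTP. A minimal sketch using requests, assuming bearer-token authentication against https://api.digitalocean.com and a DIGITALOCEAN_TOKEN environment variable (both assumptions, not shown in this reference):

import os

import requests

# Hypothetical values for illustration; substitute your own run UUID and token.
EVALUATION_RUN_UUID = "123e4567-e89b-12d3-a456-426614174000"
TOKEN = os.environ["DIGITALOCEAN_TOKEN"]  # assumed environment variable name

resp = requests.get(
    f"https://api.digitalocean.com/v2/gen-ai/evaluation_runs/{EVALUATION_RUN_UUID}/results",
    headers={"Authorization": f"Bearer {TOKEN}"},
    params={"page": 1, "per_page": 20},  # optional pagination parameters documented below
)
resp.raise_for_status()
print(resp.json()["evaluation_run"])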

Parameters
evaluation_run_uuid: str
page: int
optional

Page number.

per_page: int
optional

Items per page.
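
When a run contains more prompt results than fit on one page, page and per_page can be combined with the meta object in the response to walk every page. A minimal sketch, assuming meta.page and meta.pages report the current and total page counts as described under Returns:

from do_gradientai import GradientAI

client = GradientAI()
run_uuid = "123e4567-e89b-12d3-a456-426614174000"  # hypothetical run UUID

page = 1
while True:
    response = client.agents.evaluation_runs.list_results(
        evaluation_run_uuid=run_uuid,
        page=page,
        per_page=50,
    )
    for prompt in response.prompts or []:
        print(prompt.prompt_id, prompt.input_tokens, prompt.output_tokens)

    meta = response.meta
    if meta is None or meta.page is None or meta.pages is None or meta.page >= meta.pages:
        break
    page += 1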

Returns
EvaluationRunListResultsResponse (class)

Gets the full results of an evaluation run with all prompts.

evaluation_run: APIEvaluationRun
optional
meta: APIMeta
optional

Meta information about the data set

prompts: list
optional
Optional[List[APIEvaluationPrompt]]

The prompt level results.

ground_truth: str
optional

The ground truth for the prompt.

input: str
optional
input_tokens: str
optional

The number of input tokens used in the prompt.

format: uint64
output: str
optional
output_tokens: str
optional

The number of output tokens used in the prompt.

format: uint64
prompt_chunks: list
optional
Optional[List[PromptChunk]]

The list of prompt chunks.

chunk_usage_pct: float
optional

The usage percentage of the chunk.

format: double
chunk_used: bool
optional

Indicates if the chunk was used in the prompt.

index_uuid: str
optional

The UUID of the index (knowledge base) the chunk belongs to.

source_name: str
optional

The source name for the chunk, e.g., the file name or document title.

text: str
optional

Text content of the chunk.

prompt_id: int
optional

Prompt ID

format: int64
prompt_level_metric_results: list
optional
Optional[List[APIEvaluationMetricResult]]

The metric results for the prompt.

error_description: str
optional

Error description if the metric could not be calculated.

metric_name: str
optional

Metric name

metric_value_type: literal
optional
Optional[Literal["METRIC_VALUE_TYPE_UNSPECIFIED", "METRIC_VALUE_TYPE_NUMBER", "METRIC_VALUE_TYPE_STRING", "METRIC_VALUE_TYPE_PERCENTAGE"]]
"METRIC_VALUE_TYPE_UNSPECIFIED"
"METRIC_VALUE_TYPE_NUMBER"
"METRIC_VALUE_TYPE_STRING"
"METRIC_VALUE_TYPE_PERCENTAGE"
number_value: float
optional

The value of the metric as a number.

format: double
reasoning: str
optional

Reasoning of the metric result.

string_value: str
optional

The value of the metric as a string.

from do_gradientai import GradientAI

client = GradientAI()
response = client.agents.evaluation_runs.list_results(
    evaluation_run_uuid="\"123e4567-e89b-12d3-a456-426614174000\"",
)
print(response.evaluation_run)
200 Example
{
  "evaluation_run": {
    "agent_deleted": true,
    "agent_name": "example name",
    "agent_uuid": "123e4567-e89b-12d3-a456-426614174000",
    "agent_version_hash": "example string",
    "agent_workspace_uuid": "123e4567-e89b-12d3-a456-426614174000",
    "created_by_user_email": "[email protected]",
    "created_by_user_id": "12345",
    "error_description": "example string",
    "evaluation_run_uuid": "123e4567-e89b-12d3-a456-426614174000",
    "evaluation_test_case_workspace_uuid": "123e4567-e89b-12d3-a456-426614174000",
    "finished_at": "2023-01-01T00:00:00Z",
    "pass_status": true,
    "queued_at": "2023-01-01T00:00:00Z",
    "run_level_metric_results": [
      {
        "error_description": "example string",
        "metric_name": "example name",
        "metric_value_type": "METRIC_VALUE_TYPE_UNSPECIFIED",
        "number_value": 123,
        "reasoning": "example string",
        "string_value": "example string"
      }
    ],
    "run_name": "example name",
    "star_metric_result": {
      "error_description": "example string",
      "metric_name": "example name",
      "metric_value_type": "METRIC_VALUE_TYPE_UNSPECIFIED",
      "number_value": 123,
      "reasoning": "example string",
      "string_value": "example string"
    },
    "started_at": "2023-01-01T00:00:00Z",
    "status": "EVALUATION_RUN_STATUS_UNSPECIFIED",
    "test_case_description": "example string",
    "test_case_name": "example name",
    "test_case_uuid": "123e4567-e89b-12d3-a456-426614174000",
    "test_case_version": 123
  },
  "links": {
    "pages": {
      "first": "example string",
      "last": "example string",
      "next": "example string",
      "previous": "example string"
    }
  },
  "meta": {
    "page": 123,
    "pages": 123,
    "total": 123
  },
  "prompts": [
    {
      "ground_truth": "example string",
      "input": "example string",
      "input_tokens": "12345",
      "output": "example string",
      "output_tokens": "12345",
      "prompt_chunks": [
        {
          "chunk_usage_pct": 123,
          "chunk_used": true,
          "index_uuid": "123e4567-e89b-12d3-a456-426614174000",
          "source_name": "example name",
          "text": "example string"
        }
      ],
      "prompt_id": 123,
      "prompt_level_metric_results": [
        {
          "error_description": "example string",
          "metric_name": "example name",
          "metric_value_type": "METRIC_VALUE_TYPE_UNSPECIFIED",
          "number_value": 123,
          "reasoning": "example string",
          "string_value": "example string"
        }
      ]
    }
  ]
}
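
When reading the response, metric_value_type indicates which field carries each metric's value. A minimal sketch that walks the prompt-level results and retrieved chunks using the field names documented above; it assumes percentage-typed metrics are returned in number_value and that the SDK exposes response fields as attributes:

from do_gradientai import GradientAI

client = GradientAI()
response = client.agents.evaluation_runs.list_results(
    evaluation_run_uuid="123e4567-e89b-12d3-a456-426614174000",
)

for prompt in response.prompts or []:
    print("prompt", prompt.prompt_id)

    # Knowledge base chunks retrieved for this prompt.
    for chunk in prompt.prompt_chunks or []:
        if chunk.chunk_used:
            print("  chunk from", chunk.source_name, "usage:", chunk.chunk_usage_pct)

    # Prompt-level metric results; metric_value_type says which field carries the value.
    for result in prompt.prompt_level_metric_results or []:
        if result.error_description:
            value = "error: " + result.error_description
        elif result.metric_value_type == "METRIC_VALUE_TYPE_STRING":
            value = result.string_value
        else:
            # Number and percentage metrics are assumed to arrive in number_value.
            value = result.number_value
        print("  ", result.metric_name, value)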