Retrieve Results of an Evaluation Run Prompt

agents.evaluation_runs.retrieve_results(prompt_id: int, **kwargs: EvaluationRunRetrieveResultsParams) -> EvaluationRunRetrieveResultsResponse
GET /v2/gen-ai/evaluation_runs/{evaluation_run_uuid}/results/{prompt_id}

To retrieve the results for a single prompt of an evaluation run, send a GET request to /v2/gen-ai/evaluation_runs/{evaluation_run_uuid}/results/{prompt_id}.
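The request path has two placeholders, so it can help to see them filled in. The following is a minimal sketch, not part of the SDK: `build_results_path` and `RESULTS_PATH` are hypothetical names introduced here, and the API host is deliberately left out because this page does not state it.

```python
from urllib.parse import quote

# Path template copied from the endpoint description above.
RESULTS_PATH = "/v2/gen-ai/evaluation_runs/{evaluation_run_uuid}/results/{prompt_id}"


def build_results_path(evaluation_run_uuid: str, prompt_id: int) -> str:
    """Fill in the two path parameters of the results endpoint (hypothetical helper)."""
    if not evaluation_run_uuid:
        raise ValueError("evaluation_run_uuid must be a non-empty string")
    return RESULTS_PATH.format(
        # Percent-encode the UUID so unexpected characters cannot alter the path.
        evaluation_run_uuid=quote(evaluation_run_uuid, safe=""),
        # prompt_id is documented as an integer.
        prompt_id=int(prompt_id),
    )


path = build_results_path("123e4567-e89b-12d3-a456-426614174000", 1)
print(path)
```

When using the SDK, the client presumably assembles this path itself; the sketch is only meant to make the mapping from arguments to URL segments explicit.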

Parameters
evaluation_run_uuid: str
prompt_id: int
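Because `evaluation_run_uuid` is typed as a plain string, a malformed value is only rejected once the request is sent. The sketch below checks the value locally first. It assumes the identifier is a standard UUID, which the example value on this page suggests but the page does not state; `normalize_run_uuid` is a hypothetical helper, not an SDK function.

```python
import uuid


def normalize_run_uuid(value: str) -> str:
    """Return the canonical form of a run UUID, or raise ValueError (hypothetical helper)."""
    # Drop surrounding whitespace and stray double quotes, a common copy-paste slip.
    cleaned = value.strip().strip('"')
    # uuid.UUID raises ValueError when the string is not a well-formed UUID.
    return str(uuid.UUID(cleaned))


run_uuid = normalize_run_uuid('"123e4567-e89b-12d3-a456-426614174000"')
print(run_uuid)
```

The returned string can then be passed as `evaluation_run_uuid`.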
Returns
class EvaluationRunRetrieveResultsResponse:
prompt: Optional[APIEvaluationPrompt]
evaluation_trace_spans: Optional[List[EvaluationTraceSpan]]

The evaluated trace spans.

created_at: Optional[datetime]

When the span was created

format: date-time
input: Optional[object]

Input data for the span (flexible structure - can be messages array, string, etc.)

name: Optional[str]

Name/identifier for the span

output: Optional[object]

Output data from the span (flexible structure - can be message, string, etc.)

retriever_chunks: Optional[List[EvaluationTraceSpanRetrieverChunk]]

Any retriever span chunks that were included as part of the span.

chunk_usage_pct: Optional[float]

The usage percentage of the chunk.

format: double
chunk_used: Optional[bool]

Indicates if the chunk was used in the prompt.

index_uuid: Optional[str]

The index uuid (Knowledge Base) of the chunk.

source_name: Optional[str]

The source name for the chunk, e.g., the file name or document title.

text: Optional[str]

Text content of the chunk.

span_level_metric_results: Optional[List[APIEvaluationMetricResult]]

The span-level metric results.

error_description: Optional[str]

Error description if the metric could not be calculated.

metric_name: Optional[str]

Metric name

metric_value_type: Optional[Literal["METRIC_VALUE_TYPE_UNSPECIFIED", "METRIC_VALUE_TYPE_NUMBER", "METRIC_VALUE_TYPE_STRING", "METRIC_VALUE_TYPE_PERCENTAGE"]]
Accepts one of the following:
"METRIC_VALUE_TYPE_UNSPECIFIED"
"METRIC_VALUE_TYPE_NUMBER"
"METRIC_VALUE_TYPE_STRING"
"METRIC_VALUE_TYPE_PERCENTAGE"
number_value: Optional[float]

The value of the metric as a number.

format: double
reasoning: Optional[str]

Reasoning of the metric result.

string_value: Optional[str]

The value of the metric as a string.

type: Optional[Literal["TRACE_SPAN_TYPE_UNKNOWN", "TRACE_SPAN_TYPE_LLM", "TRACE_SPAN_TYPE_RETRIEVER", "TRACE_SPAN_TYPE_TOOL"]]

Types of spans in a trace

Accepts one of the following:
"TRACE_SPAN_TYPE_UNKNOWN"
"TRACE_SPAN_TYPE_LLM"
"TRACE_SPAN_TYPE_RETRIEVER"
"TRACE_SPAN_TYPE_TOOL"
ground_truth: Optional[str]

The ground truth for the prompt.

input: Optional[str]
input_tokens: Optional[str]

The number of input tokens used in the prompt.

format: uint64
output: Optional[str]
output_tokens: Optional[str]

The number of output tokens used in the prompt.

format: uint64
prompt_chunks: Optional[List[PromptChunk]]

The list of prompt chunks.

chunk_usage_pct: Optional[float]

The usage percentage of the chunk.

format: double
chunk_used: Optional[bool]

Indicates if the chunk was used in the prompt.

index_uuid: Optional[str]

The index uuid (Knowledge Base) of the chunk.

source_name: Optional[str]

The source name for the chunk, e.g., the file name or document title.

text: Optional[str]

Text content of the chunk.

prompt_id: Optional[int]

Prompt ID

format: int64
prompt_level_metric_results: Optional[List[APIEvaluationMetricResult]]

The metric results for the prompt.

error_description: Optional[str]

Error description if the metric could not be calculated.

metric_name: Optional[str]

Metric name

metric_value_type: Optional[Literal["METRIC_VALUE_TYPE_UNSPECIFIED", "METRIC_VALUE_TYPE_NUMBER", "METRIC_VALUE_TYPE_STRING", "METRIC_VALUE_TYPE_PERCENTAGE"]]
Accepts one of the following:
"METRIC_VALUE_TYPE_UNSPECIFIED"
"METRIC_VALUE_TYPE_NUMBER"
"METRIC_VALUE_TYPE_STRING"
"METRIC_VALUE_TYPE_PERCENTAGE"
number_value: Optional[float]

The value of the metric as a number.

format: double
reasoning: Optional[str]

Reasoning of the metric result.

string_value: Optional[str]

The value of the metric as a string.

trace_id: Optional[str]

The trace id for the prompt.
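Each metric result carries both `number_value` and `string_value`, with `metric_value_type` indicating which one is meaningful. The following sketch shows one way to pick the right field. It works on plain dictionaries shaped like the JSON response and rests on two assumptions not confirmed by this page: that percentage metrics are stored in `number_value`, and that a non-empty `error_description` means no usable value exists. `metric_value` is a hypothetical helper.

```python
from typing import Any, Dict, Optional, Union


def metric_value(result: Dict[str, Any]) -> Optional[Union[float, str]]:
    """Return the populated value of one metric result, or None (hypothetical helper)."""
    # Assumption: a non-empty error description means the metric was not calculated.
    if result.get("error_description"):
        return None
    value_type = result.get("metric_value_type")
    if value_type == "METRIC_VALUE_TYPE_STRING":
        return result.get("string_value")
    # Assumption: percentages are reported through number_value, like plain numbers.
    if value_type in ("METRIC_VALUE_TYPE_NUMBER", "METRIC_VALUE_TYPE_PERCENTAGE"):
        return result.get("number_value")
    # METRIC_VALUE_TYPE_UNSPECIFIED or a missing type: nothing reliable to return.
    return None


sample = {
    "metric_name": "example name",
    "metric_value_type": "METRIC_VALUE_TYPE_NUMBER",
    "number_value": 0.75,
}
print(sample["metric_name"], metric_value(sample))
```

The same helper applies to entries of both `span_level_metric_results` and `prompt_level_metric_results`, since they share the `APIEvaluationMetricResult` type.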

Retrieve Results of an Evaluation Run Prompt
from gradient import Gradient

client = Gradient(
    access_token="My Access Token",
)
response = client.agents.evaluation_runs.retrieve_results(
    prompt_id=1,
    evaluation_run_uuid="123e4567-e89b-12d3-a456-426614174000",
)
print(response.prompt)
Returns Examples
{
  "prompt": {
    "evaluation_trace_spans": [
      {
        "created_at": "2023-01-01T00:00:00Z",
        "input": {},
        "name": "example name",
        "output": {},
        "retriever_chunks": [
          {
            "chunk_usage_pct": 123,
            "chunk_used": true,
            "index_uuid": "123e4567-e89b-12d3-a456-426614174000",
            "source_name": "example name",
            "text": "example string"
          }
        ],
        "span_level_metric_results": [
          {
            "error_description": "example string",
            "metric_name": "example name",
            "metric_value_type": "METRIC_VALUE_TYPE_UNSPECIFIED",
            "number_value": 123,
            "reasoning": "example string",
            "string_value": "example string"
          }
        ],
        "type": "TRACE_SPAN_TYPE_UNKNOWN"
      }
    ],
    "ground_truth": "example string",
    "input": "example string",
    "input_tokens": "12345",
    "output": "example string",
    "output_tokens": "12345",
    "prompt_chunks": [
      {
        "chunk_usage_pct": 123,
        "chunk_used": true,
        "index_uuid": "123e4567-e89b-12d3-a456-426614174000",
        "source_name": "example name",
        "text": "example string"
      }
    ],
    "prompt_id": 123,
    "prompt_level_metric_results": [
      {
        "error_description": "example string",
        "metric_name": "example name",
        "metric_value_type": "METRIC_VALUE_TYPE_UNSPECIFIED",
        "number_value": 123,
        "reasoning": "example string",
        "string_value": "example string"
      }
    ],
    "trace_id": "123e4567-e89b-12d3-a456-426614174000"
  }
}
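Note that `input_tokens` and `output_tokens` arrive as strings in the response above, so they need converting before any arithmetic. The sketch below condenses one `prompt` object into a few headline figures. It operates on a plain dictionary shaped like the example JSON; `summarize_prompt` is a hypothetical helper, and treating a missing span `type` as `TRACE_SPAN_TYPE_UNKNOWN` is an assumption made for illustration.

```python
from typing import Any, Dict, List


def summarize_prompt(prompt: Dict[str, Any]) -> Dict[str, Any]:
    """Condense one "prompt" object into a few headline figures (hypothetical helper)."""
    # Token counts are string-encoded uint64 values, so convert them explicitly.
    input_tokens = int(prompt.get("input_tokens") or 0)
    output_tokens = int(prompt.get("output_tokens") or 0)

    # Keep only the source names of chunks flagged as used in the prompt.
    used_sources: List[str] = [
        chunk.get("source_name", "")
        for chunk in prompt.get("prompt_chunks") or []
        if chunk.get("chunk_used")
    ]

    # Count trace spans per span type; a missing type is treated as unknown (assumption).
    span_counts: Dict[str, int] = {}
    for span in prompt.get("evaluation_trace_spans") or []:
        span_type = span.get("type") or "TRACE_SPAN_TYPE_UNKNOWN"
        span_counts[span_type] = span_counts.get(span_type, 0) + 1

    return {
        "total_tokens": input_tokens + output_tokens,
        "used_sources": used_sources,
        "span_counts": span_counts,
    }


example_prompt = {
    "input_tokens": "12345",
    "output_tokens": "12345",
    "prompt_chunks": [{"chunk_used": True, "source_name": "example name"}],
    "evaluation_trace_spans": [{"type": "TRACE_SPAN_TYPE_UNKNOWN"}],
}
print(summarize_prompt(example_prompt))
```

Every field in the response is optional, which is why the helper falls back to empty lists and zero counts rather than indexing directly.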