
Retrieve Results of an Evaluation Run Prompt

client.agents.evaluationRuns.retrieveResults(promptID: number, params: EvaluationRunRetrieveResultsParams, options?: RequestOptions): EvaluationRunRetrieveResultsResponse

GET /v2/gen-ai/evaluation_runs/{evaluation_run_uuid}/results/{prompt_id}

To retrieve results of an evaluation run, send a GET request to /v2/gen-ai/evaluation_runs/{evaluation_run_uuid}/results/{prompt_id}.
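The request path can be assembled from the two identifiers above. A minimal sketch (`resultsPath` is a hypothetical helper for illustration, not part of the SDK):

```typescript
// Build the request path for this endpoint from the path template above.
function resultsPath(evaluationRunUuid: string, promptId: number): string {
  return `/v2/gen-ai/evaluation_runs/${encodeURIComponent(evaluationRunUuid)}/results/${promptId}`;
}

console.log(resultsPath('123e4567-e89b-12d3-a456-426614174000', 1));
// → /v2/gen-ai/evaluation_runs/123e4567-e89b-12d3-a456-426614174000/results/1
```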

Parameters
promptID: number
params: EvaluationRunRetrieveResultsParams { evaluation_run_uuid }
evaluation_run_uuid: string

Evaluation run UUID.

Returns
EvaluationRunRetrieveResultsResponse { prompt }
prompt?: APIEvaluationPrompt { evaluation_trace_spans, ground_truth, input, 7 more }
evaluation_trace_spans?: Array<EvaluationTraceSpan>

The evaluated trace spans.

created_at?: string

When the span was created.

Format: date-time
input?: unknown

Input data for the span (flexible structure: can be a messages array, a string, etc.).

name?: string

Name/identifier for the span.

output?: unknown

Output data from the span (flexible structure: can be a message, a string, etc.).

retriever_chunks?: Array<RetrieverChunk>

Any retriever span chunks that were included as part of the span.

chunk_usage_pct?: number

The usage percentage of the chunk.

Format: double
chunk_used?: boolean

Indicates if the chunk was used in the prompt.

index_uuid?: string

The Knowledge Base index UUID of the chunk.

source_name?: string

The source name for the chunk, e.g., the file name or document title.

text?: string

Text content of the chunk.

span_level_metric_results?: Array<APIEvaluationMetricResult { error_description, metric_name, metric_value_type, 3 more } >

The span-level metric results.

error_description?: string

Error description if the metric could not be calculated.

metric_name?: string

Metric name

metric_value_type?: "METRIC_VALUE_TYPE_UNSPECIFIED" | "METRIC_VALUE_TYPE_NUMBER" | "METRIC_VALUE_TYPE_STRING" | "METRIC_VALUE_TYPE_PERCENTAGE"
Accepts one of the following:
"METRIC_VALUE_TYPE_UNSPECIFIED"
"METRIC_VALUE_TYPE_NUMBER"
"METRIC_VALUE_TYPE_STRING"
"METRIC_VALUE_TYPE_PERCENTAGE"
number_value?: number

The value of the metric as a number.

Format: double
reasoning?: string

Reasoning of the metric result.

string_value?: string

The value of the metric as a string.

type?: "TRACE_SPAN_TYPE_UNKNOWN" | "TRACE_SPAN_TYPE_LLM" | "TRACE_SPAN_TYPE_RETRIEVER" | "TRACE_SPAN_TYPE_TOOL"

Types of spans in a trace

Accepts one of the following:
"TRACE_SPAN_TYPE_UNKNOWN"
"TRACE_SPAN_TYPE_LLM"
"TRACE_SPAN_TYPE_RETRIEVER"
"TRACE_SPAN_TYPE_TOOL"
ground_truth?: string

The ground truth for the prompt.

input?: string
input_tokens?: string

The number of input tokens used in the prompt.

Format: uint64
output?: string
output_tokens?: string

The number of output tokens used in the prompt.

Format: uint64
prompt_chunks?: Array<PromptChunk>

The list of prompt chunks.

chunk_usage_pct?: number

The usage percentage of the chunk.

Format: double
chunk_used?: boolean

Indicates if the chunk was used in the prompt.

index_uuid?: string

The Knowledge Base index UUID of the chunk.

source_name?: string

The source name for the chunk, e.g., the file name or document title.

text?: string

Text content of the chunk.
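Because each chunk reports both `chunk_used` and `chunk_usage_pct`, the chunks that actually contributed to the prompt can be filtered and ranked. A sketch (`usedChunks` is a hypothetical helper, not an SDK method):

```typescript
// Fields taken from the PromptChunk schema in this reference.
interface PromptChunk {
  chunk_usage_pct?: number;
  chunk_used?: boolean;
  index_uuid?: string;
  source_name?: string;
  text?: string;
}

// Keep only chunks marked as used, most heavily used first.
function usedChunks(chunks: PromptChunk[]): PromptChunk[] {
  return chunks
    .filter((c) => c.chunk_used)
    .sort((a, b) => (b.chunk_usage_pct ?? 0) - (a.chunk_usage_pct ?? 0));
}
```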

prompt_id?: number

Prompt ID

Format: int64
prompt_level_metric_results?: Array<APIEvaluationMetricResult { error_description, metric_name, metric_value_type, 3 more } >

The metric results for the prompt.

error_description?: string

Error description if the metric could not be calculated.

metric_name?: string

Metric name

metric_value_type?: "METRIC_VALUE_TYPE_UNSPECIFIED" | "METRIC_VALUE_TYPE_NUMBER" | "METRIC_VALUE_TYPE_STRING" | "METRIC_VALUE_TYPE_PERCENTAGE"
Accepts one of the following:
"METRIC_VALUE_TYPE_UNSPECIFIED"
"METRIC_VALUE_TYPE_NUMBER"
"METRIC_VALUE_TYPE_STRING"
"METRIC_VALUE_TYPE_PERCENTAGE"
number_value?: number

The value of the metric as a number.

Format: double
reasoning?: string

Reasoning of the metric result.

string_value?: string

The value of the metric as a string.

trace_id?: string

The trace ID for the prompt.

Example
import Gradient from '@digitalocean/gradient';

const client = new Gradient({
  accessToken: 'My Access Token',
});

const response = await client.agents.evaluationRuns.retrieveResults(1, {
  evaluation_run_uuid: '123e4567-e89b-12d3-a456-426614174000',
});

console.log(response.prompt);
{
  "prompt": {
    "evaluation_trace_spans": [
      {
        "created_at": "2023-01-01T00:00:00Z",
        "input": {},
        "name": "example name",
        "output": {},
        "retriever_chunks": [
          {
            "chunk_usage_pct": 123,
            "chunk_used": true,
            "index_uuid": "123e4567-e89b-12d3-a456-426614174000",
            "source_name": "example name",
            "text": "example string"
          }
        ],
        "span_level_metric_results": [
          {
            "error_description": "example string",
            "metric_name": "example name",
            "metric_value_type": "METRIC_VALUE_TYPE_UNSPECIFIED",
            "number_value": 123,
            "reasoning": "example string",
            "string_value": "example string"
          }
        ],
        "type": "TRACE_SPAN_TYPE_UNKNOWN"
      }
    ],
    "ground_truth": "example string",
    "input": "example string",
    "input_tokens": "12345",
    "output": "example string",
    "output_tokens": "12345",
    "prompt_chunks": [
      {
        "chunk_usage_pct": 123,
        "chunk_used": true,
        "index_uuid": "123e4567-e89b-12d3-a456-426614174000",
        "source_name": "example name",
        "text": "example string"
      }
    ],
    "prompt_id": 123,
    "prompt_level_metric_results": [
      {
        "error_description": "example string",
        "metric_name": "example name",
        "metric_value_type": "METRIC_VALUE_TYPE_UNSPECIFIED",
        "number_value": 123,
        "reasoning": "example string",
        "string_value": "example string"
      }
    ],
    "trace_id": "123e4567-e89b-12d3-a456-426614174000"
  }
}
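A response shaped like the example above can be scanned for metrics that failed to compute, at both the prompt level and every span level. This is a hypothetical helper for illustration; the field names follow the schema in this reference:

```typescript
// Only the fields this helper reads are declared here.
interface MetricResult {
  metric_name?: string;
  error_description?: string;
}

interface TraceSpan {
  span_level_metric_results?: MetricResult[];
}

interface Prompt {
  prompt_level_metric_results?: MetricResult[];
  evaluation_trace_spans?: TraceSpan[];
}

// Collect the names of all metrics that report an error description.
function failedMetrics(prompt: Prompt): string[] {
  const all = [
    ...(prompt.prompt_level_metric_results ?? []),
    ...(prompt.evaluation_trace_spans ?? []).flatMap(
      (s) => s.span_level_metric_results ?? [],
    ),
  ];
  return all
    .filter((m) => m.error_description)
    .map((m) => m.metric_name ?? '(unnamed)');
}
```

An empty array from `failedMetrics(response.prompt)` means every metric was calculated successfully.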