Streaming

The Gradient SDK supports streaming on every client. Streaming responses are delivered using Server-Sent Events (SSE).

For example, you can access streaming serverless inference using the SDK:

Python
import os

from do_gradientai import GradientAI

inference_client = GradientAI(
    inference_key=os.environ.get("GRADIENTAI_INFERENCE_KEY"),
)

stream = inference_client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "What is the capital of Portugal?",
        }
    ],
    stream=True,
    model="llama3.3-70b-instruct",
)

for completion in stream:
    print(completion.choices)
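
Each item the stream yields is an incremental chunk rather than a full response. If you want the assembled answer, you can accumulate the text as it arrives. This is a minimal sketch that replaces the loop above, assuming the chunks follow the OpenAI-style streaming schema in which new text arrives in choices[0].delta.content:

Python
answer = ""
for chunk in stream:
    # Assumption: OpenAI-style chunks carry incremental text in delta.content.
    if chunk.choices and chunk.choices[0].delta.content:
        answer += chunk.choices[0].delta.content
        print(chunk.choices[0].delta.content, end="", flush=True)
print()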

The async client uses the exact same interface:

Python
import asyncio
import os

from do_gradientai import AsyncGradientAI

inference_client = AsyncGradientAI(
    inference_key=os.environ.get("GRADIENTAI_INFERENCE_KEY"),
)

async def main() -> None:
    stream = await inference_client.chat.completions.create(
        messages=[
            {
                "role": "user",
                "content": "What is the capital of France?",
            }
        ],
        model="llama3.3-70b-instruct",
        stream=True,
    )
    async for completion in stream:
        print(completion.choices)

asyncio.run(main())
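
Because every call on the async client returns an awaitable, you can consume several streams concurrently rather than one after the other. This is a minimal sketch, reusing the client above; the ask() helper is hypothetical:

Python
# Hypothetical helper: stream one prompt and collect the full answer.
async def ask(prompt: str) -> str:
    stream = await inference_client.chat.completions.create(
        messages=[{"role": "user", "content": prompt}],
        model="llama3.3-70b-instruct",
        stream=True,
    )
    parts = []
    async for chunk in stream:
        # Assumption: OpenAI-style chunks carry incremental text in delta.content.
        if chunk.choices and chunk.choices[0].delta.content:
            parts.append(chunk.choices[0].delta.content)
    return "".join(parts)

async def run_both() -> None:
    # asyncio.gather runs both streaming requests concurrently.
    lisbon, paris = await asyncio.gather(
        ask("What is the capital of Portugal?"),
        ask("What is the capital of France?"),
    )
    print(lisbon)
    print(paris)

asyncio.run(run_both())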

Similarly, you can access streaming agent inference using the SDK:

Python
import os

from do_gradientai import GradientAI

agent_client = GradientAI(
    agent_key=os.environ.get("GRADIENTAI_AGENT_KEY"),
    agent_endpoint="https://my-agent.agents.do-ai.run",
)

stream = agent_client.agents.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "What is the capital of Portugal?",
        }
    ],
    stream=True,
    # The agent endpoint determines which model runs, so this value is ignored.
    model="ignored",
)

for completion in stream:
    print(completion.choices)

The async client uses the exact same interface:

Python
import asyncio
import os

from do_gradientai import AsyncGradientAI

agent_client = AsyncGradientAI(
    agent_key=os.environ.get("GRADIENTAI_AGENT_KEY"),
    agent_endpoint="https://my-agent.agents.do-ai.run",
)

async def main() -> None:
    stream = await agent_client.agents.chat.completions.create(
        messages=[
            {
                "role": "user",
                "content": "What is the capital of Portugal?",
            }
        ],
        # The agent endpoint determines which model runs, so this value is ignored.
        model="ignored",
        stream=True,
    )
    async for completion in stream:
        print(completion.choices)

asyncio.run(main())