Streaming

The Gradient SDK supports streaming on every client. Streaming responses are delivered using Server-Sent Events (SSE).

For example, you can access streaming serverless inference using the SDK:

Python
import os

from do_gradientai import GradientAI

inference_client = GradientAI(
    inference_key=os.environ.get("GRADIENTAI_INFERENCE_KEY"),
)

stream = inference_client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "What is the capital of Portugal?",
        }
    ],
    stream=True,
    model="llama3.3-70b-instruct",
)

for completion in stream:
    print(completion.choices)
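
Each item the stream yields is an incremental chunk rather than a full response. If you want the assembled answer, you can accumulate the text as it arrives. This is a minimal sketch that replaces the loop above, assuming the chunks follow the OpenAI-style streaming schema in which new text arrives in choices[0].delta.content:

Python
answer = ""
for chunk in stream:
    # Assumption: OpenAI-style chunks carry incremental text in delta.content.
    if chunk.choices and chunk.choices[0].delta.content:
        answer += chunk.choices[0].delta.content
        print(chunk.choices[0].delta.content, end="", flush=True)
print()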

The async client uses the exact same interface:

Python
import asyncio
import os

from do_gradientai import AsyncGradientAI

inference_client = AsyncGradientAI(
    inference_key=os.environ.get("GRADIENTAI_INFERENCE_KEY"),
)

async def main() -> None:
    stream = await inference_client.chat.completions.create(
        messages=[
            {
                "role": "user",
                "content": "What is the capital of France?",
            }
        ],
        model="llama3.3-70b-instruct",
        stream=True,
    )
    async for completion in stream:
        print(completion.choices)

asyncio.run(main())
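
Because every call on the async client returns an awaitable, you can consume several streams concurrently rather than one after the other. This is a minimal sketch, reusing the client above; the ask() helper is hypothetical:

Python
# Hypothetical helper: stream one prompt and collect the full answer.
async def ask(prompt: str) -> str:
    stream = await inference_client.chat.completions.create(
        messages=[{"role": "user", "content": prompt}],
        model="llama3.3-70b-instruct",
        stream=True,
    )
    parts = []
    async for chunk in stream:
        # Assumption: OpenAI-style chunks carry incremental text in delta.content.
        if chunk.choices and chunk.choices[0].delta.content:
            parts.append(chunk.choices[0].delta.content)
    return "".join(parts)

async def run_both() -> None:
    # asyncio.gather runs both streaming requests concurrently.
    lisbon, paris = await asyncio.gather(
        ask("What is the capital of Portugal?"),
        ask("What is the capital of France?"),
    )
    print(lisbon)
    print(paris)

asyncio.run(run_both())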

Similarly, you can access streaming agent inference using the SDK:

Python
import os

from do_gradientai import GradientAI

agent_client = GradientAI(
    agent_key=os.environ.get("GRADIENTAI_AGENT_KEY"),
    agent_endpoint="https://my-agent.agents.do-ai.run",
)

stream = agent_client.agents.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "What is the capital of Portugal?",
        }
    ],
    stream=True,
    # The agent endpoint determines which model runs, so this value is ignored.
    model="ignored",
)

for completion in stream:
    print(completion.choices)

The async client uses the exact same interface:

Python
import asyncio
import os

from do_gradientai import AsyncGradientAI

agent_client = AsyncGradientAI(
    agent_key=os.environ.get("GRADIENTAI_AGENT_KEY"),
    agent_endpoint="https://my-agent.agents.do-ai.run",
)

async def main() -> None:
    stream = await agent_client.agents.chat.completions.create(
        messages=[
            {
                "role": "user",
                "content": "What is the capital of Portugal?",
            }
        ],
        # The agent endpoint determines which model runs, so this value is ignored.
        model="ignored",
        stream=True,
    )
    async for completion in stream:
        print(completion.choices)

asyncio.run(main())