Streaming
The Gradient SDK supports streaming on every client. We provide streaming responses using Server-Sent Events (SSE).
Examples
Streaming Serverless Inference
For example, you can access streaming serverless inference using the SDK:
import os

from do_gradientai import GradientAI

inference_client = GradientAI(
    inference_key=os.environ.get("GRADIENTAI_INFERENCE_KEY"),
)

stream = inference_client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "What is the capital of Portugal?",
        }
    ],
    stream=True,
    model="llama3.3-70b-instruct",
)

for completion in stream:
    print(completion.choices)
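Each iteration yields a chunk whose `choices[0].delta.content` carries the next piece of generated text (assuming the SDK follows the OpenAI-style chunk shape). A minimal sketch of reassembling the full reply, shown here against stand-in chunk objects rather than a live stream:

```python
from types import SimpleNamespace


def collect_stream_text(stream):
    """Concatenate the text deltas from a chat-completion stream."""
    parts = []
    for chunk in stream:
        for choice in chunk.choices:
            # delta.content can be None on some chunks (e.g. the final one)
            if choice.delta.content:
                parts.append(choice.delta.content)
    return "".join(parts)


# Stand-in chunks mimicking the streamed response shape
fake_stream = [
    SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content="Lis"))]),
    SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content="bon"))]),
    SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=None))]),
]
print(collect_stream_text(fake_stream))  # → Lisbon
```

In a real session you would pass the `stream` returned by `chat.completions.create(..., stream=True)` instead of `fake_stream`.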
Async Streaming Serverless Inference
The async client uses the exact same interface:
import os

from do_gradientai import AsyncGradientAI

inference_client = AsyncGradientAI(
    inference_key=os.environ.get("GRADIENTAI_INFERENCE_KEY"),
)

stream = await inference_client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "What is the capital of France?",
        }
    ],
    model="llama3.3-70b-instruct",
    stream=True,
)

async for completion in stream:
    print(completion.choices)
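Since the async client's calls use `await` and `async for`, they must run inside a coroutine driven by an event loop. A sketch of the typical entry-point pattern, using a stand-in async generator in place of a live stream:

```python
import asyncio


async def fake_stream():
    # Stand-in for the awaited stream; yields pre-built text pieces
    for text in ("Par", "is"):
        yield text


async def main():
    collected = []
    # In real code this would be: async for completion in stream
    async for piece in fake_stream():
        collected.append(piece)
    return "".join(collected)


result = asyncio.run(main())
print(result)  # → Paris
```

`asyncio.run()` creates the event loop, runs `main()` to completion, and closes the loop, which is usually all a script needs.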
Streaming Agent Inference
For example, you can access streaming agent inference using the SDK:
import os

from do_gradientai import GradientAI

agent_client = GradientAI(
    agent_key=os.environ.get("GRADIENTAI_AGENT_KEY"),
    agent_endpoint="https://my-agent.agents.do-ai.run",
)

stream = agent_client.agents.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "What is the capital of Portugal?",
        }
    ],
    stream=True,
    model="ignored",  # the agent endpoint selects the model; this value is not used
)

for completion in stream:
    print(completion.choices)
Async Streaming Agent Inference
The async client uses the exact same interface:
import os

from do_gradientai import AsyncGradientAI

agent_client = AsyncGradientAI(
    agent_key=os.environ.get("GRADIENTAI_AGENT_KEY"),
    agent_endpoint="https://my-agent.agents.do-ai.run",
)

stream = await agent_client.agents.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "What is the capital of Portugal?",
        }
    ],
    model="ignored",  # the agent endpoint selects the model; this value is not used
    stream=True,
)

async for completion in stream:
    print(completion.choices)