Streaming delivers tokens as the model generates them, reducing time to first token.
Set stream: true on any chat completion request. The response is a sequence of server-sent events (SSE).
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.abliteration.ai/v1",
    api_key=os.environ["ABLIT_KEY"],
)

stream = client.chat.completions.create(
    model="abliterated-model",
    messages=[{"role": "user", "content": "Write a haiku about streaming"}],
    stream=True,
)

for chunk in stream:
    # content may be None on role-only or final chunks; fall back to "".
    delta = chunk.choices[0].delta.content or ""
    print(delta, end="", flush=True)
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.abliteration.ai/v1",
  apiKey: process.env.ABLIT_KEY,
});

const stream = await client.chat.completions.create({
  model: "abliterated-model",
  messages: [{ role: "user", content: "Write a haiku about streaming" }],
  stream: true,
});

for await (const chunk of stream) {
  // content is absent on role-only chunks; fall back to "".
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}
curl https://api.abliteration.ai/v1/chat/completions \
  -H "Authorization: Bearer $ABLIT_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "abliterated-model",
    "messages": [{"role": "user", "content": "Write a haiku about streaming"}],
    "stream": true
  }'
Streamed chunks arrive as SSE frames of the form data: {...}\n\n, and the stream is terminated by a final data: [DONE] frame. Most SDKs parse this for you.
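If you consume the raw HTTP response without an SDK, the frames can be parsed line by line. A minimal sketch in Python (the frame payloads below are illustrative, not captured output):

```python
import json

def iter_sse_content(lines):
    """Yield content deltas from raw SSE lines of a streamed completion."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data: "):
            continue  # skip blank separator lines between frames
        payload = line[len("data: "):]
        if payload == "[DONE]":
            return  # end-of-stream sentinel
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        if delta.get("content"):
            yield delta["content"]

# Example frames as they might appear on the wire:
frames = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    'data: {"choices": [{"delta": {"content": " world"}}]}',
    "data: [DONE]",
]
print("".join(iter_sse_content(frames)))  # -> Hello world
```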
When the model calls a tool, tool_calls arrives across multiple chunks. Accumulate function.arguments string fragments until the chunk with finish_reason: "tool_calls".
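A minimal sketch of that accumulation, operating on chunks as plain dicts so it is self-contained (the fragment values and call id are illustrative):

```python
def accumulate_tool_calls(chunks):
    """Merge streamed tool_calls fragments into complete calls, keyed by index."""
    calls = {}
    for chunk in chunks:
        choice = chunk["choices"][0]
        for tc in choice.get("delta", {}).get("tool_calls") or []:
            entry = calls.setdefault(
                tc["index"], {"id": None, "name": None, "arguments": ""}
            )
            fn = tc.get("function", {})
            if tc.get("id"):
                entry["id"] = tc["id"]          # id arrives once, on the first fragment
            if fn.get("name"):
                entry["name"] = fn["name"]      # name arrives once as well
            if fn.get("arguments"):
                entry["arguments"] += fn["arguments"]  # JSON string builds up piecewise
        if choice.get("finish_reason") == "tool_calls":
            break
    return calls

# Illustrative fragments spread across chunks:
chunks = [
    {"choices": [{"delta": {"tool_calls": [{"index": 0, "id": "call_1",
        "function": {"name": "get_weather", "arguments": ""}}]}}]},
    {"choices": [{"delta": {"tool_calls": [{"index": 0,
        "function": {"arguments": '{"city": '}}]}}]},
    {"choices": [{"delta": {"tool_calls": [{"index": 0,
        "function": {"arguments": '"Paris"}'}}]}}]},
    {"choices": [{"delta": {}, "finish_reason": "tool_calls"}]},
]
result = accumulate_tool_calls(chunks)
print(result[0]["name"], result[0]["arguments"])  # -> get_weather {"city": "Paris"}
```

Once finish_reason is "tool_calls", each accumulated arguments string is complete JSON and can be passed to json.loads.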
See tool calling for a complete example.