Rate limits are per API key and vary by plan. Check your current limits in the console.

Headers

Every response includes the following rate-limit headers (sample values shown):
x-ratelimit-limit-requests: 600
x-ratelimit-remaining-requests: 597
x-ratelimit-limit-tokens: 2000000
x-ratelimit-remaining-tokens: 1998342
x-ratelimit-reset-requests: 2026-04-20T18:30:00Z
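
The headers above can be read client-side to slow down before a 429 occurs. The following is a minimal sketch, not an official helper: `read_rate_limit` is a hypothetical name, the mapping is assumed to use lowercase header names as keys, and the reset value is assumed to be an ISO 8601 UTC timestamp as in the sample.

```python
from datetime import datetime, timezone


def read_rate_limit(headers, now=None):
    """Summarize the x-ratelimit-* headers from one response (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    # Older Python versions reject a trailing "Z" in fromisoformat(), so normalize it.
    reset_raw = headers["x-ratelimit-reset-requests"].replace("Z", "+00:00")
    reset = datetime.fromisoformat(reset_raw)
    return {
        "requests_left": int(headers["x-ratelimit-remaining-requests"]),
        "tokens_left": int(headers["x-ratelimit-remaining-tokens"]),
        # Clamp at zero in case the reset time has already passed.
        "seconds_until_reset": max(0.0, (reset - now).total_seconds()),
    }
```

A caller might pause for `seconds_until_reset` when `requests_left` reaches zero. With the OpenAI Python SDK, response headers should be reachable through `client.chat.completions.with_raw_response.create(...)`, which returns an object with a `.headers` attribute and a `.parse()` method.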

Handling 429

On 429 Too Many Requests, honor the Retry-After header and back off:
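import os
import random
import time

from openai import OpenAI, RateLimitError

client = OpenAI(base_url="https://api.abliteration.ai/v1", api_key=os.environ["ABLIT_KEY"])

def create_with_retry(msgs, max_attempts=5):
    for attempt in range(max_attempts):
        try:
            return client.chat.completions.create(model="abliterated-model", messages=msgs)
        except RateLimitError as e:
            if attempt == max_attempts - 1:
                raise  # out of retries; surface the 429 to the caller
            # Prefer the server's Retry-After (seconds); otherwise back off exponentially.
            wait = float(e.response.headers.get("retry-after", 2 ** attempt))
            # Random jitter keeps concurrent clients from retrying in lockstep.
            time.sleep(wait + random.random())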
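
HTTP allows Retry-After to carry either a number of seconds or an HTTP-date, and a direct numeric conversion raises on the date form. Which form this API sends is not stated here, so the following is a defensive sketch only; `parse_retry_after` is a hypothetical helper, not part of any SDK.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after(value, fallback, now=None):
    """Return seconds to wait for a Retry-After value, or `fallback` if unusable."""
    if value is None:
        return float(fallback)
    try:
        # Common case: a plain number of seconds.
        return max(0.0, float(value))
    except ValueError:
        pass
    try:
        # Otherwise try an HTTP-date such as "Mon, 20 Apr 2026 18:30:00 GMT".
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return float(fallback)
    if when.tzinfo is None:
        # Treat a date without an offset as UTC.
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

In the retry loop above, the raw value from `e.response.headers.get("retry-after")` could be passed to this helper, with `2 ** attempt` as the fallback, in place of the direct numeric conversion.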

Raising limits

Upgrade your plan or contact support@abliteration.ai for custom quotas.