Returns an estimated input token count for a message payload without generating a response. Useful for budget checks before sending expensive prompts.
Authenticate with a JWT or API key, sent as a Bearer token in the Authorization header.
Model id (e.g. abliterated-model).
Non-empty array of message objects. Each has role (user or assistant) and content (string or content-block array).
Maximum number of tokens to generate. Required range: x >= 1.
Sampling temperature. Required range: 0 <= x <= 2.
Enable Server-Sent Events streaming.
System prompt. Can be a string or an array of content blocks.
Optional moderation categories to block. Supported: harassment, hate, illicit, sexual. Self-harm and sexual/minors are always blocked.
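The request parameters above can be assembled as in the following minimal sketch. The endpoint URL, the use of POST, and the JSON field names `model`, `messages`, and `system` are assumptions made for illustration, since this page does not state them; only `role` and `content` are taken from the descriptions above. The helper names `build_count_payload` and `build_count_request` are hypothetical.

```python
import json
import urllib.request

# Hypothetical endpoint URL; substitute the real one from the API reference.
COUNT_TOKENS_URL = "https://api.example.com/v1/messages/count_tokens"

# Roles documented for message objects.
ALLOWED_ROLES = {"user", "assistant"}


def build_count_payload(model, messages, system=None):
    """Check inputs against the documented constraints and build the JSON body.

    The field names "model", "messages", and "system" are assumed.
    """
    if not messages:
        raise ValueError("messages must be a non-empty array")
    for msg in messages:
        if msg.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"unsupported role: {msg.get('role')!r}")
        if not isinstance(msg.get("content"), (str, list)):
            raise ValueError("content must be a string or a content-block array")
    payload = {"model": model, "messages": messages}
    if system is not None:
        payload["system"] = system
    return payload


def build_count_request(token, payload, url=COUNT_TOKENS_URL):
    """Prepare (but do not send) a request authenticated with a Bearer token.

    POST is an assumption; the page does not name the HTTP method.
    """
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Under these assumptions, a caller would pass the prepared request to `urllib.request.urlopen` and parse the JSON response. Validating locally first avoids a round trip for payloads that the documented constraints would reject anyway.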
Token count
Estimated number of input tokens.
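As a sketch of the budget check mentioned at the top of this page, the estimated count can be compared against a limit before the real prompt is sent. The function `fits_budget` and its `context_limit` parameter are hypothetical; the actual limit depends on the model and is not stated here. Because the count is an estimate, leaving some headroom below the true limit is a reasonable precaution.

```python
def fits_budget(estimated_input_tokens, max_output_tokens, context_limit):
    """Return True if the input estimate plus reserved output fits the limit.

    context_limit is a caller-supplied assumption, not a value from the API.
    """
    if estimated_input_tokens < 0:
        raise ValueError("estimated_input_tokens must be non-negative")
    if max_output_tokens < 1:
        raise ValueError("max_output_tokens must be >= 1")
    return estimated_input_tokens + max_output_tokens <= context_limit


if __name__ == "__main__":
    # Illustrative numbers only: 7,500 estimated input + 1,000 reserved output.
    print(fits_budget(7500, 1000, context_limit=8192))
```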