Pioneer reuses repeated prompt prefixes (long system prompts, replayed conversation history) so you pay less and get faster responses on the cached portion. You don’t have to opt in — caching is applied automatically per provider.Documentation Index
Fetch the complete documentation index at: https://docs.pioneer.ai/llms.txt
Use this file to discover all available pages before exploring further.
What you do
Nothing. Send requests normally. Pioneer handles cache setup for you:- GPT (OpenAI) — caches prompt prefixes upstream automatically. No request changes are needed.
- Opus / Claude (Anthropic) — caches only the prefix before a
cache_controlbreakpoint. Pioneer inserts those breakpoints for you (on the system prompt, and on the latest turn of a multi-turn conversation) once a prompt is large enough to be worth caching.
cache_control breakpoints, Pioneer respects them and does not add its own.
How to read it back
Token usage on every response splits input tokens by cache status:| Field | Meaning |
|---|---|
prompt_tokens | Non-cached input tokens |
cache_read_tokens | Input tokens served from cache (discounted) |
cache_write_tokens | Input tokens written to cache this request |
completion_tokens | Output tokens |
total_tokens | Sum of the four above |
Billing
Cached input is cheaper than fresh input. Rates are relative to a model’s input price:| Provider | Cache read | Cache write |
|---|---|---|
| Claude / Opus | 0.1× input | 1.25× input |
| GPT-4 family | 0.5× input | billed at input rate |
| GPT-5 family | 0.1× input | billed at input rate |
For Anthropic models, the system prompt must be at least 1024 tokens for caching to activate. Cache entries expire after 5 minutes of inactivity.
Tips
- Keep your system prompt static. Any change to the cached prefix invalidates the cache.
- Do not inject dynamic content (timestamps, user IDs, session data) into the system prompt, move it to the user message instead.
- The Inference UI shows input and output token counts per request. Cache token breakdowns and discounted billing are visible on the Settings → Credit page.