429 Too Many Requests response.
Limits by endpoint
| Endpoint | Scope | Limit |
|---|---|---|
| All endpoints (default) | Per client IP | 1,000 / min · 10,000 / hour |
| POST /inference | Per user | 1,200 / min |
| POST /v1/chat/completions, /v1/completions, /v1/responses, /v1/messages | Per user | 200 / min |
| POST /gliner-2/* | Per user | 15,000 / min |
| POST /generate/* | Per user | 120 / min |
| POST /felix/training-jobs | Per user | 20 / min |
Pro and Research plan subscribers have higher per-user rate limits than those shown above for the inference endpoints. Upgrade your plan at pioneer.ai/billing to unlock higher limits.
Handling 429 responses
When you exceed a limit, the API returns 429 Too Many Requests and includes a Retry-After header that tells you how many seconds to wait before retrying.
cURL
Handle 429 responses with a simple sleep-and-retry loop:
Python
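The following is a minimal sketch of a sleep-and-retry loop, not an official client. It uses only the Python standard library and assumes that Retry-After carries a number of seconds, as described above. The helper names `parse_retry_after` and `request_with_retry`, the one-second fallback, and the cap of five attempts are illustrative choices rather than part of the API.

```python
import time
import urllib.error
import urllib.request


def parse_retry_after(value, default=1.0):
    """Convert a Retry-After header value (seconds) to a float.

    Falls back to `default` (an assumed value) if the header is
    missing or is not a plain number.
    """
    try:
        return max(0.0, float(value))
    except (TypeError, ValueError):
        return default


def request_with_retry(req, max_attempts=5,
                       opener=urllib.request.urlopen, sleep=time.sleep):
    """Send `req`, sleeping and retrying whenever the API answers 429.

    `req` is a URL string or a urllib.request.Request. `opener` and
    `sleep` are injectable so the loop can be exercised without a network.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            with opener(req) as resp:
                return resp.read()
        except urllib.error.HTTPError as err:
            # Re-raise anything that is not a rate limit, and give up
            # once the final attempt has also been rate limited.
            if err.code != 429 or attempt == max_attempts:
                raise
            sleep(parse_retry_after(err.headers.get("Retry-After")))


# Example call (hypothetical URL; substitute your real endpoint and auth):
# req = urllib.request.Request(
#     "https://api.example.com/inference",
#     data=b"{}",
#     headers={"Content-Type": "application/json"},
#     method="POST",
# )
# body = request_with_retry(req)
```

Capping the number of attempts keeps a persistently rate-limited client from looping forever. Once the final attempt is also rejected, the 429 error is raised to the caller.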

