The Problem
Coding agents run hundreds to thousands of LLM calls per session. Not every call needs a frontier model: a task that classifies a diff or generates a docstring doesn’t need Claude Opus. But switching models manually mid-session is cumbersome, and agents running autonomously have no mechanism to do it themselves.
What It Does
The Pioneer Code Router is a low-latency model router trained on coding tasks. It reads the messages in each request, scores every candidate model against the task, and routes to the cheapest model that meets your quality bar, automatically, on every request. The router sits transparently in front of your inference calls. Your code sends requests exactly as before; the router decides which model executes them.
Every routing decision is logged with the selected model, confidence score, expected cost, and the rule that fired. These are visible in the Router detail page and the inference detail view.
How It Works
Read the messages
The router classifies the task complexity from the conversation — e.g. trivial lookup vs. multi-file refactor.
Score every candidate model
It produces a calibrated success probability for each model on this specific task.
Candidate Models
The router selects from this pool. Use the Candidate Models setting to restrict it to a subset.

| Model | Provider |
|---|---|
| DeepSeek V4 Flash | DeepSeek |
| Qwen 3.6 35B-A3B | Qwen |
| DeepSeek V4 Pro | DeepSeek |
| GLM 5.1 | ZhipuAI |
| Claude Haiku 4.5 | Anthropic |
| Claude Sonnet 4.6 | Anthropic |
| GPT-5.5 | OpenAI |
| Claude Opus 4.7 | Anthropic |
Parameters
The minimum calibrated success probability a candidate must reach to be selected. The router scores each model 0–1, representing its predicted likelihood of succeeding on this task. A model below this floor is never selected, even if it’s the cheapest.
- Lower values (e.g. 0.10): more aggressive cost savings; the router will take chances on cheaper models
- Higher values (e.g. 0.50): the router only routes when confident and falls back to your fallback model more often
The maximum allowed probability gap between the chosen model and the top-scoring candidate. If the cheapest qualifying model scores 0.80 but the best scores 0.96, the gap is 0.16, which is above the default of 0.15, so the router moves up to a more capable model.
Think of it as: never pick a model more than X worse than the best available option.
- Lower values: the router stays closer to the top-performing model; fewer cost savings
- Higher values: the router accepts a wider quality gap in exchange for lower cost
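The threshold and regret rules described above can be sketched as a small selection function. This is a minimal illustration under stated assumptions, not the router's actual implementation: the `Candidate` type, the `select_model` function, the relative cost numbers, and the choice to measure the gap against the top score across all candidates are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    name: str         # hypothetical model label
    cost: float       # hypothetical relative cost per request
    p_success: float  # calibrated success probability, 0 to 1


def select_model(
    candidates: Sequence[Candidate],
    threshold: float,
    max_regret: float = 0.15,  # default stated in the text above
) -> Optional[Candidate]:
    """Return the cheapest candidate that clears the probability floor and
    stays within max_regret of the top score. None means: use the fallback."""
    if not candidates:
        return None
    best = max(c.p_success for c in candidates)
    eps = 1e-9  # tolerance for floating-point noise at the boundary
    eligible = [
        c for c in candidates
        if c.p_success >= threshold and best - c.p_success <= max_regret + eps
    ]
    if not eligible:
        return None
    return min(eligible, key=lambda c: c.cost)


# Worked example mirroring the text: 0.80 vs. 0.96 is a gap of 0.16.
pool = [
    Candidate("small", cost=1.0, p_success=0.80),
    Candidate("mid", cost=5.0, p_success=0.90),
    Candidate("large", cost=20.0, p_success=0.96),
]
choice = select_model(pool, threshold=0.50)
print(choice.name)  # mid
```

With the default bound of 0.15 the cheapest model is skipped because its gap is 0.16, and the next cheapest model within the bound is chosen. Raising `max_regret` to 0.20 in this sketch would let the cheapest model through.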
A wider regret bound applied when the router’s internal risk classifiers don’t fire. The router detects high-stakes patterns (large multi-file refactors, security-sensitive code). When none of those signals fire, the task is considered low-risk and a wider regret budget applies — meaning the router routes to cheaper models more aggressively on simple tasks and stays conservative on complex ones.
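As a rough sketch of how the two regret bounds might interact, the snippet below picks the bound based on whether any risk signal fired. The function names, the signal label, and the 0.25 low-risk value are assumptions made for illustration; the text above does not state them.

```python
from typing import Sequence


def effective_max_regret(
    risk_signals: Sequence[str],
    max_regret: float,
    low_risk_max_regret: float,
) -> float:
    """Use the standard bound when any risk signal fired, and the wider
    low-risk bound when none did."""
    return max_regret if risk_signals else low_risk_max_regret


def within_regret(p_candidate: float, p_best: float, bound: float) -> bool:
    """True if the candidate's gap to the best score fits inside the bound."""
    return p_best - p_candidate <= bound + 1e-9  # tolerance for float noise


# A gap of 0.16 passes on a low-risk task but not on a high-risk one
# (0.25 is a hypothetical low-risk bound, 0.15 the stated default).
low_risk = effective_max_regret([], max_regret=0.15, low_risk_max_regret=0.25)
high_risk = effective_max_regret(
    ["multi_file_refactor"],  # hypothetical signal name
    max_regret=0.15,
    low_risk_max_regret=0.25,
)
print(within_regret(0.80, 0.96, low_risk))   # True
print(within_regret(0.80, 0.96, high_risk))  # False
```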
The model used when the router declines to route — either because no candidate cleared the threshold, or because the router service was unreachable (transport error or timeout). The fallback always runs the full inference path.
If you set allowed_models, the fallback is automatically pinned to your first allowed model unless you specify one explicitly.
An optional allowlist of models the router may select from. When set, the router only considers models in this list. Useful for:
- Restricting to a specific provider (e.g. Anthropic-only)
- Removing models that underperform on your specific codebase
- Cost caps — exclude the most expensive candidates entirely
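The allowlist and fallback-pinning behavior described above could look roughly like the following. This is a sketch under assumptions: `resolve_pool` is a hypothetical helper, and the strings are the display names from the Candidate Models table, which may differ from the identifiers the real setting expects.

```python
from typing import List, Optional, Sequence, Tuple

# Display names from the Candidate Models table; real identifiers may differ.
CANDIDATE_POOL = [
    "DeepSeek V4 Flash", "Qwen 3.6 35B-A3B", "DeepSeek V4 Pro", "GLM 5.1",
    "Claude Haiku 4.5", "Claude Sonnet 4.6", "GPT-5.5", "Claude Opus 4.7",
]


def resolve_pool(
    allowed_models: Optional[Sequence[str]] = None,
    fallback: Optional[str] = None,
) -> Tuple[List[str], Optional[str]]:
    """Return (models the router may pick from, fallback model)."""
    if not allowed_models:
        # No allowlist: the full pool is eligible, fallback is left as given.
        return list(CANDIDATE_POOL), fallback
    allowed = set(allowed_models)
    pool = [m for m in CANDIDATE_POOL if m in allowed]
    # Pin the fallback to the first allowed model unless one was specified.
    pinned = fallback if fallback is not None else allowed_models[0]
    return pool, pinned


# Anthropic-only routing with no explicit fallback.
pool, fb = resolve_pool(["Claude Sonnet 4.6", "Claude Haiku 4.5"])
print(fb)  # Claude Sonnet 4.6
```

Note that in this sketch the pinned fallback follows the order of the allowlist as written, not the order of the pool.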
Routing Playground
The Routing Playground lets you test the router’s decisions before enabling it on live traffic. Paste any prompt or a real conversation and the playground shows which model the router would select, the confidence score, and the expected cost savings versus routing everything to your fallback.
Reading Routing Decisions
Every inference detail view includes a routing block when a request went through the router.
The model that actually ran the request.
The router’s calibrated success probability for the selected model on this task.
The policy rule that determined the outcome: threshold, max_regret, low_risk_max_regret, or fallback_declined.
Cost saved versus routing the same request to the most expensive candidate.
Diagnostic codes from the router’s internal classifiers. Useful for understanding why a specific routing decision was made.