Model Router

The Problem

Agentic coding agents run hundreds to thousands of LLM calls per session. Not every call needs a frontier model. A task that classifies a diff or generates a docstring doesn’t need Claude Opus. But switching models manually mid-agent is cumbersome, and agents running autonomously have no mechanism to do it themselves.

What It Does

The Pioneer Router is a low-latency model router trained on coding tasks. It reads the messages in each request, scores every candidate model against the task, and routes to the cheapest model that meets your quality bar - automatically, on every request. The router sits transparently in front of your inference calls. Your code sends requests exactly as before; the router decides which model executes them. Send model: "pioneer/auto" on Anthropic-compatible requests (or run claude --model pioneer/auto in Claude Code). Routing details — selected model, confidence, rule, and savings — are stored in inferences.metadata.model_routing and visible in the Pioneer dashboard.

Every routing decision is logged with the selected model, confidence score, expected cost, and the rule that fired. These are visible in the Router detail page and the inference detail view.

Getting started

Model Routing currently works with coding tasks. You can try it in the playground or connect it through your coding harness.

Routing Playground

The Routing Playground lets you test the router’s decisions before enabling it on live traffic. Paste any prompt or a real conversation and the playground shows which model the router would select, the confidence score, and the expected cost savings versus routing everything to your fallback.

Integrate with your coding harness

You can also use Model Routing through any of the supported coding agents. See the integration docs for setup instructions: Claude Code, Codex, Cursor, OpenCode, OpenClaw, Hermes.

Monitoring

To see your live inference requests, where each one was routed, and the cost savings from using the Pioneer Router, go to agent.pioneer.ai/routers.

How It Works

Read the messages

The router classifies the task complexity from the conversation, e.g. trivial lookup vs. multi-file refactor.

Score every candidate model

It produces a calibrated success probability for each model on this specific task.

Apply your policy

It selects the cheapest model whose score clears your configured thresholds.

Fall back gracefully

If no candidate clears the bar, or the router is unreachable, the request falls back to your configured fallback model without error.

Candidate Models

The router selects from this pool. Use the Candidate Models setting to restrict it to a subset.

We are continue adding more coding models to the candidate pool. Check the platform for the most up-to-date list.

Model	Provider
DeepSeek V4 Flash	DeepSeek
DeepSeek V4 Pro	DeepSeek
GLM 5.2	ZhipuAI
Claude Sonnet 4.6	Anthropic
GPT-5.5	OpenAI
Claude Opus 4.7	Anthropic

Parameters

float

default:"0.20"

The minimum calibrated success probability a candidate must reach to be selected. The router scores each model 0–1, representing its predicted likelihood of succeeding on this task. A model below this floor is never selected, even if it’s the cheapest.

Lower values (e.g. 0.10) — more aggressive cost savings; the router will take chances on cheaper models
Higher values (e.g. 0.50) — router only routes when confident; falls back to your fallback model more often

float

default:"0.15"

The maximum allowed probability gap between the chosen model and the top-scoring candidate. If the cheapest qualifying model scores 0.80 but the best scores 0.96, the gap is 0.16 — above the default 0.15 — so the router moves up to a more capable model.Think of it as: never pick a model more than X worse than the best available option.

Lower values — router stays closer to the top-performing model; fewer cost savings
Higher values — router accepts a wider quality gap in exchange for lower cost

string

default:"claude-sonnet-4-6"

The model used when the router declines to route — either because no candidate cleared the threshold, or because the router service was unreachable (transport error or timeout). The fallback always runs the full inference path.

If you set allowed_models, the fallback is automatically pinned to your first allowed model unless you specify one explicitly.

string[]

default:"all candidates"

An optional allowlist of models the router may select from. When set, the router only considers models in this list. Useful for:

Restricting to a specific provider (e.g. Anthropic-only)
Removing models that underperform on your specific codebase
Cost caps — exclude the most expensive candidates entirely

Routing effort presets

Tier	Threshold	Max regret	Candidate pool
`low`	0.05	0.30	Cheapest models, max savings
`medium`	0.10	0.20	Good quality, good savings
`high`	0.20	0.15	Recommended settings
`xhigh`	0.35	0.08	Prefer stronger models
`max`	0.60	0.03	Best model every time

Reading Routing Decisions

Every inference detail view includes a routing block when a request went through the router.

string

The model that actually ran the request.

float

The router’s calibrated success probability for the selected model on this task.

string

The policy rule that determined the outcome: threshold, max_regret, or fallback_declined.

float

Cost saved versus routing the same request to the most expensive candidate.

float

Cost saved as a fraction of the most expensive candidate’s cost (0–1).

string[]

Diagnostic codes from the router’s internal classifiers. Useful for understanding why a specific routing decision was made.

Get Started

Integrations

Core Concepts

API Reference

Guides

Account

The Problem

What It Does

Getting started

Routing Playground

Integrate with your coding harness

Monitoring

How It Works

Candidate Models

Parameters

Routing effort presets

Reading Routing Decisions

​The Problem

​What It Does

​Getting started

​Routing Playground

​Integrate with your coding harness

​Monitoring

​How It Works

​Candidate Models

​Parameters

​Routing effort presets

​Reading Routing Decisions

The Problem

What It Does

Getting started

Routing Playground

Integrate with your coding harness

Monitoring

How It Works

Candidate Models

Parameters

Routing effort presets

Reading Routing Decisions