Use GET /base-models to query the live list, which always reflects current availability and capabilities.
## Encoder models (GLiNER)
GLiNER models perform named entity recognition and structured extraction. All encoder models support both LoRA and full fine-tuning, and are served on-demand after training.

| Model ID | Label | Training | Inference |
|---|---|---|---|
| fastino/gliner2-base-v1 | GLiNER2 Base | LoRA, Full | On-demand |
| fastino/gliner2-large-v1 | GLiNER2 Large | LoRA, Full | On-demand |
| fastino/gliner2-multi-v1 | GLiNER2 Multi | LoRA, Full | On-demand |
| fastino/gliner2-multi-large-v1 | GLiNER2 Multi Large | LoRA, Full | On-demand |
fastino/gliner2-multi-v1 and fastino/gliner2-multi-large-v1 are multilingual variants suitable for non-English text.
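If encoder fine-tuning is submitted through the same POST /felix/training-jobs endpoint as the decoder models in the next section (an assumption; this section does not name an endpoint), a GLiNER fine-tune might look like the sketch below. The base URL is hypothetical, and the method and dataset_id fields are illustrative rather than the documented schema.

```python
import os
import requests

BASE_URL = "https://api.pioneer.example"  # hypothetical; substitute your endpoint
HEADERS = {"Authorization": f"Bearer {os.environ['PIONEER_API_KEY']}"}

# Sketch: full fine-tune of the multilingual GLiNER2 variant.
# Field names below are illustrative, not the documented schema.
resp = requests.post(
    f"{BASE_URL}/felix/training-jobs",
    headers=HEADERS,
    json={
        "base_model": "fastino/gliner2-multi-v1",
        "method": "full",           # encoder models also support "lora"
        "dataset_id": "ds_ner_01",  # hypothetical dataset reference
    },
)
resp.raise_for_status()
print(resp.json())
```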
## Decoder models — training
These LLMs are available for LoRA fine-tuning via POST /felix/training-jobs. When you submit a job, Pioneer automatically routes it to the best available provider; see the request sketch after the table.
| Model ID | Label | Context |
|---|---|---|
| Qwen/Qwen3-32B | Qwen3 32B | 131K |
| Qwen/Qwen3-30B-A3B-Instruct-2507 | Qwen3 30B A3B Instruct | 262K |
| Qwen/Qwen3-30B-A3B | Qwen3 30B A3B | 131K |
| Qwen/Qwen3-8B | Qwen3 8B | 131K |
| Qwen/Qwen3-8B-Base | Qwen3 8B Base | 32K |
| Qwen/Qwen3-4B-Instruct-2507 | Qwen3 4B Instruct | 262K |
| Qwen/Qwen2.5-Coder-0.5B | Qwen2.5 Coder 0.5B | 32K |
| Qwen/Qwen2.5-7B-Instruct | Qwen2.5 7B Instruct | 131K |
| Qwen/Qwen2.5-14B-Instruct | Qwen2.5 14B Instruct | 131K |
| google/gemma-4-31b-it | Gemma 4 31B IT | 128K |
| meta-llama/Llama-3.3-70B-Instruct | Llama 3.3 70B Instruct | 131K |
| meta-llama/Llama-3.1-8B-Instruct | Llama 3.1 8B Instruct | 131K |
| meta-llama/Llama-3.1-70B-Instruct | Llama 3.1 70B Instruct | 131K |
| meta-llama/Llama-3.2-3B-Instruct | Llama 3.2 3B Instruct | 131K |
| meta-llama/Llama-3.2-1B-Instruct | Llama 3.2 1B Instruct | 131K |
| meta-llama/Llama-3.2-3B | Llama 3.2 3B | 131K |
| meta-llama/Llama-3.2-1B | Llama 3.2 1B | 32K |
| nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 | Nemotron 3 Nano 30B | 64K |
| openai/gpt-oss-120b | GPT-OSS 120B | 131K |
| openai/gpt-oss-20b | GPT-OSS 20B | 131K |
| deepseek-ai/DeepSeek-V3.1 | DeepSeek V3.1 | 163K |
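As a concrete illustration, here is a minimal job submission in Python. The endpoint path comes from this section; the base URL and the payload field names (base_model, dataset_id, hyperparameters) are assumptions for illustration, not the documented request schema.

```python
import os
import requests

BASE_URL = "https://api.pioneer.example"  # hypothetical; substitute your endpoint
HEADERS = {"Authorization": f"Bearer {os.environ['PIONEER_API_KEY']}"}

# Sketch: submit a LoRA fine-tuning job for a decoder model.
# Payload field names are illustrative, not the documented schema.
resp = requests.post(
    f"{BASE_URL}/felix/training-jobs",
    headers=HEADERS,
    json={
        "base_model": "meta-llama/Llama-3.1-8B-Instruct",
        "dataset_id": "ds_abc123",  # hypothetical dataset reference
        "hyperparameters": {"epochs": 3, "learning_rate": 1e-4},
    },
)
resp.raise_for_status()
job = resp.json()
print(job)  # the returned job ID is what you later pass as model_id for inference
```

Once the job finishes, pass its ID as model_id to reach the resulting on-demand deployment, as described further below.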
## Decoder models — serverless inference
These models are pre-deployed and available for inference immediately — no fine-tuning required and no startup latency. You pay per token. A call sketch follows the table.

| Model ID | Label | Context |
|---|---|---|
| Qwen/Qwen3-235B-A22B-Instruct-2507 | Qwen3 235B A22B Instruct | 262K |
| Qwen/Qwen3-8B | Qwen3 8B | 131K |
| deepseek-ai/DeepSeek-V3.1 | DeepSeek V3.1 | 163K |
| openai/gpt-oss-120b | GPT-OSS 120B | 131K |
| openai/gpt-oss-20b | GPT-OSS 20B | 131K |
| meta-llama/Llama-3.3-70B-Instruct | Llama 3.3 70B Instruct | 131K |
| moonshotai/Kimi-K2.6 | Kimi K2.6 | 262K |
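A serverless call might look like the sketch below. This page does not name the inference route, so the /chat/completions path, base URL, and OpenAI-style request shape are all assumptions; only the model ID comes from the table above.

```python
import os
import requests

BASE_URL = "https://api.pioneer.example"  # hypothetical; substitute your endpoint
HEADERS = {"Authorization": f"Bearer {os.environ['PIONEER_API_KEY']}"}

# Sketch: per-token call to a pre-deployed serverless model.
# The route and payload shape are assumed (OpenAI-style), not documented here.
resp = requests.post(
    f"{BASE_URL}/chat/completions",
    headers=HEADERS,
    json={
        "model": "openai/gpt-oss-20b",  # any model ID from the table above
        "messages": [{"role": "user", "content": "Summarize LoRA in one sentence."}],
    },
)
resp.raise_for_status()
print(resp.json())
```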
## On-demand vs. serverless inference
Pioneer offers two ways to serve predictions, and the right choice depends on your workflow.

Serverless inference uses pre-deployed base-model endpoints. There is no startup delay and you are billed per token. This is the default for models in the serverless table above and is ideal when you want to call a frontier model without fine-tuning.

On-demand inference provisions a dedicated GPU after fine-tuning completes. Your LoRA adapter is loaded onto the GPU and served exclusively for your requests. Pioneer routes inference calls to an on-demand deployment automatically when you pass a training job ID as model_id.
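In practice, the only change between the two modes is the identifier you send. A minimal sketch, assuming the request body carries a model_id field as described above (the input field and the job ID format are hypothetical):

```python
# Serverless: pass a base model ID from the serverless table.
serverless_request = {"model_id": "deepseek-ai/DeepSeek-V3.1", "input": "..."}

# On-demand: pass your training job ID instead. Pioneer detects it and routes
# the call to the dedicated GPU serving your LoRA adapter.
on_demand_request = {"model_id": "job_123abc", "input": "..."}  # hypothetical job ID
```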
## Querying the live catalog
The tables above may lag behind newly added models. Use GET /base-models to get the current catalog at runtime. Each entry in the response includes the capability flags supports_training and supports_inference. Use the model ID value directly in training job requests and inference calls.
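A minimal catalog query, assuming the same hypothetical base URL and auth as in the sketches above. The endpoint and the capability flags come from this page; the response envelope (a JSON array of objects with an id field) is an assumption.

```python
import os
import requests

BASE_URL = "https://api.pioneer.example"  # hypothetical; substitute your endpoint
HEADERS = {"Authorization": f"Bearer {os.environ['PIONEER_API_KEY']}"}

resp = requests.get(f"{BASE_URL}/base-models", headers=HEADERS)
resp.raise_for_status()

# Filter to models you can fine-tune, using the documented capability flags.
# Assumes the body is a JSON array of model objects with an "id" field.
trainable = [m["id"] for m in resp.json() if m.get("supports_training")]
print(trainable)
```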
