Pioneer supports two model families: encoder models (GLiNER) for structured extraction tasks like named entity recognition, and decoder models (LLMs) for text generation, classification, and open-ended prompting. The tables below are a snapshot of the current catalog — use GET /base-models to query the live list, which always reflects current availability and capabilities.
Encoder models (GLiNER)
GLiNER models perform named entity recognition and structured extraction. All encoder models support both LoRA and full fine-tuning, and are served on-demand after training.
| Model ID | Label | Training | Inference |
|---|---|---|---|
| fastino/gliner2-base-v1 | GLiNER2 Base | LoRA, Full | On-demand |
| fastino/gliner2-large-v1 | GLiNER2 Large | LoRA, Full | On-demand |
| fastino/gliner2-multi-v1 | GLiNER2 Multi | LoRA, Full | On-demand |
| fastino/gliner2-multi-large-v1 | GLiNER2 Multi Large | LoRA, Full | On-demand |
fastino/gliner2-multi-v1 and fastino/gliner2-multi-large-v1 are multilingual variants suitable for non-English text.
Decoder models — training
These LLMs are available for LoRA fine-tuning via POST /felix/training-jobs. When you submit a job, Pioneer automatically routes it to the best available provider.
| Model ID | Label | Context |
|---|---|---|
| Qwen/Qwen3-32B | Qwen3 32B | 131K |
| Qwen/Qwen3-30B-A3B-Instruct-2507 | Qwen3 30B A3B Instruct | 262K |
| Qwen/Qwen3-30B-A3B | Qwen3 30B A3B | 131K |
| Qwen/Qwen3-8B | Qwen3 8B | 131K |
| Qwen/Qwen3-8B-Base | Qwen3 8B Base | 32K |
| Qwen/Qwen3-4B-Instruct-2507 | Qwen3 4B Instruct | 262K |
| Qwen/Qwen2.5-Coder-0.5B | Qwen2.5 Coder 0.5B | 32K |
| Qwen/Qwen2.5-7B-Instruct | Qwen2.5 7B Instruct | 131K |
| Qwen/Qwen2.5-14B-Instruct | Qwen2.5 14B Instruct | 131K |
| google/gemma-4-31b-it | Gemma 4 31B IT | 128K |
| meta-llama/Llama-3.3-70B-Instruct | Llama 3.3 70B Instruct | 131K |
| meta-llama/Llama-3.1-8B-Instruct | Llama 3.1 8B Instruct | 131K |
| meta-llama/Llama-3.1-70B-Instruct | Llama 3.1 70B Instruct | 131K |
| meta-llama/Llama-3.2-3B-Instruct | Llama 3.2 3B Instruct | 131K |
| meta-llama/Llama-3.2-1B-Instruct | Llama 3.2 1B Instruct | 131K |
| meta-llama/Llama-3.2-3B | Llama 3.2 3B | 131K |
| meta-llama/Llama-3.2-1B | Llama 3.2 1B | 32K |
| nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 | Nemotron 3 Nano 30B | 64K |
| openai/gpt-oss-120b | GPT-OSS 120B | 131K |
| openai/gpt-oss-20b | GPT-OSS 20B | 131K |
| deepseek-ai/DeepSeek-V3.1 | DeepSeek V3.1 | 163K |
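As a rough sketch of how a model ID from the table above might be used, the shell functions below build a request for POST /felix/training-jobs. The endpoint path, the host, and the X-API-Key header come from this page. The base_model field name, the PIONEER_API_KEY environment variable, and both function names are assumptions made for illustration. A real job will need further fields (for example, a dataset reference) that this page does not describe, so check the training-jobs reference before relying on it.

```shell
# Build a minimal JSON body for a training job.
# ASSUMPTION: "base_model" is a hypothetical field name, and the body is
# incomplete; the real schema is not documented on this page.
# No JSON escaping is done, so pass plain model IDs only.
training_job_body() {
  printf '{"base_model": "%s"}\n' "$1"
}

# Submit the job. Expects the API key in PIONEER_API_KEY
# (hypothetical variable name). Not called automatically.
submit_training_job() {
  curl --fail --silent --show-error \
    -X POST "https://api.pioneer.ai/felix/training-jobs" \
    -H "X-API-Key: $PIONEER_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$(training_job_body "$1")"
}
```

With those assumptions in place, a call would look like `submit_training_job "Qwen/Qwen3-8B"`.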
Decoder models — serverless inference
These models are pre-deployed and available for inference immediately — no fine-tuning required and no startup latency. You pay per token.
| Model ID | Label | Context |
|---|---|---|
| Qwen/Qwen3-235B-A22B-Instruct-2507 | Qwen3 235B A22B Instruct | 262K |
| Qwen/Qwen3-8B | Qwen3 8B | 131K |
| deepseek-ai/DeepSeek-V3.1 | DeepSeek V3.1 | 163K |
| openai/gpt-oss-120b | GPT-OSS 120B | 131K |
| openai/gpt-oss-20b | GPT-OSS 20B | 131K |
| meta-llama/Llama-3.3-70B-Instruct | Llama 3.3 70B Instruct | 131K |
| moonshotai/Kimi-K2.6 | Kimi K2.6 | 262K |
On-demand vs. serverless inference
Pioneer offers two ways to serve predictions, and the right choice depends on your workflow.
Serverless inference uses pre-deployed base model endpoints. There is no startup delay and you are billed per token. This is the default for models in the serverless table above and is ideal when you want to call a frontier model without fine-tuning.
On-demand inference provisions a dedicated GPU after fine-tuning completes. Your LoRA adapter is loaded onto the GPU and served exclusively for your requests. Pioneer routes inference calls to an on-demand deployment automatically when you pass a training job ID as model_id.
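On the request side, the two modes differ only in the value passed as model_id. The sketch below prints that one field for each case. The model_selector function name and the YOUR_TRAINING_JOB_ID placeholder are illustrative, and the inference endpoint and the rest of the request body are left out because this page does not specify them.

```shell
# Print the JSON fragment that selects a model for an inference request.
# Only the "model_id" field is taken from the text above; everything else
# about the request is unspecified here and deliberately omitted.
model_selector() {
  printf '{"model_id": "%s"}\n' "$1"
}

# Serverless: a base model ID from the serverless table.
model_selector "Qwen/Qwen3-8B"

# On-demand: the ID of a completed training job (placeholder value).
model_selector "YOUR_TRAINING_JOB_ID"
```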
Querying the live catalog
The tables above may lag behind newly added models. Use GET /base-models to get the current catalog at runtime.
# All models
curl https://api.pioneer.ai/base-models \
  -H "X-API-Key: YOUR_API_KEY"
# Only models that support inference
curl "https://api.pioneer.ai/base-models?supports_inference=true" \
  -H "X-API-Key: YOUR_API_KEY"
# Only models that support training
curl "https://api.pioneer.ai/base-models?supports_training=true" \
  -H "X-API-Key: YOUR_API_KEY"
# Filter by model family
curl "https://api.pioneer.ai/base-models?task_type=encoder" \
  -H "X-API-Key: YOUR_API_KEY"
curl "https://api.pioneer.ai/base-models?task_type=decoder" \
  -H "X-API-Key: YOUR_API_KEY"
Each entry in the response includes the model ID, its display label, context length, and boolean flags for supports_training and supports_inference. Use the model ID value directly in training job requests and inference calls.
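If you query the catalog from scripts, the calls to GET /base-models can be wrapped in a small helper. The following is a minimal sketch, not an official client: the function names and the PIONEER_API_KEY environment variable are hypothetical, and combining several filters in one request with "&" is an assumption, since this page only documents one filter per request.

```shell
# Base URL as used in the documented GET /base-models requests.
PIONEER_API="https://api.pioneer.ai"

# Build the /base-models URL from zero or more key=value filters.
# ASSUMPTION: multiple filters can be combined with "&"; this page
# only documents a single filter per request.
base_models_url() {
  url="$PIONEER_API/base-models"
  sep="?"
  for filter in "$@"; do
    url="$url$sep$filter"
    sep="&"
  done
  printf '%s\n' "$url"
}

# Fetch the catalog with the given filters. Expects the API key in
# PIONEER_API_KEY (hypothetical variable name). Not called automatically.
list_base_models() {
  curl --fail --silent --show-error \
    "$(base_models_url "$@")" \
    -H "X-API-Key: $PIONEER_API_KEY"
}
```

For example, `list_base_models task_type=encoder` would issue the same request as the encoder filter documented on this page.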