Pioneer supports LoRA fine-tuning on a wide range of open-source decoder models — from compact 1B-parameter models to 70B+ frontier models. You bring your training data, choose a base model that fits your task and budget, and Pioneer handles the infrastructure, routing, and serving. The result is a fine-tuned model you can call over the same API, with no GPU management required.
1. Choose a decoder base model

Use GET /base-models to see the full current catalog, filtered to models that support training:
curl "https://api.pioneer.ai/base-models?task_type=decoder&supports_training=true" \
  -H "X-API-Key: YOUR_API_KEY"
The table below shows a selection of popular options. Context window size matters if your training examples or inference prompts are long.
| Model ID | Label | Context (tokens) |
| --- | --- | --- |
| Qwen/Qwen3-32B | Qwen3 32B | 131K |
| Qwen/Qwen3-30B-A3B-Instruct-2507 | Qwen3 30B A3B Instruct | 262K |
| Qwen/Qwen3-8B | Qwen3 8B | 131K |
| Qwen/Qwen3-4B-Instruct-2507 | Qwen3 4B Instruct | 262K |
| Qwen/Qwen2.5-7B-Instruct | Qwen2.5 7B Instruct | 131K |
| Qwen/Qwen2.5-14B-Instruct | Qwen2.5 14B Instruct | 131K |
| meta-llama/Llama-3.3-70B-Instruct | Llama 3.3 70B Instruct | 131K |
| meta-llama/Llama-3.1-8B-Instruct | Llama 3.1 8B Instruct | 131K |
| meta-llama/Llama-3.1-70B-Instruct | Llama 3.1 70B Instruct | 131K |
| meta-llama/Llama-3.2-3B-Instruct | Llama 3.2 3B Instruct | 131K |
| deepseek-ai/DeepSeek-V3.1 | DeepSeek V3.1 | 163K |
| google/gemma-4-31b-it | Gemma 4 31B IT | 128K |
| openai/gpt-oss-120b | GPT-OSS 120B | 131K |
Choosing a model size: Smaller models (1B–8B) train and respond faster and cost less. Larger models (30B–70B) handle complex reasoning and longer inputs more reliably. Start with Qwen/Qwen3-8B or meta-llama/Llama-3.1-8B-Instruct for most tasks and scale up if needed.
2. Prepare your training data

LLM fine-tuning uses the decoder task type. Your dataset should contain prompt-completion pairs: conversations, instruction-response examples, or any input-output format relevant to your task.
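As an illustration only (this page doesn't show the exact upload schema), a chat-style JSONL file with one example per line is a common shape for decoder training data. This Python sketch writes two hypothetical records in that format:

import json

# Hypothetical chat-style records. Confirm the expected schema against
# Pioneer's dataset upload docs before relying on this shape.
examples = [
    {"messages": [
        {"role": "user", "content": "How do I reset my password?"},
        {"role": "assistant", "content": "Open Settings > Security and click 'Reset password'."},
    ]},
    {"messages": [
        {"role": "user", "content": "Can I export my invoices?"},
        {"role": "assistant", "content": "Yes: open Billing and choose 'Export as CSV'."},
    ]},
]

with open("my-llm-dataset.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")

You can also generate synthetic decoder training data with Pioneer's /generate endpoint: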
curl -X POST https://api.pioneer.ai/generate \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "task_type": "decoder",
    "dataset_name": "my-llm-dataset",
    "num_examples": 200,
    "domain_description": "Customer support for a SaaS product"
  }'
See the Synthetic Data guide for full details on generation options.
Once generated or uploaded, confirm your dataset is ready:
curl https://api.pioneer.ai/felix/datasets/my-llm-dataset \
  -H "X-API-Key: YOUR_API_KEY"
3. Start a training job

Submit a training job with your chosen decoder base model and training_type: "lora".
curl -X POST https://api.pioneer.ai/felix/training-jobs \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model_name": "my-llm-model",
    "base_model": "meta-llama/Llama-3.1-8B-Instruct",
    "datasets": [{"name": "my-llm-dataset"}],
    "training_type": "lora",
    "nr_epochs": 3,
    "learning_rate": 2e-4
  }'
Pioneer routes your job automatically to the best available provider. The response includes your job ID:
{ "id": "uuid-of-training-job", "status": "requested" }
4. Poll until training is complete

Check job status by polling GET /felix/training-jobs/:id.
curl https://api.pioneer.ai/felix/training-jobs/YOUR_JOB_ID \
  -H "X-API-Key: YOUR_API_KEY"
Status transitions: requested → running → complete (or failed / stopped).
You can also retrieve training logs while the job is running:
curl https://api.pioneer.ai/felix/training-jobs/YOUR_JOB_ID/logs \
  -H "X-API-Key: YOUR_API_KEY"
5. Run inference on your fine-tuned model

Once the job status is "complete", use your job ID as the model identifier. Pioneer supports three inference interfaces.
Pioneer native API. Use "task": "generate" for decoder models:
curl -X POST https://api.pioneer.ai/inference \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "YOUR_JOB_ID",
    "task": "generate",
    "messages": [{"role": "user", "content": "Summarize this article: ..."}]
  }'
OpenAI-compatible endpoint — drop-in replacement for the OpenAI SDK:
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.pioneer.ai/v1"
)

response = client.chat.completions.create(
    model="YOUR_JOB_ID",
    messages=[{"role": "user", "content": "Summarize this article: ..."}]
)
print(response.choices[0].message.content)
Anthropic-compatible endpoint:
curl -X POST https://api.pioneer.ai/v1/messages \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "YOUR_JOB_ID",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Summarize this article: ..."}]
  }'
Streaming is supported on all three interfaces.
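For example, on the OpenAI-compatible endpoint, streaming uses the standard OpenAI SDK pattern: pass stream=True and iterate over the chunks:

from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.pioneer.ai/v1")

# Print tokens as they arrive instead of waiting for the full completion.
stream = client.chat.completions.create(
    model="YOUR_JOB_ID",
    messages=[{"role": "user", "content": "Summarize this article: ..."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)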
Downloading your trained model weights is available on the Pro plan and above. Use GET /felix/training-jobs/:id/download to retrieve the weights once training is complete.
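A download sketch in Python, assuming the endpoint streams back an archive of the trained weights (the local file name and archive format here are assumptions):

import requests

# Stream the trained weights to disk (Pro plan and above).
with requests.get(
    "https://api.pioneer.ai/felix/training-jobs/YOUR_JOB_ID/download",
    headers={"X-API-Key": "YOUR_API_KEY"},
    stream=True,
) as resp:
    resp.raise_for_status()
    with open("my-llm-model-weights.tar.gz", "wb") as f:
        for chunk in resp.iter_content(chunk_size=1 << 20):
            f.write(chunk)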

Serverless inference for base models

If you want to run inference on a base model without fine-tuning, several models are available as serverless endpoints with no startup latency:
| Model ID | Label | Context (tokens) |
| --- | --- | --- |
| Qwen/Qwen3-235B-A22B-Instruct-2507 | Qwen3 235B A22B Instruct | 262K |
| Qwen/Qwen3-8B | Qwen3 8B | 131K |
| deepseek-ai/DeepSeek-V3.1 | DeepSeek V3.1 | 163K |
| openai/gpt-oss-120b | GPT-OSS 120B | 131K |
| meta-llama/Llama-3.3-70B-Instruct | Llama 3.3 70B Instruct | 131K |
| moonshotai/Kimi-K2.6 | Kimi K2.6 | 262K |
Use GET /base-models?task_type=decoder&supports_inference=true to see the current serverless catalog.
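Assuming serverless base models are addressable by their catalog model ID on the same OpenAI-compatible endpoint (check the inference docs to confirm), calling one looks just like calling a fine-tuned model:

from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.pioneer.ai/v1")

# Assumption: the base model's catalog ID doubles as the model name.
response = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct",
    messages=[{"role": "user", "content": "Write three taglines for a coffee shop."}],
)
print(response.choices[0].message.content)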

Next steps