The Pioneer inference endpoint accepts a model ID, input text, and a schema that defines exactly what to extract. You can target a fine-tuned model from a completed training job or call a base model directly. For encoder models (GLiNER), use the schema field to declare entities, classifications, structures, or relations. For decoder models, use "task": "generate" instead.

Endpoints

Method  Path          Description
POST    /inference    Run inference on a model
GET     /base-models  List the model catalog

List the model catalog

Use GET /base-models to fetch the current list of available models. Filter by ?supports_inference=true to narrow to inference-ready models, and by ?task_type=encoder or ?task_type=decoder to filter by architecture.
curl "https://api.pioneer.ai/base-models?supports_inference=true" \
  -H "X-API-Key: YOUR_API_KEY"
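
The same request URL can be composed programmatically. This sketch only builds the query string using the two filters described above; it does not send the request:

```python
from urllib.parse import urlencode

# Build the /base-models query string with both documented filters.
base = "https://api.pioneer.ai/base-models"
params = {"supports_inference": "true", "task_type": "encoder"}
url = f"{base}?{urlencode(params)}"
print(url)
```

Swap `task_type` to `"decoder"` to list decoder models instead, or drop it to see both architectures.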

Run inference

Request parameters

model_id (string, required)
The ID of the model to run inference against. Use the job ID returned by POST /felix/training-jobs (e.g. job_abc123) to target a fine-tuned model, or a base model ID like fastino/gliner2-base-v1 to call a base model directly.

text (string, required)
The input text to run the model against.

schema (object)
Defines what to extract from the input text. Used with encoder models. Accepts up to four optional keys; include any combination based on your task.

  entities (string[])
  List of entity type labels to extract (Named Entity Recognition). Example: ["organization", "product", "location"].

  classifications (object[])
  List of classification tasks. Each object has a task string (the classification label group name) and a labels array of candidate class strings.

  structures (object)
  Dictionary of structure definitions for JSON extraction. Each key is a structure name; the value defines the shape of the output.

  relations (object[])
  List of relation definitions. Each object describes a directional relationship between entity types to extract.

threshold (number, default: 0.5)
Confidence threshold for returned predictions. Values range from 0 to 1. Lower values return more candidates at the cost of precision; higher values return fewer, higher-confidence results.
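
All four schema keys can appear in one request body. The sketch below assembles such a body in Python. Note that the value shapes shown for structures and relations are illustrative assumptions: the reference above only describes them loosely, so confirm the exact formats before relying on them.

```python
import json

# A request body exercising all four optional schema keys.
payload = {
    "model_id": "job_abc123",  # fine-tuned model (training job ID)
    "text": "Apple announced the MacBook Pro at WWDC in Cupertino.",
    "schema": {
        "entities": ["organization", "product", "location"],
        "classifications": [
            {"task": "sentiment", "labels": ["positive", "negative", "neutral"]}
        ],
        # Hypothetical structure definition: key is the structure name,
        # value sketches the output shape (field name -> type hint).
        "structures": {
            "product_launch": {"company": "string", "product": "string"}
        },
        # Hypothetical relation definition: a directional relationship
        # between two entity types.
        "relations": [
            {"name": "announced_by", "head": "product", "tail": "organization"}
        ],
    },
    "threshold": 0.5,
}
print(json.dumps(payload, indent=2))
```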

Example — NER with a fine-tuned model

curl -X POST https://api.pioneer.ai/inference \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "job_abc123",
    "text": "Apple announced the MacBook Pro at WWDC in Cupertino.",
    "schema": {
      "entities": ["organization", "product", "event", "location"]
    },
    "threshold": 0.5
  }'

Example — combined schema (entities + classifications)

curl -X POST https://api.pioneer.ai/inference \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "job_abc123",
    "text": "Apple announced the MacBook Pro at WWDC in Cupertino.",
    "schema": {
      "entities": ["organization", "product", "location"],
      "classifications": [
        {
          "task": "sentiment",
          "labels": ["positive", "negative", "neutral"]
        }
      ]
    },
    "threshold": 0.5
  }'
For decoder models, omit schema and pass "task": "generate" in the request body instead. The model will respond with generated text rather than a structured extraction result.
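A decoder request body therefore differs only in those two fields. This sketch shows the request side only; the model_id below is a hypothetical decoder training-job ID, and the shape of the generated-text response is not covered here:

```python
import json

# Decoder-model request: no "schema" key, "task": "generate" instead.
payload = {
    "model_id": "job_decoder_456",  # hypothetical decoder job ID
    "text": "Summarize: Apple announced the MacBook Pro at WWDC.",
    "task": "generate",
}
print(json.dumps(payload))
```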
The threshold default is 0.5. Lower it (e.g. 0.3) to surface more candidates at the cost of more false positives. Raise it (e.g. 0.7) for tighter, higher-precision extractions.

Using a base model ID

If you haven’t fine-tuned a model yet, you can call a base model directly. Use a model ID from GET /base-models, such as fastino/gliner2-base-v1.
curl -X POST https://api.pioneer.ai/inference \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "fastino/gliner2-base-v1",
    "text": "Tim Cook spoke at the Apple event in San Francisco.",
    "schema": {
      "entities": ["person", "organization", "location"]
    }
  }'