POST /inference — Pioneer native inference endpoint

The Pioneer inference endpoint accepts a model ID, input text, and a schema that defines exactly what to extract. You can target a fine-tuned model from a completed training job or call a base model directly. For encoder models (GLiNER), use the schema field to declare entities, classifications, structures, or relations. For decoder models, use "task": "generate" instead.

Endpoints

| Method | Path | Description |
| --- | --- | --- |
| `POST` | `/inference` | Run inference on a model |
| `GET` | `/base-models` | List the model catalog |

List the model catalog

Use GET /base-models to fetch the current list of available models. Filter by ?supports_inference=true to narrow to inference-ready models, and by ?task_type=encoder or ?task_type=decoder to filter by architecture.

curl "https://api.pioneer.ai/base-models?supports_inference=true" \
  -H "X-API-Key: YOUR_API_KEY"
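The same catalog query can be assembled from Python. The sketch below uses only the standard library and reuses the base URL, header, and query parameters from the curl example; the function name `build_catalog_request` is illustrative, not part of any Pioneer SDK. The response format is not described on this page, so the request is built but not sent or parsed.

```python
import urllib.parse
import urllib.request

BASE_URL = "https://api.pioneer.ai"  # taken from the curl example above


def build_catalog_request(api_key, supports_inference=None, task_type=None):
    """Build a GET /base-models request with the optional filters."""
    params = {}
    if supports_inference is not None:
        # Assumption: booleans are sent as lowercase "true"/"false";
        # only "true" appears in the documentation above.
        params["supports_inference"] = "true" if supports_inference else "false"
    if task_type is not None:
        if task_type not in ("encoder", "decoder"):
            raise ValueError("task_type must be 'encoder' or 'decoder'")
        params["task_type"] = task_type
    url = f"{BASE_URL}/base-models"
    if params:
        url += "?" + urllib.parse.urlencode(params)
    return urllib.request.Request(url, headers={"X-API-Key": api_key}, method="GET")


request = build_catalog_request("YOUR_API_KEY", supports_inference=True, task_type="encoder")
print(request.full_url)
```

Passing the result to `urllib.request.urlopen` would send the request; that step is left out here because it needs a real API key.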

Run inference

Request parameters

model_id (string, required)

The ID of the model to run inference against. Use the job ID returned by POST /felix/training-jobs (e.g. job_abc123) to target a fine-tuned model, or a base model ID like fastino/gliner2-base-v1 to call a base model directly.

text (string, required)

The input text to run the model against.

schema (object)

Defines what to extract from the input text. Used with encoder models. Accepts up to four optional keys (entities, classifications, structures, relations); include any combination that fits your task.

schema.entities (string[])

List of entity type labels to extract (Named Entity Recognition). Example: ["organization", "product", "location"].

schema.classifications (object[])

List of classification tasks. Each object has a task string (the classification label group name) and a labels array of candidate class strings.

schema.structures (object)

Dictionary of structure definitions for JSON extraction. Each key is a structure name; the value defines the shape of the output.

schema.relations (object[])

List of relation definitions. Each object describes a directional relationship between entity types to extract.
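As a rough illustration of how a schema object might be assembled in client code, the sketch below covers only the entities and classifications keys, whose shapes are spelled out above. The helper name `build_schema` and its client-side checks are assumptions for illustration; structures and relations are left out because their exact shape is not specified on this page.

```python
def build_schema(entities=None, classifications=None):
    """Assemble a schema dict from the documented entities/classifications keys."""
    schema = {}
    if entities:
        if not all(isinstance(label, str) for label in entities):
            raise TypeError("entities must be a list of strings")
        schema["entities"] = list(entities)
    if classifications:
        checked = []
        for item in classifications:
            # Client-side sanity check (an assumption, not an API rule):
            # every classification carries a task name and candidate labels.
            if not isinstance(item.get("task"), str) or not item.get("labels"):
                raise ValueError("each classification needs 'task' and a non-empty 'labels'")
            checked.append({"task": item["task"], "labels": list(item["labels"])})
        schema["classifications"] = checked
    return schema


schema = build_schema(
    entities=["organization", "product", "location"],
    classifications=[{"task": "sentiment", "labels": ["positive", "negative", "neutral"]}],
)
print(schema)
```

Keys that are not supplied are simply omitted, which matches the statement that all schema keys are optional.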

threshold (number, default: 0.5)

Confidence threshold for returned predictions. Values range from 0 to 1. Lower values return more candidates at the cost of precision; higher values return fewer, higher-confidence results.
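Since the threshold must lie between 0 and 1, it can be worth checking the value before sending a request. The following is a minimal client-side sketch; `check_threshold` is a hypothetical helper, and how the API itself responds to out-of-range values is not documented here.

```python
def check_threshold(value=0.5):
    """Return the threshold as a float, or raise if it is outside [0, 1]."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError("threshold must be a number")
    if not 0 <= value <= 1:
        raise ValueError("threshold must be between 0 and 1")
    return float(value)


print(check_threshold())     # uses the documented default
print(check_threshold(0.3))  # lower: more candidates, less precision
```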

Example — NER with a fine-tuned model

curl -X POST https://api.pioneer.ai/inference \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "job_abc123",
    "text": "Apple announced the MacBook Pro at WWDC in Cupertino.",
    "schema": {
      "entities": ["organization", "product", "event", "location"]
    },
    "threshold": 0.5
  }'
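The curl call above translates fairly directly to Python. This sketch uses only the standard library; the function names are illustrative, and `run_inference` assumes the endpoint returns a JSON body, whose exact shape is not described on this page.

```python
import json
import urllib.request

API_URL = "https://api.pioneer.ai/inference"  # from the curl example above


def build_inference_request(api_key, model_id, text, schema, threshold=0.5):
    """Build the POST /inference request without sending it."""
    payload = {
        "model_id": model_id,
        "text": text,
        "schema": schema,
        "threshold": threshold,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def run_inference(request):
    """Send the request and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


request = build_inference_request(
    api_key="YOUR_API_KEY",
    model_id="job_abc123",
    text="Apple announced the MacBook Pro at WWDC in Cupertino.",
    schema={"entities": ["organization", "product", "event", "location"]},
)
# result = run_inference(request)  # uncomment once a real API key is set
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any network call is made.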

Example — combined schema (entities + classifications)

curl -X POST https://api.pioneer.ai/inference \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "job_abc123",
    "text": "Apple announced the MacBook Pro at WWDC in Cupertino.",
    "schema": {
      "entities": ["organization", "product", "location"],
      "classifications": [
        {
          "task": "sentiment",
          "labels": ["positive", "negative", "neutral"]
        }
      ]
    },
    "threshold": 0.5
  }'

For decoder models, omit schema and pass "task": "generate" in the request body instead. The model will respond with generated text rather than a structured extraction result.
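A decoder request body might then look like the sketch below. The model ID is a placeholder (pick a real one from GET /base-models with ?task_type=decoder), the helper name is hypothetical, and the sketch assumes the prompt is passed in the same text field used for encoder models.

```python
import json


def build_generate_payload(model_id, text):
    """Build a request body for a decoder model: no schema, task set to generate."""
    return {"model_id": model_id, "text": text, "task": "generate"}


payload = build_generate_payload(
    "YOUR_DECODER_MODEL_ID",  # placeholder, not a real model ID
    "Write a one-sentence summary of the MacBook Pro announcement.",
)
print(json.dumps(payload, indent=2))
```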

The threshold default is 0.5. Lower it (e.g. 0.3) to surface more candidates at the cost of more false positives. Raise it (e.g. 0.7) for tighter, higher-precision extractions.

Using a base model ID

If you haven’t fine-tuned a model yet, you can call a base model directly. Use a model ID from GET /base-models, such as fastino/gliner2-base-v1.

curl -X POST https://api.pioneer.ai/inference \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "fastino/gliner2-base-v1",
    "text": "Tim Cook spoke at the Apple event in San Francisco.",
    "schema": {
      "entities": ["person", "organization", "location"]
    }
  }'

Related

OpenAI-compatible inference — call Pioneer models through the OpenAI SDK
Anthropic-compatible inference — call Pioneer models through the Anthropic SDK
Inference history and feedback — retrieve past results and submit corrections
Available models — encoder and decoder model catalog
