schema field to declare entities, classifications, structures, or relations. For decoder models, use "task": "generate" instead.
Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /inference | Run inference on a model |
| GET | /base-models | List the model catalog |
List the model catalog
Use GET /base-models to fetch the current list of available models. Filter by ?supports_inference=true to narrow the list to inference-ready models, and by ?task_type=encoder or ?task_type=decoder to filter by architecture.
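
The catalog filters can be composed as ordinary query parameters. The following is a minimal sketch, not an official client: the host `https://api.example.com` is a placeholder, the helper `base_models_url` is hypothetical, and it assumes both filters may be combined in one request and that the false value is spelled `false`.

```python
from urllib.parse import urlencode

# Hypothetical base URL; substitute the real API host.
BASE_URL = "https://api.example.com"


def base_models_url(supports_inference=None, task_type=None):
    """Build the GET /base-models URL with the optional filters."""
    params = {}
    if supports_inference is not None:
        # The page shows the value as lowercase "true"; "false" is assumed.
        params["supports_inference"] = "true" if supports_inference else "false"
    if task_type is not None:
        if task_type not in ("encoder", "decoder"):
            raise ValueError("task_type must be 'encoder' or 'decoder'")
        params["task_type"] = task_type
    query = urlencode(params)
    return f"{BASE_URL}/base-models" + (f"?{query}" if query else "")


print(base_models_url(supports_inference=True, task_type="encoder"))
```

Calling `base_models_url()` with no arguments yields the unfiltered catalog URL.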
Run inference
Request parameters
The ID of the model to run inference against. Use the job ID returned by POST /felix/training-jobs (e.g. job_abc123) to target a fine-tuned model, or a base model ID like fastino/gliner2-base-v1 to call a base model directly.
The input text to run the model against. Pass an array of strings to run batch inference; the response result …
Defines what to extract from the input text. Used with encoder models. For simple NER, pass a flat array of entity labels (e.g. ["organization", "product"]). For multi-task extraction, pass an object with any combination of the following keys:
- List of entity type labels to extract (Named Entity Recognition). Example: ["organization", "product", "location"].
- List of classification tasks. Each object has a task string (the classification label group name) and a labels array of candidate class strings.
- Dictionary of structure definitions for JSON extraction. Each key is a structure name; the value defines the shape of the output.
- List of relation definitions. Each object describes a directional relationship between entity types to extract.
Confidence threshold for returned predictions. Values range from 0 to 1. Lower values return more candidates at the cost of precision; higher values return fewer, higher-confidence results.
Example — NER with a fine-tuned model
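
A request of this kind might be assembled as follows. This is a sketch under stated assumptions rather than the documented request body: the host, the bearer-token header, and the field names `model_id` and `threshold` are not given on this page and are placeholders, and the helpers `build_ner_request` and `post_inference` are hypothetical. Only `text`, `schema`, the flat label array, and the `POST /inference` path come from the text.

```python
import json
import urllib.request

# Assumptions (not confirmed by this page): the host, the bearer-token
# header, and the field names "model_id" and "threshold".
BASE_URL = "https://api.example.com"


def build_ner_request(model_id, text, labels, threshold=0.5):
    """Return a JSON body for a simple NER call (flat label array as schema)."""
    if not 0 <= threshold <= 1:
        raise ValueError("threshold must be between 0 and 1")
    return {
        "model_id": model_id,      # assumed field name
        "text": text,
        "schema": list(labels),    # flat array of entity labels
        "threshold": threshold,    # assumed field name
    }


def post_inference(body, api_key):
    """POST the body to /inference and return the decoded JSON response."""
    request = urllib.request.Request(
        f"{BASE_URL}/inference",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Build (but do not send) a request against a fine-tuned model's job ID.
body = build_ner_request(
    "job_abc123",
    "Acme Corp launched the Rocket Skates line last week.",
    ["organization", "product"],
)
print(json.dumps(body, indent=2))
```

Separating payload construction from the network call keeps the body easy to inspect before anything is sent.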
Example — combined schema (entities + classifications)
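
A combined schema could be built along these lines. The top-level key names `entities` and `classifications` are inferred from this page's wording and should be treated as assumptions; the `task` and `labels` fields of each classification object are taken from the parameter description. The helper `build_combined_schema` is hypothetical.

```python
import json


def build_combined_schema(entities, classifications):
    """Combine NER labels and classification tasks into one schema object.

    `classifications` maps a task name to its candidate labels.
    The top-level key names "entities" and "classifications" are assumed.
    """
    return {
        "entities": list(entities),
        "classifications": [
            {"task": task, "labels": list(labels)}
            for task, labels in classifications.items()
        ],
    }


schema = build_combined_schema(
    entities=["organization", "product", "location"],
    classifications={"sentiment": ["positive", "negative", "neutral"]},
)
print(json.dumps(schema, indent=2))
```

The resulting object would be passed as the schema value of the inference request in place of a flat label array.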
Decoder models use a different request shape. Pass "task": "generate" and a messages array instead of text and schema:
Using a base model ID
If you haven’t fine-tuned a model yet, you can call a base model directly. Use a model ID from GET /base-models, such as fastino/gliner2-base-v1.
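
As an illustration, the sketch below prepares a batch request against the base model. The field name `model_id` is an assumption (this page does not show the parameter name), and `build_base_model_request` is a hypothetical helper; passing an array of strings as the text follows the batch behaviour described under the request parameters.

```python
import json

# Example base model ID from this page.
BASE_MODEL_ID = "fastino/gliner2-base-v1"


def build_base_model_request(texts, labels, model_id=BASE_MODEL_ID):
    """Return a JSON body that targets a base model instead of a job ID.

    `texts` may be a single string or a list of strings (batch inference).
    """
    if isinstance(texts, str):
        text_value = texts
    else:
        text_value = list(texts)
    return {
        "model_id": model_id,  # assumed field name
        "text": text_value,
        "schema": list(labels),
    }


batch_body = build_base_model_request(
    ["Acme Corp opened an office in Lisbon.", "Globex released Widget 2."],
    ["organization", "product", "location"],
)
print(json.dumps(batch_body, indent=2))
```

Swapping in a fine-tuned model later only requires passing its job ID as `model_id`.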
Related
- OpenAI-compatible inference — call Pioneer models through the OpenAI SDK
- Anthropic-compatible inference — call Pioneer models through the Anthropic SDK
- Inference history and feedback — retrieve past results and submit corrections
- Available models — encoder and decoder model catalog