Platform – Custom Vision Language Models (VLM)
To access the Ximilar VLM Platform, first register at Ximilar App to get your API token. This service is currently in beta and only available to selected users.
Ximilar's Vision Language Models (VLM) platform lets you train custom vision-language models for structured image analysis tasks. Unlike traditional image classification, VLMs can:
- generate structured outputs like JSON/YAML/XML/CSV responses with explanations
- analyze multiple images (video frames) at once
- accept metadata to guide the analysis
Instruction Fine-tuning
VLM training uses instruction fine-tuning – a technique where you teach the model to follow specific
instructions by providing example input-output pairs. Each dataset defines a template for the expected
outputs, including prompts and variables that represent the output schema. Training samples contain
images along with the annotated variable values that serve as ground truth. During training, the model
learns to generate structured outputs matching your template based on the visual content of the images.
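To make the relationship between a result template and a sample's annotated values concrete, here is a minimal sketch of how a ground-truth target string could be produced. It assumes the `{{name}}` placeholder syntax seen in the example templates in this reference; `render_template` is a hypothetical helper, not part of the Ximilar API, and the platform's own rendering may differ (for instance in how it escapes quotes).

```python
import json
import re

# Assumption: placeholders look like {{variable_name}}, as in the example templates.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_template(template: str, values: dict) -> str:
    """Substitute each {{name}} placeholder with the sample's annotated value."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no annotated value for variable '{name}'")
        # Naive conversion: values containing quotes would need escaping.
        return str(values[name])

    return PLACEHOLDER.sub(substitute, template)


template = '{"grade": {{grade}}, "explain": "{{explain}}"}'
target = render_template(template, {"grade": 8.5, "explain": "Minor edge wear."})
parsed = json.loads(target)  # the rendered target is valid JSON
```

During training, a string like `target` plays the role of the expected output for the sample's images.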
Key Concepts
The VLM system uses a hierarchical structure:
- Prompt: Reusable named prompt (system, user, or template) that can be shared across multiple tasks and datasets
- Task: Defines the AI model with references to system and user prompts, connects to multiple datasets
- Model: The result of the training process of a Task. A trained AI model with stored weights and metrics.
- Dataset: Collection of training samples with references to prompts and a result template that defines the output format
- Variable: Schema definition for output variables (type, constraints, validation rules)
- Sample: Individual training example with images and annotated variable values
Use Cases
- Product Description Generation: Generate structured product descriptions from images
- Quality Grading: Analyze items and generate quality grades with explanations
- Image Comparison: Compare multiple images and describe differences
- Structured Data Extraction: Extract specific data points from images in JSON format
- Advanced OCR: Extract structured text from images such as invoices, as an OCR-like system would
All Endpoints
https://api.ximilar.com/vlm/v2/prompt/
https://api.ximilar.com/vlm/v2/prompt/__PROMPT_ID__/
Prompt Endpoints
Prompts are reusable named text blocks that can be shared across multiple tasks and datasets. Instead of storing prompt text directly on tasks and datasets, you create prompt objects and reference them by ID. This allows you to update a prompt once and have all referencing tasks and datasets use the updated content.
Prompt Types
| Type | Description |
|---|---|
| system | System prompt that sets the AI model's behavior and role |
| user | User prompt (instruction) that tells the model what to do |
| template | Result template that defines the expected output format |
Prompt Formats
The format field is optional and indicates the desired output format. When not specified (null), no specific format is enforced.
| Format | Description |
|---|---|
| json | JSON output format |
| yaml | YAML output format |
| xml | XML output format |
| csv | CSV output format |
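The allowed prompt types and formats can be checked client-side before a prompt is created. The following is a minimal sketch under that assumption; `build_prompt_payload` is a hypothetical helper, and the field names (`name`, `type`, `content`, `description`, `format`) are taken from the Create Prompt attributes in this reference.

```python
from typing import Optional

# Allowed values as listed in the tables above.
PROMPT_TYPES = ("system", "user", "template")
PROMPT_FORMATS = ("json", "yaml", "xml", "csv")


def build_prompt_payload(
    name: str,
    prompt_type: str,
    content: Optional[str] = None,
    description: Optional[str] = None,
    output_format: Optional[str] = None,
) -> dict:
    """Validate type and format locally, then return a JSON-serializable body."""
    if prompt_type not in PROMPT_TYPES:
        raise ValueError(f"type must be one of {PROMPT_TYPES}, got {prompt_type!r}")
    if output_format is not None and output_format not in PROMPT_FORMATS:
        raise ValueError(f"format must be one of {PROMPT_FORMATS} or omitted")

    payload = {"name": name, "type": prompt_type}
    # Optional fields are only sent when provided.
    if content is not None:
        payload["content"] = content
    if description is not None:
        payload["description"] = description
    if output_format is not None:
        payload["format"] = output_format
    return payload


template_prompt = build_prompt_payload(
    name="Grading Result Template",
    prompt_type="template",
    content='{"grade": {{grade}}, "explain": "{{explain}}"}',
    output_format="json",
)
```

Leaving `output_format` unset keeps `format` out of the body, which corresponds to the "no specific format" case.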
List Prompts
List all VLM prompts in your workspace. Returns paginated results.
Required attributes
- Name
Authorization- Type
- string
- Description
Unique API token for authentication.
Optional attributes
- Name
search- Type
- string
- Description
Search prompts by name.
- Name
type- Type
- string
- Description
Filter by prompt type (system, user, or template).
Request
curl -v -XGET \
-H 'Authorization: Token __API_TOKEN__' \
https://api.ximilar.com/vlm/v2/prompt/
Response
{
"count": 3,
"next": null,
"previous": null,
"results": [
{
"id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"name": "Product Analysis System Prompt",
"description": "System prompt for product image analysis",
"type": "system",
"format": null,
"workspace": "748e50e4-d081-4924-b9e7-f500aac6a71d"
},
{
"id": "b2c3d4e5-f6a7-8901-bcde-f12345678901",
"name": "Analyze Image Instruction",
"description": "User instruction for analyzing images",
"type": "user",
"format": null,
"workspace": "748e50e4-d081-4924-b9e7-f500aac6a71d"
}
]
}
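Paginated list responses such as the one above can be walked by following `next` until it is null. The sketch below assumes `next` holds the full URL of the following page (the `?page=2` URL is made up for the demo) and takes the HTTP call as an injected function, so it runs without network access; `iter_results` is a hypothetical helper.

```python
from typing import Callable, Iterator


def iter_results(fetch_page: Callable[[str], dict], first_url: str) -> Iterator[dict]:
    """Yield every item from a paginated listing by following 'next' links."""
    url = first_url
    while url:
        page = fetch_page(url)
        yield from page.get("results", [])
        url = page.get("next")  # None (JSON null) on the last page


# Stand-in for real HTTP calls: two fake pages keyed by URL.
FIRST_URL = "https://api.ximilar.com/vlm/v2/prompt/"
SECOND_URL = "https://api.ximilar.com/vlm/v2/prompt/?page=2"  # hypothetical URL
FAKE_PAGES = {
    FIRST_URL: {
        "count": 3,
        "next": SECOND_URL,
        "previous": None,
        "results": [{"name": "Prompt A"}, {"name": "Prompt B"}],
    },
    SECOND_URL: {
        "count": 3,
        "next": None,
        "previous": FIRST_URL,
        "results": [{"name": "Prompt C"}],
    },
}

names = [item["name"] for item in iter_results(FAKE_PAGES.__getitem__, FIRST_URL)]
```

In real use, `fetch_page` would perform an authenticated GET and return the decoded JSON body.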
Get Prompt
Get details of a specific VLM prompt by its ID.
Required attributes
- Name
Authorization- Type
- string
- Description
Unique API token for authentication.
- Name
prompt_id- Type
- string
- Description
UUID of the prompt.
Returns
- Name
id- Type
- string
- Description
UUID of the prompt.
- Name
name- Type
- string
- Description
Name of the prompt.
- Name
description- Type
- string
- Description
Optional description.
- Name
content- Type
- string
- Description
The prompt text content.
- Name
type- Type
- string
- Description
Prompt type: system, user, or template.
- Name
format- Type
- string
- Description
Optional output format: json, yaml, xml, or csv. null when not specified.
Request
curl -v -XGET \
-H 'Authorization: Token __API_TOKEN__' \
https://api.ximilar.com/vlm/v2/prompt/__PROMPT_ID__/
Response
{
"id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"name": "Product Analysis System Prompt",
"description": "System prompt for product image analysis",
"content": "You are a helpful assistant that analyzes product images and provides structured data about them.",
"type": "system",
"format": null,
"workspace": "748e50e4-d081-4924-b9e7-f500aac6a71d"
}
Create Prompt
Create a new reusable prompt.
Required attributes
- Name
Authorization- Type
- string
- Description
Unique API token for authentication.
- Name
name- Type
- string
- Description
Name for the prompt.
- Name
type- Type
- string
- Description
Prompt type: system, user, or template.
Optional attributes
- Name
description- Type
- string
- Description
Human-readable description of the prompt.
- Name
content- Type
- string
- Description
The prompt text content.
- Name
format- Type
- string
- Description
Output format: json, yaml, xml, or csv. Omit for no specific format.
Request
curl -v -XPOST \
-H 'Authorization: Token __API_TOKEN__' \
-H 'Content-Type: application/json' \
-d '{
"name": "Product Analysis System Prompt",
"type": "system",
"description": "System prompt for product image analysis",
"content": "You are a helpful assistant that analyzes product images and provides structured data about them."
}' \
https://api.ximilar.com/vlm/v2/prompt/
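For readers calling the API from Python rather than curl, the same request can be assembled with the standard library. This is a sketch, not an official client: `build_api_request` is a hypothetical helper that only constructs the request object, and sending it is left to the caller.

```python
import json
import urllib.request
from typing import Optional

API_BASE = "https://api.ximilar.com/vlm/v2"


def build_api_request(
    method: str, path: str, token: str, body: Optional[dict] = None
) -> urllib.request.Request:
    """Build (but do not send) an authenticated request to the VLM API."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(f"{API_BASE}{path}", data=data, method=method)
    req.add_header("Authorization", f"Token {token}")
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


create_prompt = build_api_request(
    "POST",
    "/prompt/",
    "__API_TOKEN__",
    {
        "name": "Product Analysis System Prompt",
        "type": "system",
        "description": "System prompt for product image analysis",
    },
)
# To actually send it: urllib.request.urlopen(create_prompt)
```

The same helper covers the other endpoints by changing the method and path, for example `build_api_request("DELETE", "/prompt/__PROMPT_ID__/", token)`.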
Update Prompt
Update an existing prompt. All tasks and datasets referencing this prompt will automatically use the updated content.
Required attributes
- Name
Authorization- Type
- string
- Description
Unique API token for authentication.
- Name
prompt_id- Type
- string
- Description
UUID of the prompt to update.
Optional attributes
- Name
name- Type
- string
- Description
Updated name.
- Name
description- Type
- string
- Description
Updated description.
- Name
content- Type
- string
- Description
Updated prompt text content.
- Name
format- Type
- string
- Description
Updated output format.
Request
curl -v -XPATCH \
-H 'Authorization: Token __API_TOKEN__' \
-H 'Content-Type: application/json' \
-d '{
"content": "You are a helpful assistant that analyzes product images. Provide structured JSON output."
}' \
https://api.ximilar.com/vlm/v2/prompt/__PROMPT_ID__/
Delete Prompt
Delete a prompt. Any tasks or datasets referencing this prompt will have their prompt reference set to null.
Required attributes
- Name
Authorization- Type
- string
- Description
Unique API token for authentication.
- Name
prompt_id- Type
- string
- Description
UUID of the prompt to delete.
Request
curl -v -XDELETE \
-H 'Authorization: Token __API_TOKEN__' \
https://api.ximilar.com/vlm/v2/prompt/__PROMPT_ID__/
Task Endpoints
List Tasks
List all VLM tasks in your workspace. Returns paginated results.
Required attributes
- Name
Authorization- Type
- string
- Description
Unique API token for authentication.
Optional attributes
- Name
workspace- Type
- string
- Description
Filter by workspace ID.
- Name
search- Type
- string
- Description
Search tasks by name.
Request
curl -v -XGET \
-H 'Authorization: Token __API_TOKEN__' \
https://api.ximilar.com/vlm/v2/task/
Get Task
Get details of a specific VLM task by its ID.
Required attributes
- Name
Authorization- Type
- string
- Description
Unique API token for authentication.
- Name
task_id- Type
- string
- Description
UUID of the task.
Returns
- Name
id- Type
- string
- Description
UUID of the task.
- Name
name- Type
- string
- Description
Name of the task.
- Name
created- Type
- string
- Description
Timestamp when the task was created (ISO 8601 format).
- Name
description- Type
- string
- Description
Description of the task.
- Name
auto_deploy- Type
- boolean
- Description
Whether to automatically deploy the latest trained model version.
- Name
production_version- Type
- integer
- Description
Currently active model version.
- Name
last_version- Type
- integer
- Description
Latest trained model version number.
- Name
max_tokens- Type
- integer
- Description
Maximum number of tokens for model output.
- Name
datasets- Type
- array
- Description
List of dataset IDs connected to this task.
- Name
dataset_count- Type
- integer
- Description
Number of datasets connected to this task.
- Name
training_params- Type
- object
- Description
Training parameters (hyperparameters) for the model.
- Name
training_time- Type
- integer
- Description
Training time in seconds.
- Name
system_prompt_id- Type
- string
- Description
UUID of the referenced system prompt. Set to a prompt ID to assign a system prompt.
- Name
user_prompt_id- Type
- string
- Description
UUID of the referenced user prompt. Set to a prompt ID to assign a user prompt.
- Name
system_prompt- Type
- string
- Description
Resolved system prompt text content (read-only, derived from the referenced prompt).
- Name
user_prompt- Type
- string
- Description
Resolved user prompt text content (read-only, derived from the referenced prompt).
Request
curl -v -XGET \
-H 'Authorization: Token __API_TOKEN__' \
https://api.ximilar.com/vlm/v2/task/__TASK_ID__/
Response
{
"id": "dbb498ba-cf24-4400-9897-d5196444a880",
"name": "My Custom Task",
"created": "2025-12-17T14:59:34.529951Z",
"description": "Custom VLM task for structured image analysis",
"auto_deploy": true,
"production_version": 0,
"last_version": 0,
"max_tokens": 1000,
"datasets": ["8797c273-b1d3-4e6f-82bb-adfb719415fe"],
"dataset_count": 1,
"training_params": {},
"training_time": 180,
"system_prompt_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"user_prompt_id": "b2c3d4e5-f6a7-8901-bcde-f12345678901",
"system_prompt": "You are a helpful assistant...",
"user_prompt": "Analyse the image[s]...",
"workspace": "748e50e4-d081-4924-b9e7-f500aac6a71d"
}
Train Task
Start training a VLM model for the specified task. The training process uses all samples from connected datasets to fine-tune the vision-language model.
Required attributes
- Name
Authorization- Type
- string
- Description
Unique API token for authentication.
- Name
task_id- Type
- string
- Description
UUID of the task to train.
Request
curl -v -XPOST \
-H 'Authorization: Token __API_TOKEN__' \
https://api.ximilar.com/vlm/v2/task/__TASK_ID__/train/
Add Dataset to Task
Connect a dataset to a VLM task. A task can have multiple datasets for training.
Required attributes
- Name
Authorization- Type
- string
- Description
Unique API token for authentication.
- Name
task_id- Type
- string
- Description
UUID of the task.
- Name
dataset_id- Type
- string
- Description
UUID of the dataset to add.
Request
curl -v -XPOST \
-H 'Authorization: Token __API_TOKEN__' \
-H 'Content-Type: application/json' \
-d '{"dataset_id": "__DATASET_ID__"}' \
https://api.ximilar.com/vlm/v2/task/__TASK_ID__/add-dataset/
Remove Dataset from Task
Remove a dataset from a VLM task.
Required attributes
- Name
Authorization- Type
- string
- Description
Unique API token for authentication.
- Name
task_id- Type
- string
- Description
UUID of the task.
- Name
dataset_id- Type
- string
- Description
UUID of the dataset to remove.
Request
curl -v -XPOST \
-H 'Authorization: Token __API_TOKEN__' \
-H 'Content-Type: application/json' \
-d '{"dataset_id": "__DATASET_ID__"}' \
https://api.ximilar.com/vlm/v2/task/__TASK_ID__/remove-dataset/
Request Endpoints
Once a task has a trained, deployed model, you run inference by submitting an async request. Requests are processed in the background by a pool of workers: you submit a request, receive an id immediately, and then poll for the result (or receive it via a webhook).
The request body follows the OpenAI chat-completions messages format, so you can send one or more images together with optional prompt instructions.
Request Lifecycle
A request moves through a series of statuses. Submit returns CREATED; the result is ready once the status reaches DONE.
| Status | Description |
|---|---|
| CREATED | Submitted and waiting in the queue to be picked up by a worker. |
| PROCESSING | A worker has claimed the request and the model is generating the response. |
| RETRY | A transient failure occurred (e.g. rate limit or GPU OOM); the request will be retried automatically. |
| DONE | Completed successfully; the response field holds the model output. |
| FAILED | Processing failed after exhausting all retries. |
| API_LIMIT_EXCEEDED | Rate limit hit or insufficient credits. |
| NOT_SUPPORTED | The request or the model is not supported. |
Completed requests are retained for a limited time (2 weeks by default) and then automatically cleaned up. Download or store the response if you need it long-term.
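The lifecycle above suggests a simple polling loop: keep checking the status until it reaches one of the terminal states (DONE, FAILED, API_LIMIT_EXCEEDED, NOT_SUPPORTED). The sketch below is one possible client-side loop; `wait_for_request`, the interval, and the timeout are illustrative choices rather than documented recommendations, and the status lookup is injected so the example runs offline.

```python
import time
from typing import Callable

# Statuses after which the request will not change on its own.
TERMINAL_STATUSES = {"DONE", "FAILED", "API_LIMIT_EXCEEDED", "NOT_SUPPORTED"}


def wait_for_request(
    get_status: Callable[[], str],
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until a terminal status is seen or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"request still {status} after {timeout} seconds")
        sleep(interval)  # CREATED, PROCESSING and RETRY all mean "keep waiting"


# Offline demo: a scripted sequence of statuses instead of real API calls.
scripted = iter(["CREATED", "PROCESSING", "RETRY", "PROCESSING", "DONE"])
final_status = wait_for_request(lambda: next(scripted), sleep=lambda seconds: None)
```

In practice `get_status` would call the lightweight status endpoint and return the `status` field of its JSON response.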
Prompt Resolution
You usually do not need to send any prompt text — the model uses the system and user prompts configured on the task. When you do want to override them for a single request, the prompt is resolved in this order of priority:
1. system_prompt_id / user_prompt_id: a reference to a saved prompt. Highest priority.
2. Inline text in messages: a system message and/or a user text part written directly in the request.
3. Task default: the prompt referenced by the task. Used when neither of the above is provided.
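The priority order can be mirrored in a few lines of client-side code, which is handy for predicting which prompt a request will end up using. This is only a sketch of the documented order; the real resolution happens on the server, and `resolve_prompt` is a hypothetical helper.

```python
from typing import Optional


def resolve_prompt(
    saved_prompt: Optional[str],
    inline_text: Optional[str],
    task_default: Optional[str],
) -> Optional[str]:
    """Return the first available prompt: saved reference, inline text, task default."""
    for candidate in (saved_prompt, inline_text, task_default):
        if candidate is not None:
            return candidate
    return None


# A saved prompt referenced by ID wins over inline text and the task default.
chosen = resolve_prompt(
    saved_prompt="Grade the card from 0 to 10.",
    inline_text="Describe the card.",
    task_default="Analyse the image[s]...",
)
# With no overrides, the task's own prompt is used.
fallback = resolve_prompt(None, None, "Analyse the image[s]...")
```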
Submit Request
Submit a new async inference request for a trained task. Returns the created request immediately with status CREATED — poll Get Request Status or Get Request for the result.
Required attributes
- Name
Authorization- Type
- string
- Description
Unique API token for authentication.
- Name
task_id- Type
- string
- Description
UUID of the VLM task to run inference on. The task must have a trained model.
- Name
request- Type
- object
- Description
The inference payload in OpenAI chat-completions format. Contains a messages array with the image(s) and any inline prompt text. Provide between 1 and 10 images per request.
Optional attributes
- Name
version- Type
- integer
- Description
Model version to use. Defaults to the task's current production_version.
- Name
system_prompt_id- Type
- string
- Description
UUID of a saved prompt to use as the system prompt for this request. Takes priority over the task's system prompt and any inline system message.
- Name
user_prompt_id- Type
- string
- Description
UUID of a saved prompt to use as the user prompt (instruction) for this request. Takes priority over the task's user prompt and any inline user text.
- Name
webhook- Type
- object
- Description
Callback target for the result, e.g. {"url": "https://...", "headers": {...}}. When set, the response is POSTed to this URL once the request is DONE.
- Name
priority- Type
- integer
- Description
Queue priority (higher is processed first). Defaults to your account priority.
Returns
- Name
id- Type
- string
- Description
UUID of the created request. Use it to poll for the result.
- Name
status- Type
- string
- Description
Current status: CREATED right after submission.
- Name
task_id- Type
- string
- Description
UUID of the task the request was submitted to.
- Name
version- Type
- integer
- Description
Model version the request will be processed with.
- Name
created- Type
- string
- Description
Timestamp when the request was created (ISO 8601 format).
- Name
workspace- Type
- string
- Description
UUID of the workspace the request belongs to.
Request
curl -v -XPOST \
-H 'Authorization: Token __API_TOKEN__' \
-H 'Content-Type: application/json' \
-d '{
"task_id": "__TASK_ID__",
"request": {
"messages": [
{
"role": "user",
"content": [
{"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}}
]
}
]
}
}' \
https://api.ximilar.com/vlm/v2/request/
Response
{
"id": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
"status": "CREATED",
"task_id": "dbb498ba-cf24-4400-9897-d5196444a880",
"version": 2,
"created": "2026-01-15T09:21:04.118273Z",
"workspace": "748e50e4-d081-4924-b9e7-f500aac6a71d"
}
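Building the submit body by hand gets repetitive once several images or an inline instruction are involved. Below is a small sketch of a body builder that enforces the documented limit of 1 to 10 images; `build_inference_body` is a hypothetical helper, and placing the instruction text before the images is a choice made here, not a documented requirement.

```python
from typing import List, Optional


def build_inference_body(
    task_id: str,
    image_urls: List[str],
    instruction: Optional[str] = None,
    version: Optional[int] = None,
    webhook_url: Optional[str] = None,
) -> dict:
    """Assemble the JSON body for submitting an inference request."""
    if not 1 <= len(image_urls) <= 10:
        raise ValueError("provide between 1 and 10 images per request")

    content = []
    if instruction:
        # Inline user text; a saved user_prompt_id would take priority over it.
        content.append({"type": "text", "text": instruction})
    for url in image_urls:
        content.append({"type": "image_url", "image_url": {"url": url}})

    body = {
        "task_id": task_id,
        "request": {"messages": [{"role": "user", "content": content}]},
    }
    if version is not None:
        body["version"] = version
    if webhook_url is not None:
        body["webhook"] = {"url": webhook_url}
    return body


body = build_inference_body(
    "__TASK_ID__",
    ["https://example.com/front.jpg", "https://example.com/back.jpg"],
    instruction="Describe this product and estimate its price.",
)
```

The resulting dictionary can be serialized with `json.dumps` and sent as the POST body.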
Get Request Status
Get only the status of a request. This is a lightweight endpoint intended for polling — it does not return the (potentially large) request and response payloads.
Required attributes
- Name
Authorization- Type
- string
- Description
Unique API token for authentication.
- Name
request_id- Type
- string
- Description
UUID of the request.
Returns
- Name
status- Type
- string
- Description
Current status of the request (see Request Lifecycle).
Request
curl -v -XGET \
-H 'Authorization: Token __API_TOKEN__' \
https://api.ximilar.com/vlm/v2/request/__REQUEST_ID__/status/
Response
{
"status": "DONE"
}
Get Request
Get the full detail of a request, including the original request payload and, once the request is DONE, the model response and token usage.
Required attributes
- Name
Authorization- Type
- string
- Description
Unique API token for authentication.
- Name
request_id- Type
- string
- Description
UUID of the request.
Returns
- Name
id- Type
- string
- Description
UUID of the request.
- Name
status- Type
- string
- Description
Current status of the request.
- Name
task_id- Type
- string
- Description
UUID of the task.
- Name
version- Type
- integer
- Description
Model version used.
- Name
request- Type
- object
- Description
The original inference payload that was submitted.
- Name
response- Type
- object
- Description
The model output. null until the request is DONE (or an error object on failure).
- Name
usage- Type
- object
- Description
Token usage statistics (e.g. prompt_tokens, completion_tokens, total_tokens, processing_time). null until the request completes.
- Name
system_prompt_id- Type
- string
- Description
Saved system prompt override applied to this request, if any.
- Name
user_prompt_id- Type
- string
- Description
Saved user prompt override applied to this request, if any.
- Name
webhook- Type
- object
- Description
Webhook target the result is/was delivered to, if configured.
- Name
retry_count- Type
- integer
- Description
Number of times the request has been retried.
- Name
created- Type
- string
- Description
Timestamp when the request was created (ISO 8601 format).
- Name
started_at- Type
- string
- Description
Timestamp when a worker started processing the request.
- Name
completed_at- Type
- string
- Description
Timestamp when the request reached a terminal state.
- Name
workspace- Type
- string
- Description
UUID of the workspace the request belongs to.
Request
curl -v -XGET \
-H 'Authorization: Token __API_TOKEN__' \
https://api.ximilar.com/vlm/v2/request/__REQUEST_ID__/
Response
{
"id": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
"status": "DONE",
"task_id": "dbb498ba-cf24-4400-9897-d5196444a880",
"version": 2,
"request": {
"messages": [
{
"role": "user",
"content": [
{"type": "text", "text": "Describe this product and estimate its price."},
{"type": "image_url", "image_url": {"url": "https://example.com/front.jpg"}}
]
}
]
},
"response": {"description": "A wooden puzzle for kids.", "price": 8.99},
"usage": {
"prompt_tokens": 27,
"completion_tokens": 92,
"total_tokens": 359,
"processing_time": 3.33
},
"system_prompt_id": null,
"user_prompt_id": null,
"webhook": null,
"retry_count": 0,
"created": "2026-01-15T09:21:04.118273Z",
"started_at": "2026-01-15T09:21:06.402551Z",
"completed_at": "2026-01-15T09:21:09.731904Z",
"workspace": "748e50e4-d081-4924-b9e7-f500aac6a71d"
}
List Requests
List requests in your workspace. Returns paginated results with large payload fields omitted for performance.
Required attributes
- Name
Authorization- Type
- string
- Description
Unique API token for authentication.
Optional attributes
- Name
workspace- Type
- string
- Description
Workspace UUID. Defaults to your default workspace.
- Name
status- Type
- string
- Description
Filter by status, e.g. CREATED, PROCESSING, DONE, FAILED.
- Name
task_id- Type
- string
- Description
Filter by task UUID.
- Name
page_size- Type
- integer
- Description
Number of results per page.
Request
curl -v -XGET \
-H 'Authorization: Token __API_TOKEN__' \
'https://api.ximilar.com/vlm/v2/request/?status=DONE&task_id=__TASK_ID__'
Request History
List completed (DONE) requests, including their response and usage. Uses fast cursor-based pagination — follow the next cursor returned in the response to page through results.
Required attributes
- Name
Authorization- Type
- string
- Description
Unique API token for authentication.
Optional attributes
- Name
task_id- Type
- string
- Description
Filter history by task UUID.
- Name
page_size- Type
- integer
- Description
Number of results per page (default 20, max 100).
- Name
cursor- Type
- string
- Description
Pagination cursor returned in a previous response.
Request
curl -v -XGET \
-H 'Authorization: Token __API_TOKEN__' \
'https://api.ximilar.com/vlm/v2/request/history/?task_id=__TASK_ID__&page_size=20'
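The exact shape of the history response is not shown here, so the following is a sketch under stated assumptions: it handles both a `next` value that is a full URL carrying a `cursor` query parameter and one that is the bare cursor string. `extract_cursor` and `history_url` are hypothetical helpers.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

HISTORY_URL = "https://api.ximilar.com/vlm/v2/request/history/"


def extract_cursor(next_value: Optional[str]) -> Optional[str]:
    """Get the cursor from a 'next' value (assumed: full URL or bare cursor)."""
    if not next_value:
        return None
    parsed = urlparse(next_value)
    if parsed.scheme and parsed.query:
        values = parse_qs(parsed.query).get("cursor")
        return values[0] if values else None
    return next_value  # assumed to already be the cursor itself


def history_url(task_id: str, page_size: int = 20, cursor: Optional[str] = None) -> str:
    """Build the history URL; page_size is documented as default 20, max 100."""
    if not 1 <= page_size <= 100:
        raise ValueError("page_size must be between 1 and 100")
    params = {"task_id": task_id, "page_size": page_size}
    if cursor:
        params["cursor"] = cursor
    return f"{HISTORY_URL}?{urlencode(params)}"


cursor = extract_cursor(f"{HISTORY_URL}?cursor=abc123&page_size=20")
next_page_url = history_url("__TASK_ID__", page_size=20, cursor=cursor)
```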
Resubmit Request
Re-queue a request that finished in a terminal state (DONE, FAILED, API_LIMIT_EXCEEDED, or NOT_SUPPORTED). The status is reset to CREATED and the previous response, usage, and timing fields are cleared so the request is processed again.
Required attributes
- Name
Authorization- Type
- string
- Description
Unique API token for authentication.
- Name
request_id- Type
- string
- Description
UUID of the request to resubmit.
Request
curl -v -XPOST \
-H 'Authorization: Token __API_TOKEN__' \
https://api.ximilar.com/vlm/v2/request/__REQUEST_ID__/resubmit/
Delete Request
Delete a request. A request that is currently PROCESSING cannot be deleted, and a DONE request can only be deleted after it has been billed.
Required attributes
- Name
Authorization- Type
- string
- Description
Unique API token for authentication.
- Name
request_id- Type
- string
- Description
UUID of the request to delete.
Request
curl -v -XDELETE \
-H 'Authorization: Token __API_TOKEN__' \
https://api.ximilar.com/vlm/v2/request/__REQUEST_ID__/
Dataset Endpoints
List Datasets
List all VLM datasets in your workspace. Returns paginated results.
Required attributes
- Name
Authorization- Type
- string
- Description
Unique API token for authentication.
Request
curl -v -XGET \
-H 'Authorization: Token __API_TOKEN__' \
https://api.ximilar.com/vlm/v2/dataset/
Get Dataset
Get details of a specific VLM dataset by its ID.
Required attributes
- Name
Authorization- Type
- string
- Description
Unique API token for authentication.
- Name
dataset_id- Type
- string
- Description
UUID of the dataset.
Returns
- Name
id- Type
- string
- Description
UUID of the dataset.
- Name
name- Type
- string
- Description
Name of the dataset.
- Name
description- Type
- string
- Description
Description of the dataset.
- Name
version- Type
- string
- Description
Auto-generated version date (updated on each save).
- Name
created_date- Type
- string
- Description
Timestamp when the dataset was created (ISO 8601 format).
- Name
result_format- Type
- string
- Description
Resolved result format from the referenced template prompt, or the dataset's own format. One of json, yaml, xml, csv, or null.
- Name
default_input_meta_data- Type
- object
- Description
Default input metadata for samples in this dataset. Used to pre-populate a sample's input_meta_data on creation.
- Name
meta_data- Type
- object
- Description
Additional metadata attached to the dataset.
- Name
samples_count- Type
- integer
- Description
Number of samples in this dataset.
- Name
variables_count- Type
- integer
- Description
Number of variables defined for this dataset.
- Name
system_prompt_id- Type
- string
- Description
UUID of the referenced system prompt (can override task's prompt).
- Name
user_prompt_id- Type
- string
- Description
UUID of the referenced user prompt.
- Name
result_template_id- Type
- string
- Description
UUID of the referenced result template prompt.
- Name
system_prompt- Type
- string
- Description
Resolved system prompt text content (read-only, derived from the referenced prompt).
- Name
user_prompt- Type
- string
- Description
Resolved user prompt text content (read-only, derived from the referenced prompt).
- Name
result_template- Type
- string
- Description
Resolved result template content (read-only, derived from the referenced prompt).
Request
curl -v -XGET \
-H 'Authorization: Token __API_TOKEN__' \
https://api.ximilar.com/vlm/v2/dataset/__DATASET_ID__/
Response
{
"id": "8797c273-b1d3-4e6f-82bb-adfb719415fe",
"name": "My Training Dataset",
"description": "Training samples for the task",
"version": "2025-12-21",
"created_date": "2025-12-17T10:30:00.000000Z",
"result_format": "json",
"default_input_meta_data": {"category": "jacket"},
"meta_data": {},
"samples_count": 150,
"variables_count": 2,
"system_prompt_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"user_prompt_id": "b2c3d4e5-f6a7-8901-bcde-f12345678901",
"result_template_id": "05dea907-3a59-4f77-93f0-07a922422bbf",
"system_prompt": "You are a helpful assistant...",
"user_prompt": "Analyse the image[s]...",
"result_template": "{\"grade\": {{grade}}, \"explain\": \"{{explain}}\"}",
"workspace": "748e50e4-d081-4924-b9e7-f500aac6a71d"
}
Variable Endpoints
Variables define the schema for your dataset's output format. Each variable has a type and validation constraints.
Supported Variable Types
| Type | Description |
|---|---|
| string | Text values with optional min/max length |
| integer | Whole numbers with optional min/max value |
| float | Decimal numbers with optional min/max value and step size |
| boolean | True/false values |
| array | List of values |
| object | Nested JSON objects |
List Variables
List all variables, optionally filtered by dataset.
Required attributes
- Name
Authorization- Type
- string
- Description
Unique API token for authentication.
Optional attributes
- Name
dataset- Type
- string
- Description
Filter variables by dataset ID.
- Name
page_size- Type
- integer
- Description
Number of results per page.
Request
curl -v -XGET \
-H 'Authorization: Token __API_TOKEN__' \
'https://api.ximilar.com/vlm/v2/variable/?dataset=__DATASET_ID__'
Response
{
"count": 3,
"next": null,
"previous": null,
"results": [
{
"id": "f82b01ed-e65d-4730-a458-2966cbf86994",
"dataset": "2e3a2346-fa3b-43ca-8a4c-168941692c58",
"dataset_name": "My Training Dataset",
"name": "grade",
"type": "float",
"required": true,
"min_value": 0.0,
"max_value": 10.0,
"step_size": 0.5
},
{
"id": "7b661f0e-7c00-4b59-b334-aca2a0249e2f",
"dataset": "2e3a2346-fa3b-43ca-8a4c-168941692c58",
"dataset_name": "My Training Dataset",
"name": "explain",
"type": "string",
"required": false
}
]
}
Get Variable
Get details of a specific variable by its ID.
Required attributes
- Name
Authorization- Type
- string
- Description
Unique API token for authentication.
- Name
variable_id- Type
- string
- Description
UUID of the variable.
Request
curl -v -XGET \
-H 'Authorization: Token __API_TOKEN__' \
https://api.ximilar.com/vlm/v2/variable/__VARIABLE_ID__/
Create Variable
Create a new variable for a dataset.
Required attributes
- Name
Authorization- Type
- string
- Description
Unique API token for authentication.
- Name
dataset- Type
- string
- Description
UUID of the dataset this variable belongs to.
- Name
name- Type
- string
- Description
Variable name (must follow programming naming conventions).
- Name
type- Type
- string
- Description
Variable type: string, integer, float, boolean, array, or object.
Optional attributes
- Name
required- Type
- boolean
- Description
Whether this variable must be provided (default: false).
- Name
description- Type
- string
- Description
Human-readable description.
- Name
choices- Type
- string
- Description
Comma-separated list of allowed values (for string type).
- Name
min_value- Type
- number
- Description
Minimum value (for numeric types).
- Name
max_value- Type
- number
- Description
Maximum value (for numeric types).
- Name
step_size- Type
- number
- Description
Step size (for float type).
- Name
min_length- Type
- integer
- Description
Minimum length (for string/array types).
- Name
max_length- Type
- integer
- Description
Maximum length (for string/array types).
- Name
default_value- Type
- any
- Description
Default value if not provided.
Request
curl -v -XPOST \
-H 'Authorization: Token __API_TOKEN__' \
-H 'Content-Type: application/json' \
-d '{
"dataset": "__DATASET_ID__",
"name": "condition",
"type": "string",
"required": false,
"description": "Condition of the item",
"choices": "mint,near_mint,excellent,good,poor"
}' \
https://api.ximilar.com/vlm/v2/variable/
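Variable constraints can also be checked locally before annotating samples. The sketch below applies the documented fields (`type`, `choices`, `min_value`, `max_value`, `step_size`, `min_length`, `max_length`) to a single value. It is an illustration, not the platform's validator: `validate_value` is hypothetical, and counting steps from `min_value` (or 0 when unset) is an assumption.

```python
from typing import Any, Dict, List

# bool is excluded from the numeric checks because it is a subclass of int in Python.
TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "float": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "boolean": lambda v: isinstance(v, bool),
    "array": lambda v: isinstance(v, list),
    "object": lambda v: isinstance(v, dict),
}


def validate_value(value: Any, config: Dict[str, Any]) -> List[str]:
    """Return a list of constraint violations (empty when the value is valid)."""
    var_type = config["type"]
    if not TYPE_CHECKS[var_type](value):
        return [f"expected {var_type}, got {type(value).__name__}"]

    errors: List[str] = []
    if var_type in ("integer", "float"):
        low, high = config.get("min_value"), config.get("max_value")
        if low is not None and value < low:
            errors.append(f"{value} is below min_value {low}")
        if high is not None and value > high:
            errors.append(f"{value} is above max_value {high}")
        step = config.get("step_size")
        if var_type == "float" and step:
            # Assumption: steps are counted from min_value, or from 0 if unset.
            steps = (value - (low or 0.0)) / step
            if abs(steps - round(steps)) > 1e-9:
                errors.append(f"{value} is not a multiple of step_size {step}")

    if var_type in ("string", "array"):
        min_len, max_len = config.get("min_length"), config.get("max_length")
        if min_len is not None and len(value) < min_len:
            errors.append(f"length {len(value)} is below min_length {min_len}")
        if max_len is not None and len(value) > max_len:
            errors.append(f"length {len(value)} is above max_length {max_len}")

    if var_type == "string" and config.get("choices"):
        allowed = [choice.strip() for choice in config["choices"].split(",")]
        if value not in allowed:
            errors.append(f"{value!r} is not one of {allowed}")
    return errors


condition = {"type": "string", "choices": "mint,near_mint,excellent,good,poor"}
grade = {"type": "float", "min_value": 0.0, "max_value": 10.0, "step_size": 0.5}
condition_errors = validate_value("good", condition)
grade_errors = validate_value(8.3, grade)
```

An empty list means the value satisfies every constraint that the configuration defines.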
Sample Endpoints
Samples are individual training examples consisting of one or more images and annotated variable values.
List Samples
List all samples, optionally filtered by dataset.
Required attributes
- Name
Authorization- Type
- string
- Description
Unique API token for authentication.
Optional attributes
- Name
dataset- Type
- string
- Description
Filter samples by dataset ID.
- Name
test- Type
- boolean
- Description
Filter by test/training samples.
- Name
search- Type
- string
- Description
Search samples by name.
Request
curl -v -XGET \
-H 'Authorization: Token __API_TOKEN__' \
'https://api.ximilar.com/vlm/v2/sample/?dataset=__DATASET_ID__'
Get Sample
Get details of a specific sample including images and variable values.
Required attributes
- Name
Authorization- Type
- string
- Description
Unique API token for authentication.
- Name
sample_id- Type
- string
- Description
UUID of the sample.
Returns
- Name
id- Type
- string
- Description
UUID of the sample.
- Name
dataset- Type
- string
- Description
UUID of the dataset this sample belongs to.
- Name
dataset_name- Type
- string
- Description
Name of the parent dataset (read-only).
- Name
test- Type
- boolean
- Description
Whether this is a test sample.
- Name
name- Type
- string
- Description
Optional name of the sample.
- Name
description- Type
- string
- Description
Optional description of the sample.
- Name
type- Type
- string
- Description
Sample type: single, multi_random, or multi_ordered.
- Name
created_date- Type
- string
- Description
Timestamp when the sample was created (ISO 8601 format).
- Name
result_template- Type
- string
- Description
Result template for this sample. Falls back to the dataset's result template if not set.
- Name
user_prompt- Type
- string
- Description
User prompt for this sample. Falls back to the dataset's user prompt if not set.
- Name
input_meta_data- Type
- object
- Description
Input metadata for this sample. Can be used in the user prompt during training and inference.
- Name
meta_data- Type
- object
- Description
Additional metadata attached to the sample.
- Name
images_count- Type
- integer
- Description
Number of images in this sample.
- Name
objects_count- Type
- integer
- Description
Number of detection objects in this sample.
- Name
variables_count- Type
- integer
- Description
Number of variable values annotated for this sample.
- Name
dataset_variables_config- Type
- object
- Description
Configuration of all dataset variables (type, constraints, etc.) keyed by variable name.
Request
curl -v -XGET \
-H 'Authorization: Token __API_TOKEN__' \
https://api.ximilar.com/vlm/v2/sample/__SAMPLE_ID__/
Response
{
"id": "c3d4e5f6-a7b8-9012-cdef-345678901234",
"dataset": "8797c273-b1d3-4e6f-82bb-adfb719415fe",
"dataset_name": "My Training Dataset",
"test": false,
"name": "Sample 1",
"description": null,
"type": "multi_random",
"created_date": "2025-12-18T09:15:00.000000Z",
"result_template": "{\"grade\": {{grade}}, \"explain\": \"{{explain}}\"}",
"user_prompt": "Analyse the image[s]...",
"input_meta_data": {"style": "elegant", "category": "jacket"},
"meta_data": {},
"images_count": 2,
"objects_count": 0,
"variables_count": 2,
"dataset_variables_config": {
"grade": {
"id": "f82b01ed-e65d-4730-a458-2966cbf86994",
"type": "float",
"required": true,
"min_value": 0.0,
"max_value": 10.0,
"step_size": 0.5
},
"explain": {
"id": "7b661f0e-7c00-4b59-b334-aca2a0249e2f",
"type": "string",
"required": false
}
},
"workspace": "748e50e4-d081-4924-b9e7-f500aac6a71d"
}
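The `dataset_variables_config` object in the response above carries enough information to check an annotation on the client before sending it. The following is a minimal sketch, assuming that `min_value` and `max_value` are inclusive bounds and that `step_size` restricts values to a grid starting at `min_value`; this reference does not confirm those semantics, and `validate_value` is a hypothetical helper, not part of the Ximilar client.

```python
import math

# Copied from the example response above (ids omitted for brevity).
VARIABLES_CONFIG = {
    "grade": {"type": "float", "required": True,
              "min_value": 0.0, "max_value": 10.0, "step_size": 0.5},
    "explain": {"type": "string", "required": False},
}


def validate_value(config, name, value):
    """Return a list of problems found for one variable value (empty if none)."""
    spec = config.get(name)
    if spec is None:
        return [f"unknown variable: {name}"]
    if value is None:
        return [f"{name} is required"] if spec.get("required") else []
    problems = []
    if spec["type"] == "float":
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return [f"{name} must be a number"]
        low = spec.get("min_value")
        high = spec.get("max_value")
        step = spec.get("step_size")
        if low is not None and value < low:
            problems.append(f"{name} is below min_value {low}")
        if high is not None and value > high:
            problems.append(f"{name} is above max_value {high}")
        if step:
            # Assumption: allowed values are min_value + k * step_size.
            steps = (value - (low or 0.0)) / step
            if not math.isclose(steps, round(steps), abs_tol=1e-9):
                problems.append(f"{name} is not a multiple of step_size {step}")
    elif spec["type"] == "string" and not isinstance(value, str):
        problems.append(f"{name} must be a string")
    return problems


print(validate_value(VARIABLES_CONFIG, "grade", 8.5))
print(validate_value(VARIABLES_CONFIG, "grade", 8.3))
```

Only the `float` and `string` types shown in the example response are handled; other variable types would pass through unchecked.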
Create Sample
Create a new sample in a dataset.
Required attributes
- Name
Authorization- Type
- string
- Description
Unique API token for authentication.
- Name
dataset- Type
- string
- Description
UUID of the dataset this sample belongs to.
Optional attributes
- Name
name- Type
- string
- Description
Optional name for the sample.
- Name
description- Type
- string
- Description
Optional description for the sample.
- Name
test- Type
- boolean
- Description
Whether this is a test sample (default: false).
- Name
type- Type
- string
- Description
Sample type:
single, multi_random (default), or multi_ordered.
- Name
input_meta_data- Type
- object
- Description
Input metadata for this sample. Can be used in the user prompt during training and inference. If not provided, it is auto-populated from the dataset's
default_input_meta_data and variables extracted from the user prompt.
- Name
meta_data- Type
- object
- Description
Additional metadata to attach to the sample.
Request
curl -v -XPOST \
-H 'Authorization: Token __API_TOKEN__' \
-H 'Content-Type: application/json' \
-d '{
"dataset": "__DATASET_ID__",
"name": "Sample 1",
"test": false
}' \
https://api.ximilar.com/vlm/v2/sample/
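The same request can be assembled in Python with the standard library only. This is a sketch rather than the official client: `build_create_sample_request` is a hypothetical helper that mirrors the curl call above and accepts only the optional attributes listed in this section. It builds the request without sending it.

```python
import json
import urllib.request

API_ROOT = "https://api.ximilar.com/vlm/v2"

# Optional attributes documented for Create Sample.
OPTIONAL_ATTRIBUTES = {"name", "description", "test", "type",
                       "input_meta_data", "meta_data"}


def build_create_sample_request(token, dataset_id, **optional):
    """Build (but do not send) the POST request that creates a sample."""
    unknown = set(optional) - OPTIONAL_ATTRIBUTES
    if unknown:
        raise ValueError(f"unsupported attributes: {sorted(unknown)}")
    payload = {"dataset": dataset_id, **optional}
    return urllib.request.Request(
        url=f"{API_ROOT}/sample/",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_create_sample_request(
    "__API_TOKEN__", "__DATASET_ID__", name="Sample 1", test=False
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request)
```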
Add Images to Sample
Add images to an existing sample. Images must already be uploaded to your workspace.
Required attributes
- Name
Authorization- Type
- string
- Description
Unique API token for authentication.
- Name
sample_id- Type
- string
- Description
UUID of the sample.
- Name
image_ids- Type
- array
- Description
List of image UUIDs to add.
Request
curl -v -XPOST \
-H 'Authorization: Token __API_TOKEN__' \
-H 'Content-Type: application/json' \
-d '{"image_ids": ["__IMAGE_ID_1__", "__IMAGE_ID_2__"]}' \
https://api.ximilar.com/vlm/v2/sample/__SAMPLE_ID__/add-images/
Add Detection Objects to Sample
Add detection objects to an existing sample. Detection objects must already exist in your workspace. A sample can have at most 10 detection objects.
Required attributes
- Name
Authorization- Type
- string
- Description
Unique API token for authentication.
- Name
sample_id- Type
- string
- Description
UUID of the sample.
- Name
detection_object_ids- Type
- array
- Description
List of detection object UUIDs to add.
Request
curl -v -XPOST \
-H 'Authorization: Token __API_TOKEN__' \
-H 'Content-Type: application/json' \
-d '{"detection_object_ids": ["__OBJECT_ID_1__", "__OBJECT_ID_2__"]}' \
https://api.ximilar.com/vlm/v2/sample/__SAMPLE_ID__/add-objects/
Response
{
"added": 2
}
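Because a sample can hold at most 10 detection objects, it may help to check capacity before calling `add-objects`. The sketch below assumes you already know the sample's current `objects_count` (for example from the Get Sample response) and that none of the new IDs are attached yet. How the API responds when the limit is exceeded is not described here, and `plan_object_additions` is a hypothetical helper.

```python
MAX_OBJECTS_PER_SAMPLE = 10  # limit stated in this section


def plan_object_additions(objects_count, new_object_ids):
    """Split new IDs into those that fit under the limit and those that do not."""
    unique_ids = list(dict.fromkeys(new_object_ids))  # drop duplicates, keep order
    free_slots = max(0, MAX_OBJECTS_PER_SAMPLE - objects_count)
    return unique_ids[:free_slots], unique_ids[free_slots:]


to_add, rejected = plan_object_additions(8, ["obj-a", "obj-b", "obj-c"])
print(to_add, rejected)
```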
Remove Detection Objects from Sample
Remove detection objects from a sample.
Required attributes
- Name
Authorization- Type
- string
- Description
Unique API token for authentication.
- Name
sample_id- Type
- string
- Description
UUID of the sample.
- Name
detection_object_ids- Type
- array
- Description
List of detection object UUIDs to remove.
Request
curl -v -XPOST \
-H 'Authorization: Token __API_TOKEN__' \
-H 'Content-Type: application/json' \
-d '{"detection_object_ids": ["__OBJECT_ID_1__"]}' \
https://api.ximilar.com/vlm/v2/sample/__SAMPLE_ID__/remove-objects/
Response
{
"removed": 1
}
List Sample Images
List all images attached to a sample with their metadata (text, order, resize settings).
Required attributes
- Name
Authorization- Type
- string
- Description
Unique API token for authentication.
- Name
sample_id- Type
- string
- Description
UUID of the sample.
Returns (each item)
- Name
id- Type
- string
- Description
UUID of the sample-image relation.
- Name
image_id- Type
- string
- Description
UUID of the image.
- Name
img_path- Type
- string
- Description
Full URL of the image.
- Name
thumb- Type
- string
- Description
URL of the image thumbnail.
- Name
text- Type
- string
- Description
Optional text associated with this image in the sample.
- Name
order- Type
- integer
- Description
Order of the image within the sample.
- Name
resize- Type
- string
- Description
Resize setting for this image.
- Name
meta_data- Type
- object
- Description
Additional metadata for this sample image.
Request
curl -v -XGET \
-H 'Authorization: Token __API_TOKEN__' \
https://api.ximilar.com/vlm/v2/sample/__SAMPLE_ID__/sample-images/
Response
[
{
"id": "d4e5f6a7-b8c9-0123-def4-567890123456",
"image_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"img_path": "https://images.ximilar.com/...",
"thumb": "https://images.ximilar.com/.../thumb",
"text": null,
"order": 0,
"resize": null,
"meta_data": {},
"workspace": "748e50e4-d081-4924-b9e7-f500aac6a71d"
}
]
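When image order matters (for instance with `multi_ordered` samples), the `order` field can be used to sort the returned items on the client. The sketch below assumes the list is not guaranteed to arrive sorted; the response body is a shortened, hypothetical example with placeholder IDs.

```python
import json

# Shortened, hypothetical response body with two images returned out of order.
RESPONSE_BODY = """
[
  {"image_id": "__IMAGE_ID_2__", "order": 1, "text": null},
  {"image_id": "__IMAGE_ID_1__", "order": 0, "text": "front view"}
]
"""


def images_in_order(response_body):
    """Return (image_id, text) pairs sorted by the `order` field."""
    items = json.loads(response_body)
    ordered = sorted(items, key=lambda item: item["order"])
    return [(item["image_id"], item["text"]) for item in ordered]


print(images_in_order(RESPONSE_BODY))
```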
List Sample Detection Objects
List all detection objects attached to a sample with their metadata, bounding boxes, and label information.
Required attributes
- Name
Authorization- Type
- string
- Description
Unique API token for authentication.
- Name
sample_id- Type
- string
- Description
UUID of the sample.
Returns (each item)
- Name
id- Type
- string
- Description
UUID of the sample-object relation.
- Name
detection_object_id- Type
- string
- Description
UUID of the detection object.
- Name
image_id- Type
- string
- Description
UUID of the source image.
- Name
image_url- Type
- string
- Description
Full URL of the source image.
- Name
thumb_url- Type
- string
- Description
URL of the detection object thumbnail (cropped region).
- Name
label_name- Type
- string
- Description
Name of the detection label.
- Name
label_color- Type
- string
- Description
Color of the detection label.
- Name
bbox- Type
- array
- Description
Bounding box coordinates of the detection object.
- Name
text- Type
- string
- Description
Optional text associated with this object in the sample.
- Name
order- Type
- integer
- Description
Order of the object within the sample.
- Name
meta_data- Type
- object
- Description
Additional metadata for this sample object.
Request
curl -v -XGET \
-H 'Authorization: Token __API_TOKEN__' \
https://api.ximilar.com/vlm/v2/sample/__SAMPLE_ID__/sample-objects/
Response
[
{
"id": "e5f6a7b8-c9d0-1234-ef56-789012345678",
"detection_object_id": "f6a7b8c9-d0e1-2345-f678-901234567890",
"image_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"image_url": "https://images.ximilar.com/...",
"thumb_url": "https://images.ximilar.com/.../thumb",
"label_name": "car",
"label_color": "#FF0000",
"bbox": [100, 150, 400, 350],
"text": null,
"order": 0,
"meta_data": {},
"workspace": "748e50e4-d081-4924-b9e7-f500aac6a71d"
}
]
Set Sample as Test
Mark a sample as a test sample. Test samples are used for model evaluation, not training.
Required attributes
- Name
Authorization- Type
- string
- Description
Unique API token for authentication.
- Name
sample_id- Type
- string
- Description
UUID of the sample.
Request
curl -v -XPOST \
-H 'Authorization: Token __API_TOKEN__' \
https://api.ximilar.com/vlm/v2/sample/__SAMPLE_ID__/set-test/
Unset Sample as Test
Remove the test flag from a sample, making it a training sample.
Required attributes
- Name
Authorization- Type
- string
- Description
Unique API token for authentication.
- Name
sample_id- Type
- string
- Description
UUID of the sample.
Request
curl -v -XPOST \
-H 'Authorization: Token __API_TOKEN__' \
https://api.ximilar.com/vlm/v2/sample/__SAMPLE_ID__/set-untest/
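One way to use the two endpoints above is to derive a stable train/test split from the sample IDs. This is only a sketch: the 20% hold-out fraction and the hash-based assignment are assumptions made for illustration, not platform recommendations, and `flag_url_for` is a hypothetical helper that returns the URL to POST to.

```python
import hashlib

API_ROOT = "https://api.ximilar.com/vlm/v2"


def flag_url_for(sample_id, test_fraction=0.2):
    """Pick the set-test or set-untest URL for a sample, deterministically from its ID."""
    digest = hashlib.sha256(sample_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    action = "set-test" if bucket < test_fraction else "set-untest"
    return f"{API_ROOT}/sample/{sample_id}/{action}/"


print(flag_url_for("c3d4e5f6-a7b8-9012-cdef-345678901234"))
```

Because the assignment depends only on the ID, rerunning the split leaves every sample in the same group.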
Set Sample Type
Set the sample type which determines how images are used during training.
Required attributes
- Name
Authorization- Type
- string
- Description
Unique API token for authentication.
- Name
sample_id- Type
- string
- Description
UUID of the sample.
- Name
type- Type
- string
- Description
Sample type. Valid values:
- single: Take just one image from the list during training
- multi_random: Default, randomly pick all images during training
- multi_ordered: Use all images preserving order during training
Request
curl -v -XPATCH \
-H 'Authorization: Token __API_TOKEN__' \
-H 'Content-Type: application/json' \
-d '{"type": "multi_ordered"}' \
https://api.ximilar.com/vlm/v2/sample/__SAMPLE_ID__/
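To make the difference between the three sample types concrete, the sketch below mimics one plausible reading of them. It is an interpretation, not the platform's training code: it assumes `single` picks one image at random and `multi_random` uses all images in shuffled order. `select_images` is a hypothetical function.

```python
import random


def select_images(image_ids, sample_type, rng=None):
    """Return the images one training step would see under the given sample type."""
    rng = rng or random.Random()
    if sample_type == "single":
        # Assumption: the one image is chosen at random.
        return [rng.choice(image_ids)]
    if sample_type == "multi_random":
        # Assumption: all images are used, in random order.
        shuffled = list(image_ids)
        rng.shuffle(shuffled)
        return shuffled
    if sample_type == "multi_ordered":
        return list(image_ids)
    raise ValueError(f"unknown sample type: {sample_type}")


frames = ["frame-1", "frame-2", "frame-3"]
for sample_type in ("single", "multi_random", "multi_ordered"):
    print(sample_type, select_images(frames, sample_type, random.Random(0)))
```

Under this reading, `multi_ordered` suits video frames or before/after pairs, where position carries meaning.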
Add Input Metadata to Sample
Add input metadata to a sample. This metadata can be used in the user prompt during training and inference. The metadata is merged with any existing input metadata.
Required attributes
- Name
Authorization- Type
- string
- Description
Unique API token for authentication.
- Name
sample_id- Type
- string
- Description
UUID of the sample.
- Name
input_meta_data- Type
- object
- Description
JSON object containing metadata to add to the sample.
Request
curl -v -XPATCH \
-H 'Authorization: Token __API_TOKEN__' \
-H 'Content-Type: application/json' \
-d '{"input_meta_data": {"style": "elegant", "category": "jacket"}}' \
https://api.ximilar.com/vlm/v2/sample/__SAMPLE_ID__/
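The merge behaviour described above can be pictured as a dictionary update. The sketch assumes a shallow merge in which keys from the request overwrite existing keys of the same name; this reference does not say whether nested objects are merged recursively, and `merge_input_meta_data` is a hypothetical helper.

```python
def merge_input_meta_data(existing, update):
    """Return a new dict: existing metadata with the update applied on top."""
    merged = dict(existing)  # copy so the stored metadata is not mutated
    merged.update(update)
    return merged


stored = {"style": "casual", "season": "winter"}
patch = {"style": "elegant", "category": "jacket"}
print(merge_input_meta_data(stored, patch))
```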
Add Variable Value to Sample
Add or update a variable value for a sample. This is how you annotate your training data.
Required attributes
- Name
Authorization- Type
- string
- Description
Unique API token for authentication.
- Name
sample_id- Type
- string
- Description
UUID of the sample.
- Name
dataset_variable- Type
- string
- Description
UUID of the variable to set.
- Name
value- Type
- any
- Description
The value for this variable (type must match variable definition).
Request
curl -v -XPOST \
-H 'Authorization: Token __API_TOKEN__' \
-H 'Content-Type: application/json' \
-d '{
"dataset_variable": "__VARIABLE_ID__",
"value": {"value": 8.5}
}' \
https://api.ximilar.com/vlm/v2/sample/__SAMPLE_ID__/add-variable-value/
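The request body can be built from the variable name instead of a hard-coded UUID. The sketch below assumes the `id` values in `dataset_variables_config` (from the Get Sample response) are the UUIDs expected in `dataset_variable`, and it wraps the value in a `{"value": ...}` object exactly as the example request above does. `variable_value_body` is a hypothetical helper.

```python
import json

# Variable ids as they appear in the example Get Sample response.
VARIABLES_CONFIG = {
    "grade": {"id": "f82b01ed-e65d-4730-a458-2966cbf86994", "type": "float"},
    "explain": {"id": "7b661f0e-7c00-4b59-b334-aca2a0249e2f", "type": "string"},
}


def variable_value_body(config, name, value):
    """Build the JSON body for add-variable-value from a variable name."""
    variable_id = config[name]["id"]  # raises KeyError for unknown variables
    return json.dumps({"dataset_variable": variable_id, "value": {"value": value}})


print(variable_value_body(VARIABLES_CONFIG, "grade", 8.5))
```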
Using Different Workspace
When making an API request, the default workspace associated with the user's API token is used. To access data or upload to a different workspace, specify the workspace in the URL or JSON payload.
# Get all samples from a specific workspace
https://api.ximilar.com/vlm/v2/sample/?workspace=WORKSPACE_ID
# Create a sample in a specific workspace
curl -XPOST \
-H 'Authorization: Token __API_TOKEN__' \
-H 'Content-Type: application/json' \
-d '{
"dataset": "__DATASET_ID__",
"workspace": "WORKSPACE_ID"
}' \
https://api.ximilar.com/vlm/v2/sample/
from ximilar.client.vlm import VLMClient
# Initialize client with specific workspace
client = VLMClient(
token="__API_TOKEN__",
workspace="WORKSPACE_ID"
)
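If you call the REST endpoints directly rather than through the client, the workspace query parameter can be appended with the standard library. This is a small sketch; `sample_list_url` is a hypothetical helper, and only the `workspace` parameter shown above is assumed.

```python
from urllib.parse import urlencode

SAMPLE_LIST_URL = "https://api.ximilar.com/vlm/v2/sample/"


def sample_list_url(workspace_id=None):
    """Return the sample list URL, optionally scoped to a workspace."""
    if workspace_id is None:
        return SAMPLE_LIST_URL  # default workspace of the API token
    return f"{SAMPLE_LIST_URL}?{urlencode({'workspace': workspace_id})}"


print(sample_list_url("WORKSPACE_ID"))
```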