
Custom Object Detection


Be aware that this project is currently in BETA.

The Ximilar Custom Object Detection service provides a trainable API for detecting objects (bounding boxes) in images. It lets you bring state-of-the-art artificial intelligence into your project. We provide a user interface where you can set up your task, manage your account, upload images, train new models, and evaluate the results. After setup, you get results over the API and are ready to build this functionality into your application. It's easy, quick, and highly scalable.


The task is where you start. Each task has a set of detection labels and a detection model. Only you can access your tasks and other data.

A model is the machine learning model behind your image detection API. It is a neural network trained on your specific images and is therefore highly accurate at recognizing new images. Each model has an accuracy measured at the end of training. A model is private to its owner. Each retraining increases the model version by one, and you can select which version is deployed.

Label (category) is a feature you want to detect on your images.

Every detection (bounding box) on a training image is represented by an Object entity. Every object:

  • is located on a training image
  • has a detection Label
  • is represented by a bounding box with four coordinates [xmin, ymin, xmax, ymax] on the image
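The three properties of an Object listed above can be mirrored locally with a small data structure. The following is only a sketch: the class `DetectionObject` and its field names are hypothetical (they are not part of the Ximilar API), and the coordinates are assumed to be pixel values.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DetectionObject:
    """Hypothetical local mirror of an Object entity (not an API class)."""
    image_id: str                   # ID of the training image the object is on
    label_id: str                   # ID of the detection Label of the object
    box: Tuple[int, int, int, int]  # [xmin, ymin, xmax, ymax], assumed pixels

    def __post_init__(self):
        # Reject empty or inverted boxes early.
        xmin, ymin, xmax, ymax = self.box
        if xmin >= xmax or ymin >= ymax:
            raise ValueError("box must satisfy xmin < xmax and ymin < ymax")

    @property
    def width(self) -> int:
        return self.box[2] - self.box[0]

    @property
    def height(self) -> int:
        return self.box[3] - self.box[1]

    @property
    def area(self) -> int:
        return self.width * self.height
```

Validating the box when the object is created catches swapped coordinates before any annotation is uploaded.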

You must provide training images through the Recognition API.

Create a Detection task with Labels when:

  • You need to know the exact location of the object you want to detect

Some examples of suitable Detection tasks:

  • Detecting the exact position of a logo in an image
  • Detecting positions/bounding boxes of cars, traffic signs, persons, and faces
  • Detecting damage on the blades of wind power plants
  • Detecting and counting ships in a harbor from drone or satellite imagery

API reference

This is documentation of the Ximilar Custom Detection API. The API follows the general rules of Ximilar API as described in Section First steps.

Interactive API

You can find interactive API Reference after logging in at

The Custom Detection API is located at

Only the Image endpoint is located at

Each API entity (task, label, object) is identified by its ID, formatted as a universally unique identifier (UUID) string. You can copy IDs from browser URLs to quickly access entities programmatically.
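Because every entity ID is a UUID string, an ID copied from a browser URL can be checked locally before it is sent to the API. A minimal sketch using Python's standard `uuid` module follows; the helper name `is_valid_entity_id` is hypothetical, and the strict comparison against the canonical hyphenated form is an assumption about how IDs are written.

```python
import uuid


def is_valid_entity_id(value) -> bool:
    """Return True if value is a UUID string in canonical hyphenated form."""
    try:
        parsed = uuid.UUID(value)
    except (ValueError, AttributeError, TypeError):
        # Not a string, wrong length, or contains non-hexadecimal characters.
        return False
    # uuid.UUID also accepts braces and missing hyphens; require the
    # canonical 8-4-4-4-12 form here.
    return str(parsed) == value.lower()
```

For example, `is_valid_entity_id("b9124bed-5192-47b8-beb7-3eca7026fe14")` returns `True`, while a string containing a non-hexadecimal character is rejected.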


Detect endpoint - /v2/detect/

The Detect endpoint runs image detection and is the main endpoint of the entire detection system. It accepts the POST method, and you can find it in our interactive API reference. You can pass an image in the _url or _base64 field. The /v2/detect endpoint takes a JSON-formatted body in which you specify the records to process (up to 10), the ID of the task, and optionally the version of your model.

To sum up, /v2/detect:

  • can detect a batch of images (up to 10) at once
  • lets you optionally specify the version of the model to be used


  • records: A list of images to run detection on; each record
    • must contain either the _url or the _base64 field
  • task_id: UUID identification of your task.
  • version: Optional. Version of the model, default is the active/last version of your model.
  • descriptor: Optional, experimental. If set to "descriptor": 1, the result for each record also contains a vector (list of floats) of the visual descriptor, which you can use for similarity search.
curl -H "Content-Type: application/json" -H "authorization: Token __API_TOKEN__" -d '{"task_id": "0a8c8186-aee8-47c8-9eaf-348103xa214d", "version": 2, "descriptor": 0, "records": [ {"_url": "" } ] }'
import requests
import json
import base64

# Full URL of the /v2/detect/ endpoint
url = ''
headers = {
    'Authorization': "Token __API_TOKEN__",
    'Content-Type': 'application/json'
}

# Read the local image and encode its bytes as a base64 string
with open(__IMAGE_PATH__, "rb") as image_file:
    encoded_string = base64.b64encode(image_file.read()).decode('utf-8')

# One request may mix records given by URL and by base64 data
data = {
    'task_id': __TASK_ID__,
    'records': [ {'_url': __IMAGE_URL__ }, {"_base64": encoded_string } ]
}

response = requests.post(url, headers=headers, data=json.dumps(data))
if response.ok:
    print(json.dumps(response.json(), indent=2))
else:
    print('Error posting API: ' + response.text)
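Since a single request accepts at most 10 records, a larger set of images has to be split into several request bodies. The sketch below shows one way to do that; the function name `build_detect_payloads` is hypothetical, and it only builds the JSON bodies without sending anything.

```python
from typing import Dict, Iterator, List, Optional

MAX_RECORDS_PER_REQUEST = 10  # batch limit stated for /v2/detect


def build_detect_payloads(task_id: str,
                          records: List[Dict[str, str]],
                          version: Optional[int] = None) -> Iterator[dict]:
    """Yield one request body per batch of up to 10 records."""
    # Every record must carry the image as a URL or as base64 data.
    for record in records:
        if "_url" not in record and "_base64" not in record:
            raise ValueError("each record needs a _url or _base64 field")
    for start in range(0, len(records), MAX_RECORDS_PER_REQUEST):
        payload = {
            "task_id": task_id,
            "records": records[start:start + MAX_RECORDS_PER_REQUEST],
        }
        # Omit the version to use the active/last version of the model.
        if version is not None:
            payload["version"] = version
        yield payload
```

Each yielded dictionary can be serialized with `json.dumps` and posted in the same way as the single request above.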

The result has a JSON structure similar to the following:

{
  "task_id": "__TASK_ID__",
  "records": [
    {
      "_url": "__SOME_URL__",
      "_status": {
        "code": 200,
        "text": "OK"
      },
      "_width": 2736,
      "_height": 3648,
      "_objects": [
        {
          "name": "Person",
          "id": "b9124bed-5192-47b8-beb7-3eca7026fe14",
          "bound_box": [xmin, ymin, xmax, ymax],
          "prob": 0.9890862107276917
        },
        {
          "name": "Car",
          "id": "b9134c4d-5062-47b8-bcb7-3eca7226fa14",
          "bound_box": [xmin, ymin, xmax, ymax],
          "prob": 0.9890862107276917
        }
      ]
    }
  ]
}
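A typical next step is to keep only confident detections and count them per label. The sketch below works on one parsed record of the result; the helper names and the default threshold of 0.5 are assumptions chosen for illustration, not values recommended by the API.

```python
from collections import Counter
from typing import Dict, List


def confident_objects(record: dict, min_prob: float = 0.5) -> List[dict]:
    """Return the detected objects of one record with prob >= min_prob."""
    status = record.get("_status", {})
    if status.get("code") != 200:
        raise RuntimeError("record failed: %s" % status.get("text"))
    return [obj for obj in record.get("_objects", [])
            if obj["prob"] >= min_prob]


def count_by_label(objects: List[dict]) -> Dict[str, int]:
    """Count detected objects by their label name."""
    return dict(Counter(obj["name"] for obj in objects))
```

Checking `_status` per record matters because each record in a batch carries its own status.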

Task endpoint - /v2/task/

Task endpoints let you manage the tasks in your account. You can list all tasks and create, delete, and modify them. Until the first training of a task finishes successfully, the production version of the task is -1 and the task cannot be used for detection.

List tasks (returns paginated result):

curl -v -XGET -H 'Authorization: Token __API_TOKEN__'
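Because the listing is paginated, a client usually has to follow pages until none are left. The sketch below assumes that each page is a JSON object with a `results` list and a `next` URL that is empty on the last page; these field names are an assumption and are not confirmed by this section. The `fetch_page` argument is a hypothetical callable that performs the authorized GET request and returns the parsed JSON.

```python
from typing import Callable, Iterator, Optional


def iterate_pages(first_url: str,
                  fetch_page: Callable[[str], dict]) -> Iterator[dict]:
    """Yield every item from a paginated listing, page by page."""
    url: Optional[str] = first_url
    while url:
        page = fetch_page(url)
        # "results" and "next" are assumed field names.
        for item in page.get("results", []):
            yield item
        url = page.get("next")
```

Passing the fetch function in keeps the loop independent of the HTTP library and easy to exercise with canned pages.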

Delete task:

curl -v -XDELETE -H 'Authorization: Token __API_TOKEN__'

Label endpoint - /v2/label/

Label endpoints let you manage labels (categories) in your tasks. You manage your labels independently (list them, create, delete, and modify) and then you connect them to your tasks. Each task requires at least two labels for training. Each label must contain at least 20 images.
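The two training requirements stated above (at least two labels per task, at least 20 images per label) can be checked locally before training is started. A minimal sketch follows; the function name and the input mapping of label name to image count are hypothetical, and the counts would have to come from your own bookkeeping or from the API.

```python
from typing import Dict, List

MIN_LABELS = 2             # each task requires at least two labels
MIN_IMAGES_PER_LABEL = 20  # each label must contain at least 20 images


def training_problems(images_per_label: Dict[str, int]) -> List[str]:
    """Return human-readable reasons why a task is not ready to train."""
    problems = []
    if len(images_per_label) < MIN_LABELS:
        problems.append("task needs at least %d labels, has %d"
                        % (MIN_LABELS, len(images_per_label)))
    for name, count in sorted(images_per_label.items()):
        if count < MIN_IMAGES_PER_LABEL:
            problems.append("label %r has %d images, needs at least %d"
                            % (name, count, MIN_IMAGES_PER_LABEL))
    return problems
```

An empty list means both requirements are met.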

List all your labels (returns paginated result):

curl -v -XGET -H 'Authorization: Token __API_TOKEN__'

List all labels of the task:

curl -v -XGET -H 'Authorization: Token __API_TOKEN__'

Create new label (category):

curl -v -XPOST -H 'Authorization: Token __API_TOKEN__' -F 'name=New label'

Create new label (tag):

curl -v -XPOST -H 'Authorization: Token __API_TOKEN__' -F 'name=New label'  -F 'type=tag'

Connect a label to your task (category to Categorization task and tag to Tagging task):

curl -v -XPOST -H 'Authorization: Token __API_TOKEN__' -F 'label_id=__LABEL_ID__'

Remove a label from your task:

curl -v -XPOST -H 'Authorization: Token __API_TOKEN__' -F 'label_id=__LABEL_ID__'

Training endpoint - /v2/task/TASK_ID/train/

Use the training endpoint to start training a model. Training takes from a few minutes up to a few hours, depending on the number of images in your training collection. You are notified by email when the training starts and when it finishes.

Start training:

curl -v -XPOST -H 'Authorization: Token __API_TOKEN__'