Photo & Product Similarity

This page describes the API for similarity search over generic photos and product photos. The API follows the general rules of the Ximilar API as described in Section First steps.

The API is a set of HTTP REST services accepting JSON-formatted documents using POST and returning JSON documents. These services run at these URLs:

https://api.ximilar.com/similarity/photos/v2/<method>
https://api.ximilar.com/similarity/products/v2/<method>
https://api.ximilar.com/<your_private_cloud_service>/v2/<method>

Contact us before using this service

To get access to the Photo & Product Similarity services, please register at https://app.ximilar.com and then contact us at tech@ximilar.com so that we can make the service accessible for your Ximilar account.

Description of the Service

Using this service, Ximilar can take your collection of images (general photos or product/fashion photos) and quickly find images that are visually similar to a given image (the query image). You first use the API to upload the images into the Ximilar cloud and then you call the search API methods. The query image can come from the collection or from outside it. You can also store any additional fields in the index and use this metadata to filter the search results.

The system works like this:

  • every image is preprocessed by a special neural network that extracts its visual descriptor; this descriptor is typically a vector of numbers with no meaningful interpretation, but visually (and semantically) similar images have similar descriptors;
  • this visual similarity is tuned either for generic photos (service /similarity/photos) or for product (especially fashion) photos (service /similarity/products);
  • the image records inserted into the system are organized by a special indexing structure that can quickly search millions of image records using the descriptor similarity.
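
To illustrate the idea of descriptor-based search, here is a minimal sketch in plain Python. The toy descriptors, the Euclidean distance, and the brute-force scan are assumptions made for this example only; the service uses its own neural network descriptors and an indexing structure, neither of which is shown here.

```python
import math

def distance(a, b):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn(query, collection, k):
    """Return the k records closest to the query as (distance, _id) pairs."""
    ranked = sorted((distance(query, desc), _id) for _id, desc in collection.items())
    return ranked[:k]

# Toy 3-dimensional descriptors (made up for this example).
collection = {
    "a": [0.0, 0.0, 0.0],
    "b": [1.0, 0.0, 0.0],
    "c": [5.0, 5.0, 5.0],
}
nearest = knn([0.9, 0.0, 0.0], collection, k=2)
nearest_ids = [_id for _, _id in nearest]
```

A smaller distance means a more similar image, which is also how the answer_distances field of the search answers is ordered in the examples on this page.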

Overview of API Methods

You always work with your "collection" of images, which is specified in the collection-id header of each API call. Your collection(s) must first be created by Ximilar administrators.

Deprecated

Instead of the ID of the collection specified in the collection-id header, you may also directly use the name of the collection in the collection header. This usage assumes that you have only one collection with this name. Usage of the collection header is deprecated and might be removed in the future.

All API methods use POST, require a JSON record (JSON map) in the body of the request, and return the answer as another JSON record. Here is an example of communication with the API:

$ curl https://api.ximilar.com/similarity/photos/v2/visualKNN \
  -H 'Authorization: Token 1af538baa90-----XXX-----baf83ff24' \
  -H 'collection-id: 85-XXX-0-aXX8-4d74-8885-744--XXX--f1' \
  -H 'Content-Type: application/json;charset=UTF-8' \
  -d '{
    "query_record": { "_id": "11959" },
    "k": 3,
    "filter": { "artist": { "$ne": "david_27" } }
}' | json_pp

and an example response:

{
    "status":{
        "text":"OK",
        "code":200
    },
    "answer_count":3,
    "answer_records":[
        { "_id":"11959" },
        { "_id":"5950"  },
        { "_id":"28043" }
    ],
    "answer_distances":[
        0.0,
        45.725513,
        47.01006
    ],
    "statistics":{
        "OperationTime":235
    }
}
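
The same call can be prepared from Python using only the standard library. This is a minimal sketch: the helper name build_request and the placeholder token and collection ID are assumptions for illustration, and the request is only constructed here, not sent.

```python
import json
import urllib.request

API_BASE = "https://api.ximilar.com/similarity/photos/v2"

def build_request(method, token, collection_id, payload):
    """Prepare (but do not send) a POST request for one API method."""
    return urllib.request.Request(
        url=f"{API_BASE}/{method}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Token {token}",
            "collection-id": collection_id,
            "Content-Type": "application/json;charset=UTF-8",
        },
        method="POST",
    )

request = build_request(
    "visualKNN",
    token="__API_TOKEN__",
    collection_id="__COLLECTION_ID__",
    payload={
        "query_record": {"_id": "11959"},
        "k": 3,
        "filter": {"artist": {"$ne": "david_27"}},
    },
)
# To actually send it:
#   with urllib.request.urlopen(request) as response:
#       answer = json.load(response)
```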

The following methods are currently available (each one is described in detail later on this page):

  • /v2/ping -- get basic info about the running index

  • /v2/insert -- insert a new batch of records into the index

  • /v2/delete -- delete records identified by their IDs from the index
  • /v2/get -- get a full record of a list of records identified by their IDs
  • /v2/find -- find and return records that satisfy the given filter
  • /v2/update -- update attributes (additional fields) of records identified by their IDs
  • /v2/nearDupsInsert -- insert records checking that there are no near duplicate images there yet

  • /v2/visualKNN -- find visually similar images from the collection to a given image

  • /v2/visualTagsKNN -- find images that are similar by combination of visual and tags similarity
  • /v2/visualKNNMulti -- find visually similar images to a given list of images
  • /v2/visualTagsKNNMulti -- find images that are similar to a given list of images by combination of visual and tags similarity
  • /v2/random -- returns a given number of random records stored in the collection
  • /v2/getRecordCount -- gets the number of records stored in the index collection
  • /v2/allRecords -- gets all records stored in the collection (or just their IDs)
  • /v2/deleteByFilter -- deletes all records matching given condition. This method is not available in all indexes.

  • /v2/range -- find visually similar images to a given image up to a certain query radius.

  • /v2/nearDuplicates -- finds images that are the same or very similar to the query image
  • /v2/allNearDupPairs -- finds all pairs of images that are the same or mutually very similar

Parameters of API methods

The Ximilar Search API works with data records, each of which represents a single image or video. A data record has the same format in all operations and also in the responses. It is a JSON record (map) with the following fields:

  • _id -- unique record identifier (string); it should be unique within the collection
  • _file -- name of a PNG, JPG, or TIFF file on a local storage
  • _url -- URL with a PNG, JPG, or TIFF image file
  • _base64 -- base64-encoded content of a PNG, JPG or TIFF image file
  • tags -- a JSON array of string tags (keywords) corresponding to the image
  • attribute -- a JSON representation of any attribute of the record; it can be used later for filtering.

Example of a data record:

{
    "_id":"321",
    "_url":"https://yourdomain.com/images/product_image_321.jpg",
    "category": "apparel",
    "tags":[
        "dresses",
        "short",
        "polka dot"
    ],
    "seller":"shop_543",
    "price":35.5,
    "active":true
}

The following parameters are used in the API methods. Each method has its own selection of these parameters (see below), but the meaning of the parameters is the same for all methods.

  • query_record -- one data record, examples: {"query_record": {"_id": "54321" } } or {"query_record": {"_url": "http://example.com/myqueryimage.png" } }
  • k -- a number (integer) of similar records to be returned. Default value: 20, example of a query: {"query_record": {"_id": "321"}, "k": 30 }
  • from -- an integer saying how many records from the beginning of the answer should be skipped (typically because they were already returned). Default value: 0, example: {"query_record": {"_id": "321"}, "k": 30, "from": 60 }
  • records -- a JSON array of "data records", example: {"records": [ {"_id":"321", "_file": "/local/storage/uri/image321.jpg"}, {"_id":"322", "_file": "/local/storage/uri/image322.jpg"} ] }
  • fields_to_return -- list of data record fields to be returned by the operation, example: { "fields_to_return": [ "_id", "tags", "_file" ] }. By default, search operations return only data record IDs: { "fields_to_return": ["_id"]}
  • filter -- a condition that search operations check for every record in the answer. It uses the format of the MongoDB .find() command (see documentation), but we require attributes and operators to be in double quotes (to make it valid JSON). We support these operators and their combinations:
    • SIMPLE CONDITION:
      • "filter": { "attribute": "value" }
      • returns only data records with the value "value" in the given attribute. If the attribute contains an array of values, "value" must be one of them.
    • SIMPLE CONDITION: $eq, $ne
      • "filter": { "int_attribute": { "$eq": int_value } }
      • "filter": { "attribute": { "$ne": "value" } }
      • returns just data records with ($eq) or without ($ne) given value in given attribute
    • NEGATION: $not
      • "filter": { "$not": { "attribute": "value" } }
      • operator $not negates the condition
    • CONJUNCTION:
      • "filter": { "attribute": "value", "bool_att": false }
      • a comma between conditions means conjunction (AND)
    • DISJUNCTION and CONJUNCTION: $or, $and
      • "filter": { "$or" : [ { "att": "value" }, { "int_att": value } ] }
      • "filter": { "$and": [ { "att": "value" }, { "int_att": value } ] }
      • disjunction combination of conditions (OR) and another way to write conjunction (AND)
    • INTERVAL condition: $lt, $lte, $gt, $gte
      • "filter": { "int_att": { "$lt": 30 } }
      • returns only records whose value in int_att is less than / greater than (or equal to) the given value
    • SET OPERATORS: $in, $nin
      • "filter": { "attribute": { "$in": [ "value1", "value2" ] } }
      • attribute must ($in) or mustn't ($nin) contain one of the values in the list
  • radius -- maximal visual distance between the query record and a returned record. Do not use this parameter if you are not sure.
  • approx_param -- internal parameter (please, do not use) that can influence the response time of the operation (the lower the faster, but potentially less precise). Every collection has its own default.
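
To make the filter semantics listed above concrete, here is a small local evaluator written in plain Python. It is only a sketch of the rules as described on this page, not the service's implementation; the function names are hypothetical, and the handling of records that lack the filtered attribute is an assumption.

```python
import operator

# Comparison operators from the INTERVAL group, mapped to Python functions.
COMPARE = {"$lt": operator.lt, "$lte": operator.le,
           "$gt": operator.gt, "$gte": operator.ge}

def match_attribute(value, cond):
    """Evaluate one attribute condition: a plain value or a map of operators."""
    values = value if isinstance(value, list) else [value]
    if not isinstance(cond, dict):
        return cond in values  # simple condition; arrays match if they contain it
    for op, arg in cond.items():
        if op == "$eq":
            ok = arg in values
        elif op == "$ne":
            ok = arg not in values
        elif op == "$in":
            ok = any(v in arg for v in values)
        elif op == "$nin":
            ok = all(v not in arg for v in values)
        elif op in COMPARE:
            # Assumption: a missing attribute never satisfies a comparison.
            ok = value is not None and COMPARE[op](value, arg)
        else:
            raise ValueError(f"unsupported operator: {op}")
        if not ok:
            return False
    return True

def matches(record, flt):
    """Return True if the record satisfies every condition in the filter map."""
    for key, cond in flt.items():
        if key == "$and":
            ok = all(matches(record, sub) for sub in cond)
        elif key == "$or":
            ok = any(matches(record, sub) for sub in cond)
        elif key == "$not":
            ok = not matches(record, cond)
        else:
            ok = match_attribute(record.get(key), cond)
        if not ok:
            return False
    return True

# The example data record from this page (without the image reference).
record = {"_id": "321", "category": "apparel",
          "tags": ["dresses", "short", "polka dot"],
          "seller": "shop_543", "price": 35.5, "active": True}

cheap_apparel = matches(record, {"category": "apparel", "price": {"$lt": 40}})
other_seller = matches(record, {"$not": {"seller": "shop_543"}})
```

Here cheap_apparel is True because both conditions hold (a comma means AND), while other_seller is False because $not negates a condition that the record satisfies.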

Return Values

All methods return (see example in section Overview of API Methods):

  • HTTP status code 2XX if the method succeeded, and another HTTP status code if the method failed
  • JSON-formatted body with the status, answer and statistics

Answer fields common for all types of answers:

  • status -- a JSON map with the status of the method processing. It contains these subfields:
    • code -- a numeric code of the operation status; it follows the concept of HTTP status codes (2XX, 4XX). Specific codes are described for each type of answer (or operation) (see below).
    • text -- a text describing the status code
    • error_description -- if the processing ended with an error (codes 4XX), this field contains a detailed description of the error; this might include Java stack traces.
  • statistics -- a map of various statistics about the processing. The only statistic included every time is:
    • OperationTime -- time of actual processing of the query (in milliseconds)

Generic statuses that can be returned by any operation:

  • "status": {"code": 200, "text": "OK"}
  • "status": {"code": 402, "text": "aborted by error", "error_description": "..."}
  • "status": {"code": 500, "text": "unknown error", "error_description": "..."}

Additional fields of all listing answers (e.g. answers to "random" or "list all records" operations):

  • answer_count -- number of data records in the answer
  • answer_records -- a JSON array of data records that form the result of the operation. The data records contain only the fields specified by the query parameter fields_to_return. See the example in section Overview of API Methods.

The ranked answers (answers to all similarity search queries that have a query record, see below) have all fields from listing answer and additionally:

  • answer_distances -- a JSON array of floats with distances (dissimilarities) between the query record and records in the answer_records array
  • The ranked answers can also have the following status:
    • "status": {"code": 403, "text": "wrong query record"} -- the query record _id was not found in the collection AND the query record does not have a valid _file or _url
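
A ranked answer can be turned into (ID, distance) pairs as sketched below. The helper name ranked_results is hypothetical, and treating every non-2XX status code as an error is a simplifying assumption based on the status description above.

```python
def ranked_results(answer):
    """Pair each returned record _id with its distance from the query record."""
    status = answer["status"]
    if status["code"] // 100 != 2:
        raise ValueError(f'{status["code"]}: {status["text"]}')
    ids = [record["_id"] for record in answer["answer_records"]]
    return list(zip(ids, answer["answer_distances"]))

# The example response from section Overview of API Methods.
answer = {
    "status": {"text": "OK", "code": 200},
    "answer_count": 3,
    "answer_records": [{"_id": "11959"}, {"_id": "5950"}, {"_id": "28043"}],
    "answer_distances": [0.0, 45.725513, 47.01006],
    "statistics": {"OperationTime": 235},
}
results = ranked_results(answer)
```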

The data processing answers (answers to all operations that send a list of records and require some processing of these records, like CRUD operations) have all fields from the listing answer and additionally:

  • skipped_records -- a JSON array of data records that were skipped from the processing; each of these data records has field _id and field _reason with an "answer JSON map" that explains why this record was skipped. This reason answer always has at least the status field (see individual methods for details).
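
The structure of a data processing answer can be inspected as in the following sketch, which collects the IDs that were processed and the reason each other record was skipped. The helper name split_processing_answer is hypothetical; the field names come from the description above.

```python
def split_processing_answer(answer):
    """Return (processed_ids, skipped) where skipped maps _id -> (code, text)."""
    processed = [record["_id"] for record in answer.get("answer_records", [])]
    skipped = {}
    for record in answer.get("skipped_records", []):
        status = record["_reason"]["status"]
        skipped[record["_id"]] = (status["code"], status["text"])
    return processed, skipped

# A shortened data processing answer in the format described above.
answer = {
    "status": {"code": 213, "text": "some of the records not found"},
    "answer_records": [{"_id": "1"}, {"_id": "2"}],
    "skipped_records": [
        {"_id": "3", "_reason": {"status": {"code": 404, "text": "records not found"}}}
    ],
    "answer_count": 2,
}
processed, skipped = split_processing_answer(answer)
```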

Detailed Descriptions of API Methods

/v2/ping

Description: returns basic information about the index

Example:

$ curl 'https://api.ximilar.com/similarity/photos/v2/ping' -i -X GET
$ http GET 'https://api.ximilar.com/similarity/photos/v2/ping'

Returns:

{
  "status" : {
    "code" : 200,
    "text" : "OK"
  },
  "statistics" : {
    "OperationTime" : 1
  },
  "info" : "Java index for clothing objects in Smart Product Search"
}

/v2/insert

Description: inserts given list of records (images + metadata) into the index

Example:

$ curl 'https://api.ximilar.com/similarity/photos/v2/insert' -i -X POST \
    -H 'Content-Type: application/json;charset=UTF-8' \
    -H 'collection-id: mycoll_id' \
    -H 'Authorization: Token 1af538baa90-----XXX-----baf83ff24' \
    -d '{
  "fields_to_return" : [ "_id" ],
  "records" : [ {
    "_id" : "1",
    "_url" : "http://mydomain.com/img.png"
  }, {
    "_base64" : "data:image/png;base64,ABC...",
    "_id" : "2"
  }, {
    "_id" : "3",
    "_url" : "http://mydomain.com/another_image.jpg"
  } ]
}'

Python client:

from ximilar.client.search import SimilarityPhotosClient, SimilarityProductsClient

# create one of the two clients, depending on the service you use
client = SimilarityPhotosClient(token='__API_TOKEN__', collection='__COLLECTION_ID__')
client = SimilarityProductsClient(token='__API_TOKEN__', collection='__COLLECTION_ID__')

# insert an item into the index with your _id, one of _url | _base64, and other fields (meta-info)
# which you can then use when applying a filter in search or random methods
result = client.insert([{'_id': '__ITEM_ID__', '_url': '__URL_PATH_TO_IMAGE__',
                         'meta-category-x': '__CATEGORY_OF_ITEM__',
                         'meta-info-y': '__ANOTHER_META_INFO__'}])

Request body:

{
  "fields_to_return" : [ "_id" ],
  "records" : [ {
    "_id" : "1",
    "_url" : "http://mydomain.com/img.png"
  }, {
    "_base64" : "data:image/png;base64,ABC...",
    "_id" : "2"
  }, {
    "_id" : "3",
    "_url" : "http://mydomain.com/another_image.jpg"
  } ]
}

HTTPie:

$ echo '{
  "fields_to_return" : [ "_id" ],
  "records" : [ {
    "_id" : "1",
    "_url" : "http://mydomain.com/img.png"
  }, {
    "_base64" : "data:image/png;base64,ABC...",
    "_id" : "2"
  }, {
    "_id" : "3",
    "_url" : "http://mydomain.com/another_image.jpg"
  } ]
}' | http POST 'https://api.ximilar.com/similarity/photos/v2/insert' \
    'Content-Type:application/json;charset=UTF-8' \
    'collection-id:mycoll_id' \
    'Authorization:Token 1af538baa90-----XXX-----baf83ff24'

Request description:

Path | Type | Required | Description
records | Array | Yes | Records to be inserted, ...
fields_to_return | Array | No | List of strings, these fields are returned, defaults to ["_id"]
records[]._url | String | No | Image specified by URL
records[]._base64 | String | No | Image encoded as base64

Example response:

{
  "status" : {
    "code" : 211,
    "text" : "record(s) duplicate"
  },
  "statistics" : {
    "OperationTime" : 2
  },
  "answer_records" : [ {
    "_id" : "1"
  }, {
    "_id" : "2"
  } ],
  "skipped_records" : [ {
    "_id" : "3",
    "_reason" : {
      "status" : {
        "code" : 211,
        "text" : "record(s) duplicate"
      }
    }
  } ],
  "answer_count" : 2
}

Response description:

Path | Type | Description
status | Object | Status description
skipped_records | Array | Records that were skipped, usually because of a duplicate _id
answer_records | Array | Successfully inserted records

Possible status values:

  • "status": {"code": 210, "text": "records inserted"} -- all records inserted
  • "status": {"code": 211, "text": "some records inserted"} -- some of the records were refused, typically because a record with the same _id was already inserted. Answer field answer_records contains the list of records actually inserted and skipped_records contains the records not inserted.
  • "status": {"code": 411, "text": "record duplicate"} -- all records refused because of _id duplicity
  • "status": {"code": 412, "text": "hard capacity exceeded"} -- records refused because of storage capacity exceeded

/v2/delete

Description: deletes given list of records (identified by _id) from the index

Example:

$ curl 'https://api.ximilar.com/similarity/photos/v2/delete' -i -X POST \
    -H 'Content-Type: application/json;charset=UTF-8' \
    -H 'collection-id: mycoll_id' \
    -H 'Authorization: Token 1af538baa90-----XXX-----baf83ff24' \
    -d '{
  "records" : [ {
    "_id" : "1"
  }, {
    "_id" : "2"
  }, {
    "_id" : "3"
  } ]
}'

Python client:

from ximilar.client.search import SimilarityPhotosClient, SimilarityProductsClient

# create one of the two clients, depending on the service you use
client = SimilarityPhotosClient(token='__API_TOKEN__', collection='__COLLECTION_ID__')
client = SimilarityProductsClient(token='__API_TOKEN__', collection='__COLLECTION_ID__')

# delete an item by its _id
result = client.remove([{'_id': '__ITEM_ID__'}])

Request body:

{
  "records" : [ {
    "_id" : "1"
  }, {
    "_id" : "2"
  }, {
    "_id" : "3"
  } ]
}

HTTPie:

$ echo '{
  "records" : [ {
    "_id" : "1"
  }, {
    "_id" : "2"
  }, {
    "_id" : "3"
  } ]
}' | http POST 'https://api.ximilar.com/similarity/photos/v2/delete' \
    'Content-Type:application/json;charset=UTF-8' \
    'collection-id:mycoll_id' \
    'Authorization:Token 1af538baa90-----XXX-----baf83ff24'

Request description:

Path | Type | Required | Description
records | Array | Yes | Records to be deleted by their _id

Example response:

{
  "status" : {
    "code" : 213,
    "text" : "some of the records not found"
  },
  "statistics" : {
    "OperationTime" : 3
  },
  "answer_records" : [ {
    "_id" : "1"
  }, {
    "_id" : "2"
  } ],
  "skipped_records" : [ {
    "_id" : "3",
    "_reason" : {
      "status" : {
        "code" : 404,
        "text" : "records not found"
      }
    }
  } ],
  "answer_count" : 2
}

Response description:

Path | Type | Description
answer_records | Array | Successfully deleted records
skipped_records | Array | Records that couldn't be deleted
answer_count | Number | Number of deleted records

Possible status values:

  • "status": {"code": 220, "text": "records deleted"} -- all records deleted
  • "status": {"code": 206, "text": "some of the records not found"} -- some of the records were not deleted. Answer field answer_records contains the list of records actually deleted and skipped_records contains the list of records not found.
  • "status": {"code": 404, "text": "records not found"} -- none of the requested records were found and deleted

/v2/get

Description: finds and returns given list of records (identified by _id) from the index

Example:

$ curl 'https://api.ximilar.com/similarity/photos/v2/get' -i -X POST \
    -H 'Content-Type: application/json;charset=UTF-8' \
    -H 'collection-id: mycoll_id' \
    -H 'Authorization: Token 1af538baa90-----XXX-----baf83ff24' \
    -d '{
  "fields_to_return" : [ "*" ],
  "records" : [ {
    "_id" : "1"
  }, {
    "_id" : "2"
  }, {
    "_id" : "3"
  } ]
}'

Python client:

from ximilar.client.search import SimilarityPhotosClient, SimilarityProductsClient

# create one of the two clients, depending on the service you use
client = SimilarityPhotosClient(token='__API_TOKEN__', collection='__COLLECTION_ID__')
client = SimilarityProductsClient(token='__API_TOKEN__', collection='__COLLECTION_ID__')

# get a list of items from the index
result = client.get([{'_id': '__ITEM_ID__'}, {'_id': '__ITEM_ID__'}])

Request body:

{
  "fields_to_return" : [ "*" ],
  "records" : [ {
    "_id" : "1"
  }, {
    "_id" : "2"
  }, {
    "_id" : "3"
  } ]
}

HTTPie:

$ echo '{
  "fields_to_return" : [ "*" ],
  "records" : [ {
    "_id" : "1"
  }, {
    "_id" : "2"
  }, {
    "_id" : "3"
  } ]
}' | http POST 'https://api.ximilar.com/similarity/photos/v2/get' \
    'Content-Type:application/json;charset=UTF-8' \
    'collection-id:mycoll_id' \
    'Authorization:Token 1af538baa90-----XXX-----baf83ff24'

Request description:

Path | Type | Required | Description
records | Array | Yes | Records to be returned, identified by their _id
fields_to_return | Array | No | Fields to be returned in every record, defaults to ["*"]

Example response:

{
  "status" : {
    "code" : 213,
    "text" : "some of the records not found"
  },
  "statistics" : {
    "OperationTime" : 9
  },
  "answer_records" : [ {
    "_url" : "http://mydomain.com/img.png",
    "_id" : "1"
  }, {
    "_file" : "/path/to/img.png",
    "_id" : "2"
  } ],
  "skipped_records" : [ {
    "_id" : "3",
    "_reason" : {
      "status" : {
        "code" : 404,
        "text" : "records not found"
      }
    }
  } ],
  "answer_count" : 2
}

Response description:

Path | Type | Description
answer_records | Array | Records found
skipped_records | Array | Records that couldn't be returned
answer_count | Number | Number of records found

Possible status values:

  • "status": {"code": 205, "text": "records found"} -- all records found and returned
  • "status": {"code": 206, "text": "some of the records not found"} -- some of the records are returned, the rest were not found
  • "status": {"code": 404, "text": "records not found"} -- none of the requested records were found and empty answer is returned

/v2/find

Description: finds and returns records that satisfy the given filter

Example:

$ curl 'https://api.ximilar.com/similarity/photos/v2/find' -i -X POST \
    -H 'Content-Type: application/json;charset=UTF-8' \
    -H 'collection-id: mycoll_id' \
    -H 'Authorization: Token 1af538baa90-----XXX-----baf83ff24' \
    -d '{
  "fields_to_return" : [ "_id", "_url" ],
  "filter" : {
    "price" : {
      "$gte" : 200
    }
  },
  "limit" : 3
}'

Request body:

{
  "fields_to_return" : [ "_id", "_url" ],
  "filter" : {
    "price" : {
      "$gte" : 200
    }
  },
  "limit" : 3
}

HTTPie:

$ echo '{
  "fields_to_return" : [ "_id", "_url" ],
  "filter" : {
    "price" : {
      "$gte" : 200
    }
  },
  "limit" : 3
}' | http POST 'https://api.ximilar.com/similarity/photos/v2/find' \
    'Content-Type:application/json;charset=UTF-8' \
    'collection-id:mycoll_id' \
    'Authorization:Token 1af538baa90-----XXX-----baf83ff24'

Request description:

Path | Type | Required | Description
filter | Object | Yes | Search will be applied only to records satisfying this filter
limit | Number | No | Number of records to be returned, default: 10
fields_to_return | Array | No | Fields to be returned in every record, defaults to ["*"]

Example response:

{
  "status" : {
    "code" : 200,
    "text" : "OK"
  },
  "statistics" : {
    "OperationTime" : 9
  },
  "answer_records" : [ {
    "_url" : "http://my-website.com/1.png",
    "_id" : "1"
  }, {
    "_url" : "http://my-website.com/2.png",
    "_id" : "2"
  }, {
    "_url" : "http://my-website.com/3.png",
    "_id" : "3"
  } ],
  "answer_count" : 3
}

Response description:

Path | Type | Description
answer_records | Array | Records found
answer_count | Number | Number of returned records

/v2/update

Description: updates attributes of the given list of records (identified by _id) stored in the index. This method can only update additional attributes used for filtering (including tags), but NOT the image (use delete and re-insert if you want to change the image).

Example:

$ curl 'https://api.ximilar.com/similarity/photos/v2/update' -i -X POST \
    -H 'Content-Type: application/json;charset=UTF-8' \
    -H 'collection-id: mycoll_id' \
    -H 'Authorization: Token 1af538baa90-----XXX-----baf83ff24' \
    -d '{
  "fields_to_return" : [ "*" ],
  "records" : [ {
    "_id" : "1",
    "day" : "monday"
  }, {
    "_id" : "2",
    "day" : "tuesday"
  }, {
    "_id" : "3",
    "day" : "wednesday"
  } ]
}'

Python client:

from ximilar.client.search import SimilarityPhotosClient, SimilarityProductsClient

# create one of the two clients, depending on the service you use
client = SimilarityPhotosClient(token='__API_TOKEN__', collection='__COLLECTION_ID__')
client = SimilarityProductsClient(token='__API_TOKEN__', collection='__COLLECTION_ID__')

# update an item in the index with all additional fields and meta-info
result = client.update([{'_id': '__ITEM_ID__', 'some-additional-field': '__VALUE__'}])

Request body:

{
  "fields_to_return" : [ "*" ],
  "records" : [ {
    "_id" : "1",
    "day" : "monday"
  }, {
    "_id" : "2",
    "day" : "tuesday"
  }, {
    "_id" : "3",
    "day" : "wednesday"
  } ]
}

HTTPie:

$ echo '{
  "fields_to_return" : [ "*" ],
  "records" : [ {
    "_id" : "1",
    "day" : "monday"
  }, {
    "_id" : "2",
    "day" : "tuesday"
  }, {
    "_id" : "3",
    "day" : "wednesday"
  } ]
}' | http POST 'https://api.ximilar.com/similarity/photos/v2/update' \
    'Content-Type:application/json;charset=UTF-8' \
    'collection-id:mycoll_id' \
    'Authorization:Token 1af538baa90-----XXX-----baf83ff24'

Request description:

Path | Type | Required | Description
records | Array | Yes | Records to be updated, identified by their _id
fields_to_return | Array | No | Fields to be returned in every record, defaults to ["*"]

Example response:

{
  "status" : {
    "code" : 213,
    "text" : "some of the records not found"
  },
  "statistics" : {
    "OperationTime" : 1
  },
  "answer_records" : [ {
    "another-field" : "another-value",
    "_id" : "1",
    "day" : "monday"
  }, {
    "day" : "tuesday",
    "_id" : "2"
  } ],
  "skipped_records" : [ {
    "_id" : "3",
    "_reason" : {
      "status" : {
        "code" : 404,
        "text" : "records not found"
      }
    }
  } ],
  "answer_count" : 2
}

Response description:

Path | Type | Description
answer_records | Array | Updated records
skipped_records | Array | Records that couldn't be updated
answer_count | Number | Number of records updated

Possible status values (the same as /v2/get operation):

  • "status": {"code": 205, "text": "records found"} -- all records found and returned
  • "status": {"code": 206, "text": "some of the records not found"} -- some of the records are returned, the rest were not found
  • "status": {"code": 404, "text": "records not found"} -- none of the requested records were found and empty answer is returned
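
Because /v2/update cannot change the image of a record, replacing an image takes a delete followed by an insert, as the description of this method notes. The sketch below only builds the two request bodies. The helper name is hypothetical, and re-sending the additional fields with the insert is an assumption (the deleted record is expected to take its fields with it).

```python
def replace_image_payloads(record_id, new_url, **fields):
    """Return the (delete, insert) request bodies needed to change a record's image."""
    delete_body = {"records": [{"_id": record_id}]}
    insert_body = {"records": [{"_id": record_id, "_url": new_url, **fields}]}
    return delete_body, insert_body

delete_body, insert_body = replace_image_payloads(
    "1", "http://mydomain.com/new_img.png", day="monday"
)
# POST delete_body to /v2/delete first, then insert_body to /v2/insert.
```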

/v2/nearDupsInsert

Description: inserts given list of records (images + metadata) into the index, but first it checks that there are no near duplicate images in the collection yet. See method /v2/nearDuplicates for details.

Returns: Data processing answer with the same status values as for method /v2/insert. If a record is skipped from insertion because near-duplicates were found, it appears in skipped_records with the status containing the same answer as method /v2/nearDuplicates would return.

Example:

$ curl 'https://api.ximilar.com/similarity/photos/v2/nearDupsInsert' -i -X POST \
    -H 'Content-Type: application/json;charset=UTF-8' \
    -H 'collection-id: mycoll_id' \
    -H 'Authorization: Token 1af538baa90-----XXX-----baf83ff24' \
    -d '{
  "cand_set_size" : 1000,
  "fields_to_return" : [ "_id" ],
  "radius" : 5.0,
  "records" : [ {
    "_id" : "1",
    "_url" : "http://mydomain.com/img.png"
  }, {
    "_base64" : "data:image/png;base64,ABC...",
    "_id" : "2"
  }, {
    "_id" : "3",
    "_url" : "http://mydomain.com/another_image.jpg"
  } ]
}'

Request body:

{
  "cand_set_size" : 1000,
  "fields_to_return" : [ "_id" ],
  "radius" : 5.0,
  "records" : [ {
    "_id" : "1",
    "_url" : "http://mydomain.com/img.png"
  }, {
    "_base64" : "data:image/png;base64,ABC...",
    "_id" : "2"
  }, {
    "_id" : "3",
    "_url" : "http://mydomain.com/another_image.jpg"
  } ]
}

HTTPie:

$ echo '{
  "cand_set_size" : 1000,
  "fields_to_return" : [ "_id" ],
  "radius" : 5.0,
  "records" : [ {
    "_id" : "1",
    "_url" : "http://mydomain.com/img.png"
  }, {
    "_base64" : "data:image/png;base64,ABC...",
    "_id" : "2"
  }, {
    "_id" : "3",
    "_url" : "http://mydomain.com/another_image.jpg"
  } ]
}' | http POST 'https://api.ximilar.com/similarity/photos/v2/nearDupsInsert' \
    'Content-Type:application/json;charset=UTF-8' \
    'collection-id:mycoll_id' \
    'Authorization:Token 1af538baa90-----XXX-----baf83ff24'

Request description:

Path | Type | Required | Description
records | Array | Yes | Records to be inserted
fields_to_return | Array | No | Fields to be returned in every record, defaults to ["*"]
radius | Number | No | Maximum distance for records to be considered near duplicates, defaults to 5.0
cand_set_size | Number | No | Internal parameter (please, do not use) that can influence the response time of the operation (the lower the faster, but potentially less precise). Every collection has its own default

Example response:

{
  "status" : {
    "code" : 211,
    "text" : "record(s) duplicate"
  },
  "statistics" : {
    "OperationTime" : 2
  },
  "answer_records" : [ {
    "_id" : "1"
  } ],
  "skipped_records" : [ {
    "_id" : "3",
    "_reason" : {
      "status" : {
        "code" : 211,
        "text" : "record(s) duplicate"
      },
      "answer_records" : [ {
        "_id" : "3"
      } ]
    }
  }, {
    "_id" : "2",
    "_reason" : {
      "status" : {
        "code" : 211,
        "text" : "record(s) duplicate"
      }
    }
  } ],
  "answer_count" : 1
}

Response description:

Path | Type | Description
answer_records | Array | Successfully inserted records
skipped_records | Array | Records that couldn't be inserted
skipped_records[]._reason.answer_records | Array | In case method 'nearDupsInsert' was used, this is the list of near-duplicate records
skipped_records[]._reason.answer_distances | Array | In case method 'nearDupsInsert' was used, this is the list of distances to the near-duplicates
answer_count | Number | Number of records inserted
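
The near duplicates reported for each skipped record can be collected as in this sketch. The helper name is hypothetical. An empty list means that the _reason of that record carried no answer_records, which may indicate that it was skipped for another reason; that interpretation is an assumption.

```python
def near_duplicates_by_record(answer):
    """Map each skipped record _id to the _ids of the near duplicates found for it."""
    report = {}
    for skipped in answer.get("skipped_records", []):
        reason = skipped.get("_reason", {})
        report[skipped["_id"]] = [r["_id"] for r in reason.get("answer_records", [])]
    return report

# Shortened version of the example response above.
duplicate_status = {"code": 211, "text": "record(s) duplicate"}
answer = {
    "status": duplicate_status,
    "answer_records": [{"_id": "1"}],
    "skipped_records": [
        {"_id": "3", "_reason": {"status": duplicate_status,
                                 "answer_records": [{"_id": "3"}]}},
        {"_id": "2", "_reason": {"status": duplicate_status}},
    ],
    "answer_count": 1,
}
duplicates = near_duplicates_by_record(answer)
```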

/v2/visualKNN

Description: find visually similar images to a given image from the collection

Example:

$ curl 'https://api.ximilar.com/similarity/photos/v2/visualKNN' -i -X POST \
    -H 'Content-Type: application/json;charset=UTF-8' \
    -H 'collection-id: mycoll_id' \
    -H 'Authorization: Token 1af538baa90-----XXX-----baf83ff24' \
    -d '{
  "cand_set_size" : 1000,
  "fields_to_return" : [ "_id" ],
  "filter" : {
    "price" : {
      "$gte" : 200
    }
  },
  "from" : 0,
  "k" : 3,
  "product_field" : "product_id",
  "query_record" : {
    "_url" : "http://mydomain.com/my-image.png"
  }
}'

Request body:

{
  "cand_set_size" : 1000,
  "fields_to_return" : [ "_id" ],
  "filter" : {
    "price" : {
      "$gte" : 200
    }
  },
  "from" : 0,
  "k" : 3,
  "product_field" : "product_id",
  "query_record" : {
    "_url" : "http://mydomain.com/my-image.png"
  }
}

HTTPie:

$ echo '{
  "cand_set_size" : 1000,
  "fields_to_return" : [ "_id" ],
  "filter" : {
    "price" : {
      "$gte" : 200
    }
  },
  "from" : 0,
  "k" : 3,
  "product_field" : "product_id",
  "query_record" : {
    "_url" : "http://mydomain.com/my-image.png"
  }
}' | http POST 'https://api.ximilar.com/similarity/photos/v2/visualKNN' \
    'Content-Type:application/json;charset=UTF-8' \
    'collection-id:mycoll_id' \
    'Authorization:Token 1af538baa90-----XXX-----baf83ff24'

Request description:

Path | Type | Required | Description
query_record | Object | Yes | Record to search by
k | Number | No | Number of records to be returned, default: 30
from | Number | No | The number of records to be skipped, defaults to 0
fields_to_return | Array | No | Fields to be returned in every record, defaults to ["_id"]
filter | Object | No | Search will be applied only to records satisfying this filter
product_field | String | No | If set, each record in the response will be a different product. Every collection has its own default
cand_set_size | Number | No | Internal parameter (please, do not use) that can influence the response time of the operation (the lower the faster, but potentially less precise). Every collection has its own default

Example response:

{
  "status" : {
    "code" : 200,
    "text" : "OK"
  },
  "statistics" : {
    "OperationTime" : 1
  },
  "answer_records" : [ {
    "_id" : "1"
  }, {
    "_id" : "2"
  }, {
    "_id" : "3"
  } ],
  "answer_distances" : [ 40.0, 50.0, 60.0 ],
  "answer_count" : 3
}

Response description:

Path | Type | Description
answer_records | Array | Records found
answer_count | Number | Number of returned records
answer_distances | Array | Distances between query record and individual records in answer_records
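
Paging through a long result list with the k and from parameters could be sketched as follows. The generator name page_payloads is hypothetical; each yielded body would be sent to /v2/visualKNN in turn.

```python
def page_payloads(query_record, page_size, pages, **extra):
    """Yield visualKNN request bodies for consecutive pages of results."""
    for page in range(pages):
        yield {
            "query_record": query_record,
            "k": page_size,              # records per page
            "from": page * page_size,    # records already returned, to be skipped
            **extra,
        }

payloads = list(page_payloads({"_id": "321"}, page_size=30, pages=3,
                              fields_to_return=["_id"]))
third_page = payloads[2]
```

The third body has "k": 30 and "from": 60, which matches the from example given in section Parameters of API methods.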

/v2/visualTagsKNN

Description: find images that are similar based on a combination of visual similarity and similarity of the keywords that the user passes in the tags field of each record. This field must be present in both the data records and the query record.

All parameters and the answer are the same as for /v2/visualKNN

/v2/visualKNNMulti

Description: find visually similar images from the collection to a given list of images (multi-query)

Example:

$ curl 'https://api.ximilar.com/similarity/photos/v2/visualKNNMulti' -i -X POST \
    -H 'Content-Type: application/json;charset=UTF-8' \
    -H 'collection-id: mycoll_id' \
    -H 'Authorization: Token 1af538baa90-----XXX-----baf83ff24' \
    -d '{
  "cand_set_size" : 1000,
  "fields_to_return" : [ "_id" ],
  "filter" : {
    "$gte" : {
      "price" : "200"
    }
  },
  "from" : 0,
  "k" : 3,
  "query_records" : [ {
    "_id" : "10"
  }, {
    "_id" : "11"
  } ]
}'
$ echo '{
  "cand_set_size" : 1000,
  "fields_to_return" : [ "_id" ],
  "filter" : {
    "$gte" : {
      "price" : "200"
    }
  },
  "from" : 0,
  "k" : 3,
  "query_records" : [ {
    "_id" : "10"
  }, {
    "_id" : "11"
  } ]
}' | http POST 'https://api.ximilar.com/similarity/photos/v2/visualKNNMulti' \
    'Content-Type:application/json;charset=UTF-8' \
    'collection-id:mycoll_id' \
    'Authorization:Token 1af538baa90-----XXX-----baf83ff24'

Request description:

Path Type Required Description
query_records Array Yes Records to search by
k Number No Number of records to be returned, default: 30
from Number No The number of records to be skipped, defaults to 0
fields_to_return Array No Fields to be returned in every record, defaults to ["_id"]
filter Object No Search will be applied only to records satisfying this filter
cand_set_size Number No Internal parameter (please do not use) that influences the response time of the operation (the lower the value, the faster but potentially less precise the search). Every collection has its own default
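
For callers who prefer Python over curl or HTTPie, the multi-query request can be prepared with the standard library. This is a sketch under assumptions: make_request is a hypothetical helper, the token is a placeholder, and the request object is only constructed here, not sent.

```python
import json
import urllib.request

BASE_URL = "https://api.ximilar.com/similarity/photos/v2"


def make_request(method, body, collection_id, token):
    """Build (but do not send) a POST request for the given API method."""
    return urllib.request.Request(
        f"{BASE_URL}/{method}",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json;charset=UTF-8",
            "collection-id": collection_id,
            "Authorization": f"Token {token}",
        },
    )


# Query by two records that are already stored in the collection.
body = {"query_records": [{"_id": i} for i in ["10", "11"]], "k": 3}
req = make_request("visualKNNMulti", body, "mycoll_id", "YOUR_TOKEN")
print(req.full_url)
# Sending is left to the caller, e.g. urllib.request.urlopen(req).
```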

Example response:

{
  "status" : {
    "code" : 200,
    "text" : "OK"
  },
  "statistics" : {
    "OperationTime" : 0
  },
  "answer_records" : [ {
    "_id" : "1"
  }, {
    "_id" : "2"
  }, {
    "_id" : "3"
  } ],
  "answer_distances" : [ 40.0, 50.0, 60.0 ],
  "answer_count" : 3
}

Response description:

Path Type Description
answer_records Array Returned records
answer_count Number Number of returned records
answer_distances Array Distances between query record and individual records in answer_records

/v2/visualTagsKNNMulti

Description: find images that are similar to a given list of images based on a combination of visual and tag similarity.

All parameters and the answer are the same as for /v2/visualKNNMulti

/v2/random

Description: returns a given number of random records stored in the collection

Example:

$ curl 'https://api.ximilar.com/similarity/photos/v2/random' -i -X POST \
    -H 'Content-Type: application/json;charset=UTF-8' \
    -H 'collection-id: mycoll_id' \
    -H 'Authorization: Token 1af538baa90-----XXX-----baf83ff24' \
    -d '{
  "count" : 2,
  "fields_to_return" : [ "_id" ],
  "filter" : {
    "$gte" : {
      "price" : "200"
    }
  }
}'
$ echo '{
  "count" : 2,
  "fields_to_return" : [ "_id" ],
  "filter" : {
    "$gte" : {
      "price" : "200"
    }
  }
}' | http POST 'https://api.ximilar.com/similarity/photos/v2/random' \
    'Content-Type:application/json;charset=UTF-8' \
    'collection-id:mycoll_id' \
    'Authorization:Token 1af538baa90-----XXX-----baf83ff24'

Request description:

Path Type Required Description
count Number No Number of records to be returned, defaults to 1
fields_to_return Array No Fields to be returned in every record, defaults to ["*"]
filter Object No If the filter is set, not all records are returned but only those matching the filter

Example response:

{
  "status" : {
    "code" : 200,
    "text" : "OK"
  },
  "statistics" : {
    "OperationTime" : 1
  },
  "answer_records" : [ {
    "_id" : "1"
  }, {
    "_id" : "2"
  } ],
  "answer_count" : 2
}

Response description:

Path Type Description
answer_records Array Returned records
answer_count Number Number of returned records

/v2/getRecordCount

Description: gets the number of records stored in the index collection

Example:

$ curl 'https://api.ximilar.com/similarity/photos/v2/getRecordCount' -i -X GET \
    -H 'collection-id: mycoll_id' \
    -H 'Authorization: Token 1af538baa90-----XXX-----baf83ff24'

$ http GET 'https://api.ximilar.com/similarity/photos/v2/getRecordCount' \
    'collection-id:mycoll_id' \
    'Authorization:Token 1af538baa90-----XXX-----baf83ff24'

Request description:

Example response:

{
  "status" : {
    "code" : 200,
    "text" : "OK"
  },
  "statistics" : {
    "OperationTime" : 0
  },
  "answer_count" : 5
}

Response description:

Path Type Description
answer_count Number Number of records stored in the collection

/v2/allRecords

Description: gets all records stored in the collection (or just their IDs). The answer is either returned as a standard answer, or stored into a file in the local file system, or both. The created file contains each record on a separate line (it can be directly used to bulk insert data into a new index).

Example:

$ curl 'https://api.ximilar.com/similarity/photos/v2/allRecords' -i -X POST \
    -H 'Content-Type: application/json;charset=UTF-8' \
    -H 'collection-id: mycoll_id' \
    -H 'Authorization: Token 1af538baa90-----XXX-----baf83ff24' \
    -d '{
  "delete_file_after" : true,
  "fields_to_return" : [ "*" ],
  "filter" : {
    "$gte" : {
      "price" : "200"
    }
  },
  "output_file_name" : "some-file.json"
}'
$ echo '{
  "delete_file_after" : true,
  "fields_to_return" : [ "*" ],
  "filter" : {
    "$gte" : {
      "price" : "200"
    }
  },
  "output_file_name" : "some-file.json"
}' | http POST 'https://api.ximilar.com/similarity/photos/v2/allRecords' \
    'Content-Type:application/json;charset=UTF-8' \
    'collection-id:mycoll_id' \
    'Authorization:Token 1af538baa90-----XXX-----baf83ff24'

Request description:

Path Type Required Description
output_file_name String No Name of the output (temporary) file, default: all-records-<temp_value>.json
fields_to_return Array No Fields to be returned in every record, defaults to ["_id"]
delete_file_after Boolean No If true then the output file is deleted after the processing, default: true
filter Object No If the filter is set, not all records are returned but only those matching the filter

Example response:

{
  "status" : {
    "code" : 200,
    "text" : "OK"
  },
  "statistics" : {
    "OperationTime" : 3
  },
  "answer_records" : [ {
    "_id" : "1"
  }, {
    "_id" : "2"
  } ],
  "output_file_name" : "some-file.json",
  "answer_count" : 2
}

Response description:

Path Type Description
output_file_name String Name of the file records were saved to
answer_records Array Data records
answer_count Number Number of returned records
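
Because the output file holds one record per line, it can be read back line by line. The Python sketch below assumes each line is a standalone JSON object, which the description suggests but does not state explicitly. The function name parse_record_lines is hypothetical.

```python
import json


def parse_record_lines(text):
    """Parse text that holds one JSON record per line, skipping blank lines."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if line:
            records.append(json.loads(line))
    return records


# Stand-in for the content of a file such as some-file.json.
sample = '{"_id": "1"}\n{"_id": "2"}\n'
print(parse_record_lines(sample))
```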

/v2/deleteByFilter

Description: deletes all records matching a given condition. This method is not available in all indexes.

Example:

$ curl 'https://api.ximilar.com/similarity/photos/v2/deleteByFilter' -i -X POST \
    -H 'Content-Type: application/json;charset=UTF-8' \
    -H 'collection-id: mycoll_id' \
    -H 'Authorization: Token 1af538baa90-----XXX-----baf83ff24' \
    -d '{
  "delete_file_after" : true,
  "fields_to_return" : [ "*" ],
  "filter" : {
    "$gte" : {
      "price" : "200"
    }
  },
  "output_file_name" : "some-file.json"
}'
$ echo '{
  "delete_file_after" : true,
  "fields_to_return" : [ "*" ],
  "filter" : {
    "$gte" : {
      "price" : "200"
    }
  },
  "output_file_name" : "some-file.json"
}' | http POST 'https://api.ximilar.com/similarity/photos/v2/deleteByFilter' \
    'Content-Type:application/json;charset=UTF-8' \
    'collection-id:mycoll_id' \
    'Authorization:Token 1af538baa90-----XXX-----baf83ff24'

Request description:

Path Type Required Description
output_file_name String No Name of the output (temporary) file, default: all-records-<temp_value>.json
fields_to_return Array No Fields to be returned in every record, defaults to ["_id"]
delete_file_after Boolean No If true then the output file is deleted after the processing, default: true
filter Object No If the filter is set, not all records are returned but only those matching the filter

Example response:

{
  "status" : {
    "code" : 200,
    "text" : "OK"
  },
  "statistics" : {
    "OperationTime" : 3
  },
  "answer_records" : [ {
    "_id" : "1"
  }, {
    "_id" : "2"
  } ],
  "output_file_name" : "some-file.json",
  "answer_count" : 2
}

Response description:

Path Type Description
output_file_name String Name of the file records were saved to
answer_records Array Data records
answer_count Number Number of records updated

/v2/range

Description: find images from the collection that are visually similar to a given image, up to a certain query radius. The search is approximate: there might be false negatives, especially with a larger radius. Do not use this method unless you are sure you need it.

Example:

$ curl 'https://api.ximilar.com/similarity/photos/v2/range' -i -X POST \
    -H 'Content-Type: application/json;charset=UTF-8' \
    -H 'collection-id: mycoll_id' \
    -H 'Authorization: Token 1af538baa90-----XXX-----baf83ff24' \
    -d '{
  "cand_set_size" : 1000,
  "fields_to_return" : [ "_id" ],
  "filter" : {
    "$gte" : {
      "price" : "200"
    }
  },
  "product_field" : "product_id",
  "query_record" : {
    "_url" : "http://mydomain.com/my-image.png"
  },
  "radius" : 100
}'
$ echo '{
  "cand_set_size" : 1000,
  "fields_to_return" : [ "_id" ],
  "filter" : {
    "$gte" : {
      "price" : "200"
    }
  },
  "product_field" : "product_id",
  "query_record" : {
    "_url" : "http://mydomain.com/my-image.png"
  },
  "radius" : 100
}' | http POST 'https://api.ximilar.com/similarity/photos/v2/range' \
    'Content-Type:application/json;charset=UTF-8' \
    'collection-id:mycoll_id' \
    'Authorization:Token 1af538baa90-----XXX-----baf83ff24'

Request description:

Path Type Required Description
query_record Object Yes Record to search by
radius Number Yes Radius of the search - maximum distance between the query and answer records
fields_to_return Array No Fields to be returned in every record, defaults to ["_id"]
filter Object No Search will be applied only to records satisfying this filter
product_field String No If set, each record in the response will belong to a different product. Every collection has its own default
cand_set_size Number No Internal parameter (please do not use) that influences the response time of the operation (the lower the value, the faster but potentially less precise the search). Every collection has its own default

Example response:

{
  "status" : {
    "code" : 200,
    "text" : "OK"
  },
  "statistics" : {
    "OperationTime" : 0
  },
  "answer_records" : [ {
    "_id" : "1"
  }, {
    "_id" : "2"
  }, {
    "_id" : "3"
  } ],
  "answer_distances" : [ 40.0, 50.0, 60.0 ],
  "answer_count" : 3
}

Response description:

Path Type Description
answer_records Array Returned records
answer_count Number Number of returned records
answer_distances Array Distances between query record and individual records in answer_records
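
If a stricter threshold is needed after the fact, the returned distances allow narrowing an answer on the client side without a second call. This Python sketch assumes that answer_records and answer_distances are positionally aligned. The function name within_radius is hypothetical. It cannot recover false negatives, it only discards records beyond the smaller radius.

```python
def within_radius(response, radius):
    """Keep only the answer records whose distance does not exceed radius."""
    pairs = zip(
        response.get("answer_records", []),
        response.get("answer_distances", []),
    )
    return [rec for rec, dist in pairs if dist <= radius]


# The example response shown above, reduced to the relevant fields.
example = {
    "answer_records": [{"_id": "1"}, {"_id": "2"}, {"_id": "3"}],
    "answer_distances": [40.0, 50.0, 60.0],
    "answer_count": 3,
}

print(within_radius(example, 45.0))
```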

/v2/nearDuplicates

Description: finds those images in the collection that are the same or very similar to the query image.

Example:

$ curl 'https://api.ximilar.com/similarity/photos/v2/nearDuplicates' -i -X POST \
    -H 'Content-Type: application/json;charset=UTF-8' \
    -H 'collection-id: mycoll_id' \
    -H 'Authorization: Token 1af538baa90-----XXX-----baf83ff24' \
    -d '{
  "cand_set_size" : 1000,
  "fields_to_return" : [ "_id" ],
  "filter" : {
    "$gte" : {
      "price" : "200"
    }
  },
  "query_record" : {
    "_url" : "http://mydomain.com/my-image.png"
  },
  "radius" : 5.0
}'
$ echo '{
  "cand_set_size" : 1000,
  "fields_to_return" : [ "_id" ],
  "filter" : {
    "$gte" : {
      "price" : "200"
    }
  },
  "query_record" : {
    "_url" : "http://mydomain.com/my-image.png"
  },
  "radius" : 5.0
}' | http POST 'https://api.ximilar.com/similarity/photos/v2/nearDuplicates' \
    'Content-Type:application/json;charset=UTF-8' \
    'collection-id:mycoll_id' \
    'Authorization:Token 1af538baa90-----XXX-----baf83ff24'

Request description:

Path Type Required Description
query_record Object Yes Record to search by
fields_to_return Array No Fields to be returned in every record, defaults to ["*"]
radius Number No Maximum distance for records to be considered near duplicates, defaults to 5.0
filter Object No Search will be applied only to records satisfying this filter
cand_set_size Number No Internal parameter (please do not use) that influences the response time of the operation (the lower the value, the faster but potentially less precise the search). Every collection has its own default

Example response:

{
  "status" : {
    "code" : 200,
    "text" : "OK"
  },
  "statistics" : {
    "OperationTime" : 4
  },
  "answer_records" : [ {
    "_id" : "1"
  }, {
    "_id" : "2"
  } ],
  "answer_distances" : [ 10.0, 20.0 ],
  "answer_count" : 2
}

Response description:

Path Type Description
answer_records Array Returned records
answer_count Number Number of returned records
answer_distances Array Distances between query record and individual records in answer_records

/v2/allNearDupPairs

Description: finds all pairs of images in the collection that are the same or mutually very similar (using visual similarity).

Example:

Request description:

Example response:

Response description: